From d78285f5a5e0fed328e40a3a3ee9cbff17fad63b Mon Sep 17 00:00:00 2001
From: auphelia <jakobapk@web.de>
Date: Fri, 28 Feb 2020 16:59:04 +0000
Subject: [PATCH] [Notebook] Add verification notebook for end-to-end flow

---
 .../tfc_end2end_verification.ipynb            | 511 ++++++++++++++++++
 1 file changed, 511 insertions(+)
 create mode 100644 notebooks/end2end_example/tfc_end2end_verification.ipynb

diff --git a/notebooks/end2end_example/tfc_end2end_verification.ipynb b/notebooks/end2end_example/tfc_end2end_verification.ipynb
new file mode 100644
index 000000000..7b045106d
--- /dev/null
+++ b/notebooks/end2end_example/tfc_end2end_verification.ipynb
@@ -0,0 +1,511 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# FINN - Functional Verification of End-to-End Flow\n",
+    "-----------------------------------------------------------------\n",
+    "\n",
+    "**Important: This notebook depends on the tfc_end2end_example notebook, because we are using models that are available at intermediate steps in the end-to-end flow. So please make sure the needen .onnx files are generated to run this notebook.**\n",
+    "\n",
+    "In this notebook, we will show how to take the intermediate results of the end-to-end tfc example and verify their functionality with different methods. In the following picture you can see the block in the end-to-end flow about the *Simulation & Emulation flows for functional verification*. Besides the methods in this notebook, there is another one that is covered in the Jupyter notebook [tfc_end2end_example](tfc_end2end_example.ipynb): remote execution. The remote execution allows functional verification directly on the PYNQ board, for details please have a look at the mentioned Jupyter notebook."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![](verification.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will use the following helper functions, `showSrc` to show source code of FINN library calls and `showInNetron` to show the ONNX model at the current transformation step. The Netron displays are interactive, but they only work when running the notebook actively and not on GitHub (i.e. if you are viewing this on GitHub you'll only see blank squares)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import inspect\n",
+    "import netron\n",
+    "from finn.util.basic import make_build_dir\n",
+    "from IPython.display import IFrame\n",
+    "\n",
+    "def showSrc(what):\n",
+    "    print(\"\".join(inspect.getsourcelines(what)[0]))\n",
+    "    \n",
+    "def showInNetron(model_filename):\n",
+    "    netron.start(model_filename, port=8081, host=\"0.0.0.0\")\n",
+    "    return IFrame(src=\"http://0.0.0.0:8081/\", width=\"100%\", height=400)\n",
+    "    \n",
+    "build_dir = \"/workspace/finn\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To verify the simulations a \"golden\" output is calculated as a reference. This is calculated directly from the Brevitas model using torch."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([[-0.4992097 , -0.24960485,  6.489726  ,  0.99841946, -0.24960482,\n",
+       "        -2.2464437 ,  0.7488146 , -1.4976292 , -0.49920973, -2.7456534 ]],\n",
+       "      dtype=float32)"
+      ]
+     },
+     "execution_count": 22,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from pkgutil import get_data\n",
+    "import onnx\n",
+    "import onnx.numpy_helper as nph\n",
+    "import torch\n",
+    "from finn.util.test import get_test_model_trained\n",
+    "\n",
+    "fc = get_test_model_trained(\"TFC\", 1, 1)\n",
+    "raw_i = get_data(\"finn\", \"data/onnx/mnist-conv/test_data_set_0/input_0.pb\")\n",
+    "input_tensor = onnx.load_tensor_from_string(raw_i)\n",
+    "input_brevitas = torch.from_numpy(nph.to_array(input_tensor)).float()\n",
+    "output_golden = fc.forward(input_brevitas).detach().numpy()\n",
+    "output_golden"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Simulation using Python \n",
+    "\n",
+    "If an ONNX model consists of [standard ONNX](https://github.com/onnx/onnx/blob/master/docs/Operators.md) nodes and/or FINN custom operations that do not belong to the fpgadataflow (backend $\\neq$ \"fpgadataflow\") this model can be checked for functionality using Python. General information about FINN custom op nodes can be found in Jupyter notebook [2_custom_op.ipynb](../internals/2_custom_op.ipynb).\n",
+    "\n",
+    "To simulate a standard ONNX node [ONNX Runtime](https://github.com/microsoft/onnxruntime) is used. ONNX Runtime is an open source tool developed by Microsoft to run standard ONNX nodes. For the FINN custom op nodes execution functions are defined. The following is an example of the execution function of a XNOR popcount node.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "def xnorpopcountmatmul(inp0, inp1):\n",
+      "    \"\"\"Simulates XNOR-popcount matrix multiplication as a regular bipolar\n",
+      "    matrix multiplication followed by some post processing.\"\"\"\n",
+      "    # extract the operand shapes\n",
+      "    (M, K0) = inp0.shape\n",
+      "    (K1, N) = inp1.shape\n",
+      "    # make sure shapes are compatible with matmul\n",
+      "    assert K0 == K1, \"Matrix shapes are not compatible with matmul.\"\n",
+      "    K = K0\n",
+      "    # convert binary inputs to bipolar\n",
+      "    inp0_bipolar = 2.0 * inp0 - 1.0\n",
+      "    inp1_bipolar = 2.0 * inp1 - 1.0\n",
+      "    # call regular numpy matrix multiplication\n",
+      "    out = np.matmul(inp0_bipolar, inp1_bipolar)\n",
+      "    # XNOR-popcount does not produce the regular dot product result --\n",
+      "    # it returns the number of +1s after XNOR. let P be the number of +1s\n",
+      "    # and N be the number of -1s. XNOR-popcount returns P, whereas the\n",
+      "    # regular dot product result from numpy is P-N, so we need to apply\n",
+      "    # some correction.\n",
+      "    # out = P-N\n",
+      "    # K = P+N\n",
+      "    # out + K = 2P, so P = (out + K)/2\n",
+      "    return (out + K) * 0.5\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from finn.custom_op.xnorpopcount import xnorpopcountmatmul\n",
+    "showSrc(xnorpopcountmatmul)"
+   ]
+  },
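+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick illustration (a small sketch, not part of the original flow), the execution function can be called directly on a tiny example, assuming the import path shown above:\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "from finn.custom_op.xnorpopcount import xnorpopcountmatmul\n",
+    "\n",
+    "# binary 1x2 row vector times binary 2x1 column vector\n",
+    "inp0 = np.asarray([[1.0, 0.0]])\n",
+    "inp1 = np.asarray([[1.0], [1.0]])\n",
+    "# XNOR per element pair: (1,1)->1, (0,1)->0, so one matching bit\n",
+    "print(xnorpopcountmatmul(inp0, inp1))  # expected: [[1.]]\n",
+    "```"
+   ]
+  },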
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The function contains a description of the behaviour in Python and can thus calculate the result of the node.\n",
+    "\n",
+    "This execution function and onnxruntime is used when `execute_onnx` from `onnx_exec` is applied to the model. The model is then simulated node by node and the result is stored in a context dictionary, which contains the values of each tensor at the end of the execution. To get the result, only the output tensor has to be extracted.\n",
+    "\n",
+    "The procedure is shown below. We take the model right before the nodes should be converted into HLS layers and generate an input tensor to pass to the execution function. The input tensor is generated from the Brevitas example inputs."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 27,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "from finn.core.modelwrapper import ModelWrapper\n",
+    "input_dict = {\"global_in\": nph.to_array(input_tensor)}\n",
+    "\n",
+    "model_for_sim = ModelWrapper(build_dir+\"/tfc_w1a1_ready_for_hls_conversion.onnx\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 28,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import finn.core.onnx_exec as oxe\n",
+    "output_dict = oxe.execute_onnx(model_for_sim, input_dict)\n",
+    "output_pysim = output_dict[list(output_dict.keys())[0]]\n",
+    "assert np.isclose(output_pysim, output_golden, atol=1e-3).all(), \"The results are not the same!\""
+   ]
+  },
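+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If intermediate tensors are of interest, `execute_onnx` can also return the full execution context instead of only the graph outputs. A minimal sketch, assuming the `return_full_exec_context` flag of the FINN API:\n",
+    "\n",
+    "```python\n",
+    "# return the complete tensor-name -> value context instead of just the outputs\n",
+    "full_context = oxe.execute_onnx(model_for_sim, input_dict, return_full_exec_context=True)\n",
+    "# list all tensors that were produced during execution\n",
+    "print(list(full_context.keys()))\n",
+    "```"
+   ]
+  },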
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The result is compared with the theoretical \"golden\" value for verification."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Simulation (npysim) using C++\n",
+    "\n",
+    "When dealing with HLS custom op nodes in FINN the simulation using Python is no longer sufficient. After the nodes have been converted to HLS layers, the simulation using C++ can be used. To do this, the input tensor is stored in an .npy file and C++ code is generated that reads the values from the .npy array, streams them to the corresponding finn-hlslib function and writes the result to a new .npy file. This in turn can be read in Python and processed in the FINN flow. For this example the model after the conversion to HLS layers is used."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model_for_npysim = ModelWrapper(build_dir+\"/tfc_w1_a1_hls_layers.onnx\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To generate the code for this simulation and to generate the executable two transformations are used:\n",
+    "* `CodeGen_npysim` which generates the C++ code for the corresponding hls layer\n",
+    "* `Compile` which compules the C++ code and stores the path to the executable"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 30,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from finn.transformation.fpgadataflow.codegen_npysim import CodeGen_npysim\n",
+    "from finn.transformation.fpgadataflow.compile import Compile\n",
+    "\n",
+    "model_for_npysim = model_for_npysim.transform(CodeGen_npysim())\n",
+    "model_for_npysim = model_for_npysim.transform(Compile())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "When we take a look at the model using netron, we can see that the transformations introduced new attributes."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Serving '/workspace/finn/tfc_w1_a1_for_npysim.onnx' at http://0.0.0.0:8081\n"
+     ]
+    },
+    {
+     "data": {
+      "text/html": [
+       "\n",
+       "        <iframe\n",
+       "            width=\"100%\"\n",
+       "            height=\"400\"\n",
+       "            src=\"http://0.0.0.0:8081/\"\n",
+       "            frameborder=\"0\"\n",
+       "            allowfullscreen\n",
+       "        ></iframe>\n",
+       "        "
+      ],
+      "text/plain": [
+       "<IPython.lib.display.IFrame at 0x7fa3496350f0>"
+      ]
+     },
+     "execution_count": 33,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "model_for_npysim.save(build_dir+\"/tfc_w1_a1_for_npysim.onnx\")\n",
+    "showInNetron(build_dir+\"/tfc_w1_a1_for_npysim.onnx\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The following node attributes have been added:\n",
+    "* `code_gen_dir_npysim` indicates the directory where the files for the simulation using C++ are stored\n",
+    "* `executable_path` specifies the path to the executable\n",
+    "\n",
+    "We take now a closer look into the files that were generated:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 34,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "compile.sh  execute_StreamingFCLayer_Batch.cpp\tnode_model  params.h  thresh.h\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "from finn.custom_op.registry import getCustomOp\n",
+    "\n",
+    "fc0 = model_for_npysim.graph.node[2]\n",
+    "fc0w = getCustomOp(fc0)\n",
+    "code_gen_dir = fc0w.get_nodeattr(\"code_gen_dir_npysim\")\n",
+    "!ls {code_gen_dir}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Besides the .cpp file, the folder contains .h files with the weights and thresholds. The shell script contains the compile command and *node_model* is the executable generated by compilation. Comparing this with the `executable_path` node attribute, it can be seen that it specifies exactly the path to *node_model*."
+   ]
+  },
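+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "For instance, the attribute can be read back through the `fc0w` custom op wrapper from above (a short sketch):\n",
+    "\n",
+    "```python\n",
+    "# executable_path should point at the node_model binary\n",
+    "# inside the code generation directory listed above\n",
+    "print(fc0w.get_nodeattr(\"executable_path\"))\n",
+    "```"
+   ]
+  },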
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To simulate the model the execution mode(exec_mode) must be set to \"npysim\". This is done using the transformation SetExecMode."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 35,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from finn.transformation.fpgadataflow.set_exec_mode import SetExecMode\n",
+    "\n",
+    "model_for_npysim = model_for_npysim.transform(SetExecMode(\"npysim\"))"
+   ]
+  },
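+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick sanity check, the attribute written by the transformation can be read back from one of the HLS nodes (a sketch; `exec_mode` is the node attribute that `SetExecMode` sets):\n",
+    "\n",
+    "```python\n",
+    "# re-fetch the wrapper from the transformed model and confirm the new mode\n",
+    "fc0w = getCustomOp(model_for_npysim.graph.node[2])\n",
+    "print(fc0w.get_nodeattr(\"exec_mode\"))  # expected: npysim\n",
+    "```"
+   ]
+  },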
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now the model can be executed using `execute_onnx`. The function reads the `exec_mode` and writes the input into the correct directory in a .npy file. To be able to read this in C++, there is an additional .hpp file ([npy2apintstream.hpp](https://github.com/Xilinx/finn/blob/master/src/finn/data/cpp/npy2apintstream.hpp)) in FINN, which uses cnpy to read .npy files and convert them into streams, or to read a stream and write it into an .npy. [cnpy](https://github.com/rogersce/cnpy) is a helper to read and write .npy and .npz formates in C++.\n",
+    "\n",
+    "The result is again compared to the \"golden\" output."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 37,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "output_dict = oxe.execute_onnx(model_for_npysim, input_dict)\n",
+    "output_npysim = output_dict[list(output_dict.keys())[0]]\n",
+    "assert np.isclose(output_npysim, output_golden, atol=1e-3).all(), \"The results are not the same!\""
+   ]
+  },
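+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Out of interest, the .npy interchange files mentioned above can be inspected after execution. A small sketch that simply lists them in the code generation directory from earlier:\n",
+    "\n",
+    "```python\n",
+    "import os\n",
+    "# after npysim execution, the node's .npy interchange files\n",
+    "# live in the code generation directory\n",
+    "print([f for f in os.listdir(code_gen_dir) if f.endswith(\".npy\")])\n",
+    "```"
+   ]
+  },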
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Emulation (rtlsim) using PyVerilator\n",
+    "\n",
+    "The emulation using [PyVerilator](https://github.com/maltanar/pyverilator) can be done after IP blocks are generated from the corresponding HLS layers. Pyverilator is a tool which makes it possible to simulate verilog files using verilator via a python interface.\n",
+    "\n",
+    "We have two ways to use rtlsim, one is to run the model node-by-node as with the simulation methods, but if the model is in the form of the dataflow partition, the part of the graph that consist of only HLS nodes could also be executed as whole."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Because at the point where we want to grab and verify the model, the model is already in split form (parent graph consisting of non-hls layers and child graph consisting only of hls layers) we first have to reference the child graph within the parent graph. This is done using the node attribute `model` for the `StreamingDataflowPartition` node.\n",
+    "\n",
+    "First the procedure is shown, if the child graph has ip blocks corresponding to the individual layers, then the procedure is shown, if the child graph already has a stitched IP."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Emulation of model layer-by-layer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 49,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "child_model = ModelWrapper(build_dir + \"/tfc_w1_a1_ipgen.onnx\")\n",
+    "child_model = child_model.transform(SetExecMode(\"rtlsim\"))\n",
+    "child_model.save(build_dir + \"/tfc_w1_a1_dataflow_child.onnx\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 50,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# parent model\n",
+    "model_for_rtlsim = ModelWrapper(build_dir + \"/tfc_w1_a1_dataflow_parent.onnx\")\n",
+    "# reference child model\n",
+    "sdp_node = getCustomOp(model_for_rtlsim.graph.node[2])\n",
+    "sdp_node.set_nodeattr(\"model\", build_dir + \"/tfc_w1_a1_dataflow_child.onnx\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 51,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model_for_rtlsim = model_for_rtlsim.transform(SetExecMode(\"rtlsim\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Because the necessary files for the emulation are already generated in Jupyter notebook [tfc_end2end_example](tfc_end2end_example.ipynb), in the next step the execution of the model can be done directly."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 52,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([[-0.4992097 , -0.24960485,  6.489726  ,  0.9984194 , -0.24960485,\n",
+       "        -2.2464437 ,  0.7488146 , -1.4976292 , -0.4992097 , -2.7456534 ]],\n",
+       "      dtype=float32)"
+      ]
+     },
+     "execution_count": 52,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "output_dict = oxe.execute_onnx(model_for_rtlsim, input_dict)\n",
+    "output_rtlsim = output_dict[list(output_dict.keys())[0]]\n",
+    "assert np.isclose(output_rtlsim, output_golden, atol=1e-3).all(), \"The results are not the same!\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Emulation of stitched IP"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 53,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "child_model = ModelWrapper(build_dir + \"/tfc_w1_a1_ipstitch.onnx\")\n",
+    "child_model.set_metadata_prop(\"exec_mode\",\"rtlsim\")\n",
+    "child_model.save(build_dir + \"/tfc_w1_a1_dataflow_child.onnx\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 54,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# parent model\n",
+    "model_for_rtlsim = ModelWrapper(build_dir + \"/tfc_w1_a1_dataflow_parent.onnx\")\n",
+    "# reference child model\n",
+    "sdp_node = getCustomOp(model_for_rtlsim.graph.node[2])\n",
+    "sdp_node.set_nodeattr(\"model\", build_dir + \"/tfc_w1_a1_dataflow_child.onnx\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 55,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "output_dict = oxe.execute_onnx(model_for_rtlsim, input_dict)\n",
+    "output_rtlsim = output_dict[list(output_dict.keys())[0]]\n",
+    "assert np.isclose(output_rtlsim, output_golden, atol=1e-3).all(), \"The results are not the same!\""
+   ]
+  },
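+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To wrap up, the outputs of all three verification flows can be compared against the golden reference in one place, using the variables defined above:\n",
+    "\n",
+    "```python\n",
+    "# summarize the verification results from all three flows\n",
+    "results = [(\"pysim\", output_pysim), (\"npysim\", output_npysim), (\"rtlsim\", output_rtlsim)]\n",
+    "for name, out in results:\n",
+    "    ok = np.isclose(out, output_golden, atol=1e-3).all()\n",
+    "    print(\"%s matches golden output: %s\" % (name, ok))\n",
+    "```"
+   ]
+  },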
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
-- 
GitLab