From 68c33fc54a0c3be62733dac9b83605ffd48fdc78 Mon Sep 17 00:00:00 2001 From: auphelia <jakobapk@web.de> Date: Thu, 7 May 2020 18:51:10 +0100 Subject: [PATCH] [Notebook] Update end2end notebooks --- .../end2end_example/tfc_end2end_example.ipynb | 632 +++++++++++------- .../tfc_end2end_verification.ipynb | 66 +- 2 files changed, 412 insertions(+), 286 deletions(-) diff --git a/notebooks/end2end_example/tfc_end2end_example.ipynb b/notebooks/end2end_example/tfc_end2end_example.ipynb index a0e905c83..e9f82dced 100644 --- a/notebooks/end2end_example/tfc_end2end_example.ipynb +++ b/notebooks/end2end_example/tfc_end2end_example.ipynb @@ -34,7 +34,7 @@ "metadata": {}, "source": [ "The white fields show the state of the network representation in the respective step. The colored fields represent the transformations that are applied to the network to achieve a certain result. The diagram is divided into 5 sections represented by a different color, each of it includes several flow steps. The flow starts in top left corner with Brevitas export (green section), followed by the preparation of the network (blue section) for the Vivado HLS synthesis and Vivado IPI stitching (orange section), and finally building a PYNQ overlay bitfile and testing it on a PYNQ board (yellow section).\n", - "There is an additional section for functional verification (red section) on the left side of the diagram, which we will not cover in this notebook. For details please take a look in the verification notebook which you can find [here](tfc_end2end_verification.ipynb)\n", + "There is an additional section for functional verification (red section) on the right side of the diagram, which we will not cover in this notebook. For details please take a look in the verification notebook which you can find [here](tfc_end2end_verification.ipynb)\n", "\n", "\n", "This Jupyter notebook is organized based on the sections described above. We will use the following helper functions, `showSrc` to show source code of FINN library calls and `showInNetron` to show the ONNX model at the current transformation step. The Netron displays are interactive, but they only work when running the notebook actively and not on GitHub (i.e. if you are viewing this on GitHub you'll only see blank squares)." @@ -46,17 +46,9 @@ "metadata": {}, "outputs": [], "source": [ - "import inspect\n", - "import netron\n", + "from finn.util.visualization import showSrc, showInNetron\n", "from finn.util.basic import make_build_dir\n", - "from IPython.display import IFrame\n", "\n", - "def showSrc(what):\n", - " print(\"\".join(inspect.getsourcelines(what)[0]))\n", - " \n", - "def showInNetron(model_filename):\n", - " netron.start(model_filename, port=8081, host=\"0.0.0.0\")\n", - " return IFrame(src=\"http://0.0.0.0:8081/\", width=\"100%\", height=400)\n", " \n", "build_dir = \"/workspace/finn\"" ] @@ -140,7 +132,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f4310b476a0>" + "<IPython.lib.display.IFrame at 0x7fe1ad0b6e80>" ] }, "execution_count": 3, @@ -173,7 +165,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now the model is prepared and could be simulated using Python. How this works is described in subsection [Simulation using Python](#simpy) in the section about *Simulation & Emulation Flows*.\n", + "Now the model is prepared and could be simulated using Python. 
How this works is described in the Jupyter notebook about verification and can be found [here](tfc_end2end_verification.ipynb#simpy).\n", "\n", "The model can now also be processed in different ways. The principle of FINN are analysis and transformation passes, which can be applied to the model. An analysis pass extracts specific information about the model and returns it to the user in the form of a dictionary. A transformation pass changes the model and returns the changed model back to the FINN flow.\n", "\n", @@ -186,10 +178,12 @@ "source": [ "## 2. Network preparation <a id='nw_prep'></a>\n", "\n", + "* [FINN-style Dataflow Architectures](#dataflow_arch)\n", "* [Tidy-up transformations](#basic_trafo)\n", "* [Streamlining](#streamline)\n", "* [Conversion to HLS layers](#hls_layers)\n", - "* [Folding](#folding)\n", + "* [Creating a Dataflow Partition](#dataflow_partition)\n", + "* [Folding and Datawidth Converter, FIFO and TLastMarker Insertion](#folding)\n", "\n", "\n", "In this section, we will put the network through a series of transformations that puts it in a form that can be stitched together to form a FINN-style dataflow architecture, yielding a high-performance, high-efficiency FPGA accelerator." @@ -199,7 +193,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### FINN-style Dataflow Architectures\n", + "### FINN-style Dataflow Architectures <a id='dataflow_arch'></a>\n", "\n", "We start with a quick recap of FINN-style dataflow architectures. The key idea in such architectures is to parallelize across layers as well as within layers by dedicating a proportionate amount of compute resources to each layer, as illustrated in the figure below taken from the [FINN-R paper](https://arxiv.org/pdf/1809.04570.pdf):\n", "\n", @@ -299,7 +293,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f43177c2a20>" + "<IPython.lib.display.IFrame at 0x7fe1ad0639e8>" ] }, "execution_count": 6, @@ -373,7 +367,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As can be seen, several transformations are involved in the streamlining transformation. There are move and collapse transformations. In the last step the operations are transformed into multithresholds. The involved transformations can be viewed in detail [here](https://github.com/Xilinx/finn/tree/dev/src/finn/transformation/streamline). After each transformation, three of the tidy-up transformations (`GiveUniqueNodeNames`, `GiveReadableTensorNames` and `InferDataTypes`) are applied to the model.\n", + "As can be seen, several transformations are involved in the streamlining transformation. There are move and collapse transformations. In the last step the operations are transformed into multithresholds. The involved transformations can be viewed in detail [here](https://github.com/Xilinx/finn/tree/master/src/finn/transformation/streamline). 
After each transformation, three of the tidy-up transformations (`GiveUniqueNodeNames`, `GiveReadableTensorNames` and `InferDataTypes`) are applied to the model.\n", "\n", "After streamlining the network looks as follows:" ] @@ -406,7 +400,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f431826d860>" + "<IPython.lib.display.IFrame at 0x7fe1346e4ef0>" ] }, "execution_count": 8, @@ -460,7 +454,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f42977e39b0>" + "<IPython.lib.display.IFrame at 0x7fe1346f7780>" ] }, "execution_count": 9, @@ -494,11 +488,18 @@ "metadata": {}, "source": [ "### Conversion to HLS layers <a id='hls_layers'></a>\n", - "Converts the nodes to HLS layers that correspond to the functions in [finn-hls library](https://finn-hlslib.readthedocs.io/en/latest/). In our case this transformation onverts pairs of binary XnorPopcountMatMul layers to StreamingFCLayer_Batch layers. Any immediately following MultiThreshold layers will also be absorbed into the MVTU.\n", + "Converts the nodes to HLS layers that correspond to the functions in [finn-hls library](https://finn-hlslib.readthedocs.io/en/latest/). In our case this transformation converts pairs of binary XnorPopcountMatMul layers to StreamingFCLayer_Batch layers. Any immediately following MultiThreshold layers will also be absorbed into the MVTU.\n", "\n", "Below is the code for the transformation and the network is visualized using netron to create the new structure with `StreamingFCLayer_Batch` nodes, which will correspond to a function call from the [finn-hlslib](https://finn-hlslib.readthedocs.io/en/latest/library/fclayer.html#_CPPv4I_j_j_j_j000_i_i000E22StreamingFCLayer_BatchvRN3hls6streamI7ap_uintI9InStreamWEEERN3hls6streamI7ap_uintI10OutStreamWEEERK2TWRK2TAKjRK1R) library." ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note:** The transformation `to_hls.InferBinaryStreamingFCLayer` gets the string \"decoupled\" as argument, this indicates the `mem_mode` for the weights. In FINN there are different options to set the way the weights are stored and accessed. For details please see the corresponding FINN readthedocs website." 
+ ] + }, { "cell_type": "code", "execution_count": 10, @@ -529,7 +530,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f43177c73c8>" + "<IPython.lib.display.IFrame at 0x7fe1346f1080>" ] }, "execution_count": 10, @@ -540,7 +541,7 @@ "source": [ "import finn.transformation.fpgadataflow.convert_to_hls_layers as to_hls\n", "model = ModelWrapper(build_dir+\"/tfc_w1a1_ready_for_hls_conversion.onnx\")\n", - "model = model.transform(to_hls.InferBinaryStreamingFCLayer())\n", + "model = model.transform(to_hls.InferBinaryStreamingFCLayer(\"decoupled\"))\n", "model.save(build_dir+\"/tfc_w1_a1_hls_layers.onnx\")\n", "showInNetron(build_dir+\"/tfc_w1_a1_hls_layers.onnx\")" ] @@ -589,7 +590,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f43177c2f60>" + "<IPython.lib.display.IFrame at 0x7fe1ad0b6e48>" ] }, "execution_count": 11, @@ -624,7 +625,7 @@ "text": [ "\n", "Stopping http://0.0.0.0:8081\n", - "Serving '/tmp/finn_jakobap/dataflow_partition_sqcfkplo/df_model.onnx' at http://0.0.0.0:8081\n" + "Serving '/tmp/finn_dev_jakobap/dataflow_partition_pbrjefjg/df_model.onnx' at http://0.0.0.0:8081\n" ] }, { @@ -641,7 +642,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f42977d4978>" + "<IPython.lib.display.IFrame at 0x7fe1346f3550>" ] }, "execution_count": 12, @@ -676,50 +677,23 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Folding and TLastMarker Insertion <a id='folding'></a>\n", + "### Folding and Datawidth Converter, FIFO and TLastMarker Insertion <a id='folding'></a>\n", "\n", "*Folding* in FINN describes how much a layer is time-multiplexed in terms of execution resources. There are several *folding factors* for each layer, controlled by the PE (parallelization over outputs) and SIMD (parallelization over inputs) parameters as described by the original [FINN paper](https://arxiv.org/pdf/1612.07119). The higher the PE and SIMD values are set, the faster the generated accelerator will run, and the more FPGA resources it will consume. \n", "\n", - "Since the folding parameters are node attributes, they can be easily accessed and changed using a helper function of the `ModelWrapper`. But first we have to extract the nodes which are StreamingFCLayer_Batch operations. This is where the Netron visualization helps us, in the above diagram we can see that the first four nodes are StreamingFCLayer_Batch. Through the `print`s we can check if the extracted nodes all have the op_type \"StreamingFCLayer_Batch\"." - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "fc0 has the op_type: StreamingFCLayer_Batch\n", - "fc1 has the op_type: StreamingFCLayer_Batch\n", - "fc2 has the op_type: StreamingFCLayer_Batch\n", - "fc3 has the op_type: StreamingFCLayer_Batch\n" - ] - } - ], - "source": [ - "fc0 = model.graph.node[0]\n", - "fc1 = model.graph.node[1]\n", - "fc2 = model.graph.node[2]\n", - "fc3 = model.graph.node[3]\n", - "print(\"fc0 has the op_type: \" + str(fc0.op_type))\n", - "print(\"fc1 has the op_type: \" + str(fc1.op_type))\n", - "print(\"fc2 has the op_type: \" + str(fc2.op_type))\n", - "print(\"fc3 has the op_type: \" + str(fc3.op_type))" + "Since the folding parameters are node attributes, they can be easily accessed and changed using a helper function of the `ModelWrapper`. But first we take a closer look at one of the nodes that implement a StreamingFCLayer_Batch operation. 
This is where the Netron visualization helps us, in the above diagram we can see that the first four nodes are StreamingFCLayer_Batch. So as an example we extract the first node." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "We can use the higher-level [HLSCustomOp](https://github.com/Xilinx/finn/blob/master/src/finn/custom_op/fpgadataflow/__init__.py) wrappers for these nodes. These wrappers provide easy access to specific properties of these nodes, such as the folding factors (PE and SIMD). Let's have a look at which node attributes are defined by the CustomOp wrapper, and adjust the SIMD and PE attributes." + "We can use the higher-level [HLSCustomOp](https://github.com/Xilinx/finn/blob/master/src/finn/custom_op/fpgadataflow/__init__.py) wrappers for this node. These wrappers provide easy access to specific properties of these nodes, such as the folding factors (PE and SIMD). Let's have a look at which node attributes are defined by the CustomOp wrapper, and adjust the SIMD and PE attributes." ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 14, "metadata": {}, "outputs": [ { @@ -764,16 +738,14 @@ " 'outFIFODepth': ('i', False, 2)}" ] }, - "execution_count": 15, + "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ + "fc0 = model.graph.node[0]\n", "fc0w = getCustomOp(fc0)\n", - "fc1w = getCustomOp(fc1)\n", - "fc2w = getCustomOp(fc2)\n", - "fc3w = getCustomOp(fc3)\n", "\n", "print(\"CustomOp wrapper is of class \" + fc0w.__class__.__name__)\n", "\n", @@ -790,48 +762,56 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 15, "metadata": {}, "outputs": [], "source": [ - "# SIMD controls the folding over the input vector\n", - "# PE controls the folding over the output vector\n", - "\n", - "fc0w.set_nodeattr(\"inFIFODepth\", 50)\n", - "fc0w.set_nodeattr(\"SIMD\", 16)\n", - "fc0w.set_nodeattr(\"PE\", 16)\n", - "fc0w.set_nodeattr(\"outFIFODepth\", 4)\n", - "\n", - "fc1w.set_nodeattr(\"inFIFODepth\", 4)\n", - "fc1w.set_nodeattr(\"SIMD\", 16)\n", - "fc1w.set_nodeattr(\"PE\", 16)\n", - "fc1w.set_nodeattr(\"outFIFODepth\", 4)\n", - "\n", - "fc2w.set_nodeattr(\"inFIFODepth\", 4)\n", - "fc2w.set_nodeattr(\"SIMD\", 16)\n", - "fc2w.set_nodeattr(\"PE\", 16)\n", - "fc2w.set_nodeattr(\"outFIFODepth\", 4)\n", - "\n", - "fc3w.set_nodeattr(\"inFIFODepth\", 4)\n", - "fc3w.set_nodeattr(\"SIMD\", 16)\n", - "fc3w.set_nodeattr(\"PE\", 10)\n", - "fc3w.set_nodeattr(\"outFIFODepth\", 50)\n" + "fc_layers = model.get_nodes_by_op_type(\"StreamingFCLayer_Batch\")\n", + "# (PE, SIMD, in_fifo_depth, out_fifo_depth, ramstyle) for each layer\n", + "config = [\n", + " (16, 49, 16, 64, \"block\"),\n", + " (8, 8, 64, 64, \"auto\"),\n", + " (8, 8, 64, 64, \"auto\"),\n", + " (10, 8, 64, 10, \"distributed\"),\n", + "]\n", + "for fcl, (pe, simd, ififo, ofifo, ramstyle) in zip(fc_layers, config):\n", + " fcl_inst = getCustomOp(fcl)\n", + " fcl_inst.set_nodeattr(\"PE\", pe)\n", + " fcl_inst.set_nodeattr(\"SIMD\", simd)\n", + " fcl_inst.set_nodeattr(\"inFIFODepth\", ififo)\n", + " fcl_inst.set_nodeattr(\"outFIFODepth\", ofifo)\n", + " fcl_inst.set_nodeattr(\"ram_style\", ramstyle)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are setting PE and SIMD so that each layer has a total folding of 16." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "After setting the FIFO node attributes, we can insert FIFO nodes inbetween the fpgadataflow nodes and in the beginning and end of the graph. 
This can be done using the transformation `InsertFIFO`." + "Besides PE and SIMD three other node attributes are set. `ram_style` specifies how the weights are to be stored (BRAM, LUTRAM, and so on). It can be selected explicitly or with the option `auto` you can let Vivado decide.\n", + "`inFIFODepth` and `outFIFODepth` specifies the FIFO depths that is needed by the node from the surrounding FIFOs. These attributes are used in the transformation 'InsertFIFO' to insert the appropriate FIFOs between the nodes.\n", + "\n", + "But before FIFOs can be added, it must be determined whether datawidth converters (DWC) are required and they must be inserted correctly. Because by setting the folding, the folded output shape of one node may not match the folded input shape of the next node. \n", + "\n", + "In the following, first DWCs and then FIFOs are inserted using the corresponding transformations in FINN." ] }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 16, "metadata": {}, "outputs": [], "source": [ + "from finn.transformation.fpgadataflow.insert_dwc import InsertDWC\n", "from finn.transformation.fpgadataflow.insert_fifo import InsertFIFO\n", + "\n", + "model = model.transform(InsertDWC())\n", "model = model.transform(InsertFIFO())" ] }, @@ -839,13 +819,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Finally, we will run the `InsertTLastMarker` transformation to get a `TLastMarker` node at the output of this graph, which is necessary to run the DMA engines correctly. Using netron we can observe that now the nodes contain the set folding, inbetween the nodes are FIFOs inserted and the last node is the `TLastMarker` node we insert in the following." + "Finally, we will run the `InsertTLastMarker` transformation to get a `TLastMarker` node at the output of this graph, which is necessary to run the DMA engines correctly. Using netron we can observe that now the nodes contain the set folding, if necessary a DWC is inserted, inbetween the nodes are FIFOs inserted and the last node is the `TLastMarker` node we insert in the following." ] }, { "cell_type": "code", - "execution_count": 18, - "metadata": {}, + "execution_count": 17, + "metadata": { + "scrolled": true + }, "outputs": [ { "name": "stdout", @@ -870,7 +852,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f43177c7518>" + "<IPython.lib.display.IFrame at 0x7fe135b84780>" ] }, "execution_count": 17, @@ -889,7 +871,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This completes the network preparation and the network can be passed on to the next block *Vivado HLS and Vivado synthesis*, which is described below." + "This completes the network preparation and the network can be passed on to the next block *Vivado HLS and IPI*, which is described below." 
] }, { @@ -906,7 +888,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 18, "metadata": {}, "outputs": [ { @@ -925,7 +907,7 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 19, "metadata": {}, "outputs": [], "source": [ @@ -956,7 +938,7 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 20, "metadata": {}, "outputs": [], "source": [ @@ -978,7 +960,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 21, "metadata": {}, "outputs": [], "source": [ @@ -997,7 +979,7 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 22, "metadata": {}, "outputs": [ { @@ -1023,10 +1005,10 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7f42977edf60>" + "<IPython.lib.display.IFrame at 0x7fe1346f7588>" ] }, - "execution_count": 23, + "execution_count": 22, "metadata": {}, "output_type": "execute_result" } @@ -1048,19 +1030,23 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "project_StreamingFIFO_0\r\n" + "StreamingFCLayer_Batch_0_memstream.v thresh.h\r\n", + "hls_syn_StreamingFCLayer_Batch_0.tcl top_StreamingFCLayer_Batch_0.cpp\r\n", + "ipgen.sh\t\t\t vivado_hls.log\r\n", + "memblock_0.dat\t\t\t weights.npy\r\n", + "project_StreamingFCLayer_Batch_0\r\n" ] } ], "source": [ - "fc0w = getCustomOp(model.graph.node[0])\n", + "fc0w = getCustomOp(model.graph.node[1])\n", "code_gen_dir = fc0w.get_nodeattr(\"code_gen_dir_ipgen\")\n", "!ls {code_gen_dir}" ] @@ -1069,12 +1055,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Directory *project_StreamingFCLayer_Batch_0* contains the project created by Vivado HLS into which the IP Block is exported, along with other files generated by Vivado HLS. If we compare it to the above visualization of the network with netron, this is exactly the name of the folder stored in the node attribute `ipgen_path`. The .cpp code that is passed to Vivado HLS can be found in the file *top_StreamingFCLayer_Batch_0.cpp*. The files *params.h* and *thresh.h* belong to that as well, they contain the values for the weights and thresholds. *vivado_hls.log* is the log file from Vivado HLS. Besides these files, the folder contains *ipgen.sh* and *hls_syn_StreamingFCLayer_Batch_0.tcl*. First we take a look at *ipgen.sh*." + "Directory *project_StreamingFCLayer_Batch_0* contains the project created by Vivado HLS into which the IP Block is exported, along with other files generated by Vivado HLS. If we compare it to the above visualization of the network with netron, this is exactly the name of the folder stored in the node attribute `ipgen_path`. The .cpp code that is passed to Vivado HLS can be found in the file *top_StreamingFCLayer_Batch_0.cpp*. The file *thresh.h* belongs to that as well, it contains the value for the thresholds. The weights are stored as .npy file and as .dat file (*memblock_0.dat*). *vivado_hls.log* is the log file from Vivado HLS. Besides these files, the folder contains *ipgen.sh* and *hls_syn_StreamingFCLayer_Batch_0.tcl* and because we use the StreamingFCLayer in \"decoupled\" mode a verilog wrapper (*StreamingFCLayer_Batch_0_memstream.v*) is produced, for more details on \"decoupled\" and \"const\" mode please see on the readthedocs website. \n", + "\n", + "In the following we take a closer look at the two generated scripts. We start with *ipgen.sh*." 
] }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 24, "metadata": {}, "outputs": [ { @@ -1082,8 +1070,8 @@ "output_type": "stream", "text": [ "#!/bin/bash \r\n", - "cd /tmp/finn_jakobap/code_gen_ipgen_StreamingFCLayer_Batch_0_pfp8r_i6\r\n", - "vivado_hls /tmp/finn_jakobap/code_gen_ipgen_StreamingFCLayer_Batch_0_pfp8r_i6/hls_syn_StreamingFCLayer_Batch_0.tcl\r\n", + "cd /tmp/finn_dev_jakobap/code_gen_ipgen_StreamingFCLayer_Batch_0_edb__5oc\r\n", + "vivado_hls /tmp/finn_dev_jakobap/code_gen_ipgen_StreamingFCLayer_Batch_0_edb__5oc/hls_syn_StreamingFCLayer_Batch_0.tcl\r\n", "cd /workspace/finn\r\n" ] } @@ -1104,7 +1092,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 25, "metadata": {}, "outputs": [ { @@ -1114,14 +1102,14 @@ "\r\n", "set config_proj_name project_StreamingFCLayer_Batch_0\r\n", "puts \"HLS project: $config_proj_name\"\r\n", - "set config_hwsrcdir \"/tmp/finn_jakobap/code_gen_ipgen_StreamingFCLayer_Batch_0_pfp8r_i6\"\r\n", + "set config_hwsrcdir \"/tmp/finn_dev_jakobap/code_gen_ipgen_StreamingFCLayer_Batch_0_edb__5oc\"\r\n", "puts \"HW source dir: $config_hwsrcdir\"\r\n", "set config_proj_part \"xc7z020clg400-1\"\r\n", "\r\n", "set config_bnnlibdir \"/workspace/finn-hlslib\"\r\n", "\r\n", "set config_toplevelfxn \"StreamingFCLayer_Batch_0\"\r\n", - "set config_clkperiod 5\r\n", + "set config_clkperiod 10\r\n", "\r\n", "open_project $config_proj_name\r\n", "add_files $config_hwsrcdir/top_StreamingFCLayer_Batch_0.cpp -cflags \"-std=c++0x -I$config_bnnlibdir\"\r\n", @@ -1133,6 +1121,7 @@ "config_interface -m_axi_addr64\r\n", "config_rtl -auto_prefix\r\n", "\r\n", + "\r\n", "create_clock -period $config_clkperiod -name default\r\n", "csynth_design\r\n", "export_design -format ip_catalog\r\n", @@ -1165,7 +1154,7 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 26, "metadata": {}, "outputs": [], "source": [ @@ -1185,22 +1174,22 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[key: \"vivado_stitch_proj\"\n", - "value: \"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl\"\n", ", key: \"vivado_stitch_vlnv\"\n", "value: \"xilinx_finn:finn:finn_design:1.0\"\n", ", key: \"wrapper_filename\"\n", - "value: \"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", "]" ] }, - "execution_count": 28, + "execution_count": 27, "metadata": {}, "output_type": "execute_result" } @@ -1211,16 +1200,16 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "'/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j'" + "'/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl'" ] }, - "execution_count": 29, + "execution_count": 28, "metadata": {}, "output_type": "execute_result" } @@ -1245,7 +1234,7 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 29, "metadata": {}, "outputs": [], "source": [ @@ -1266,9 +1255,10 @@ "## 4. 
PYNQ hardware generation and deployment <a id='hw_test'></a>\n", "\n", "* [Inserting the IP into a PYNQ Overlay Shell](#pynq_shell)\n", - "* [Synthesis, place and route](#synth_pl_ro)\n", + "* [Synthesis, Place and Route](#synth_pl_ro)\n", "* [Driver Generation](#driver_gen)\n", "* [Deployment and Remote Execution](#deploy)\n", + "* [Throughput Test on PYNQ Board](#throughput)\n", "\n", "\n", "We are almost done preparing our hardware design. We'll now put it in a form suitable for use as a PYNQ overlay, synthesize and deploy it." @@ -1280,12 +1270,12 @@ "source": [ "### Inserting the IP into a PYNQ Overlay Shell <a id='pynq_shell'></a>\n", "\n", - "We are almost done preparing our hardware design. To deploy our accelerator on a PYNQ platform, it needs to be put inside an appropriate *shell* that bridges it with the interfaces that the underlying system exposes. FINN makes it easy to create a PYNQ-compatible overlay by inserting the stitched IP into an appropriate PYNQ shell with the `MakePYNQProject` transformation, and view the created PYNQ shell project directory using the `metadata_props`. **This invokes Vivado and may take a few minutes to run.**" + "To deploy our accelerator on a PYNQ platform, it needs to be put inside an appropriate *shell* that bridges it with the interfaces that the underlying system exposes. FINN makes it easy to create a PYNQ-compatible overlay by inserting the stitched IP into an appropriate PYNQ shell with the `MakePYNQProject` transformation, and view the created PYNQ shell project directory using the `metadata_props`. **This invokes Vivado and may take a few minutes to run.**" ] }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 30, "metadata": { "scrolled": true }, @@ -1294,19 +1284,19 @@ "data": { "text/plain": [ "[key: \"vivado_stitch_proj\"\n", - "value: \"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl\"\n", ", key: \"vivado_stitch_vlnv\"\n", "value: \"xilinx_finn:finn:finn_design:1.0\"\n", ", key: \"wrapper_filename\"\n", - "value: \"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", ", key: \"vivado_pynq_proj\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs\"\n", ", key: \"vivado_synth_rpt\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j/synth_report.xml\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs/synth_report.xml\"\n", "]" ] }, - "execution_count": 31, + "execution_count": 30, "metadata": {}, "output_type": "execute_result" } @@ -1320,7 +1310,7 @@ }, { "cell_type": "code", - "execution_count": 32, + "execution_count": 31, "metadata": {}, "outputs": [ { @@ -1346,7 +1336,7 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 32, "metadata": {}, "outputs": [], "source": [ @@ -1357,7 +1347,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Synthesis, place and route <a id='synth_pl_ro'></a>" + "### Synthesis, Place and Route <a id='synth_pl_ro'></a>" ] }, { @@ -1369,7 +1359,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 33, "metadata": { "scrolled": true }, @@ -1378,21 +1368,21 @@ "data": { "text/plain": [ "[key: \"vivado_stitch_proj\"\n", - "value: 
\"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl\"\n", ", key: \"vivado_stitch_vlnv\"\n", "value: \"xilinx_finn:finn:finn_design:1.0\"\n", ", key: \"wrapper_filename\"\n", - "value: \"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", ", key: \"vivado_pynq_proj\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs\"\n", ", key: \"vivado_synth_rpt\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j/synth_report.xml\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs/synth_report.xml\"\n", ", key: \"vivado_pynq_bitfile\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j/resizer.bit\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs/resizer.bit\"\n", "]" ] }, - "execution_count": 34, + "execution_count": 33, "metadata": {}, "output_type": "execute_result" } @@ -1406,7 +1396,7 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 34, "metadata": {}, "outputs": [], "source": [ @@ -1417,14 +1407,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Driver generation <a id='driver_gen'></a>\n", + "### Driver Generation <a id='driver_gen'></a>\n", "\n", "Now that we have synthesized a bitfile for our network, we will generate some Python code for PYNQ that will act as the driver for this bitfile, package everything into a deployment folder and copy that to our PYNQ board." ] }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 35, "metadata": {}, "outputs": [], "source": [ @@ -1442,7 +1432,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 36, "metadata": {}, "outputs": [ { @@ -1462,110 +1452,142 @@ ")\r\n", "from finn.core.datatype import DataType\r\n", "\r\n", - "def load_input(N):\r\n", - " ishape_normal = (N, 784)\r\n", - " # load desired input .npy file\r\n", - " ibuf_normal = np.load(\"input.npy\")\r\n", - " # ensure that shape is as expected\r\n", - " assert ibuf_normal.shape == ishape_normal\r\n", - " return ibuf_normal\r\n", + "class FINNAccelDriver():\r\n", + " def __init__(self, N, bitfile):\r\n", + " \"\"\"Instantiate the FINN accelerator driver.\r\n", + " Gets batchsize (N) as integer and path to bitfile as string.\"\"\"\r\n", + " self.N = N\r\n", + " # input FINN DataType\r\n", + " self.idt = DataType.BINARY\r\n", + " # output FINN DataType\r\n", + " self.odt = DataType.UINT32\r\n", + " # input and output shapes\r\n", + " self.ishape_normal = (N, 784)\r\n", + " self.oshape_normal = (N, 10)\r\n", + " self.ishape_folded = (N, 16, 49)\r\n", + " self.oshape_folded = (N, 1, 10)\r\n", + " self.ishape_packed = (N, 16, 7) # datatype np.uint8\r\n", + " self.oshape_packed = (N, 1, 40) # datatype np.uint8\r\n", + " # load bitfile and set up accelerator\r\n", + " self.ol = Overlay(bitfile)\r\n", + " self.dma = self.ol.axi_dma_0\r\n", + " self.ctrl_regs = self.ol.resize_accel_0\r\n", + " # neuron folding factor of output = iterations per sample\r\n", + " self.itersPerSample = self.oshape_packed[-2]\r\n", + " # AXI lite register offset for number of iterations\r\n", + " # used by TLastMarker to signal end of transmission for AXI CDMA\r\n", + " self.REG_OFFSET_NUM_ITERS = 0x10\r\n", + " # set up TLastMarker 
with correct num. samples\r\n", + " self.ctrl_regs.write(self.REG_OFFSET_NUM_ITERS, self.N*self.itersPerSample)\r\n", "\r\n", - "def pack_input(ibuf_normal, N):\r\n", - " # input FINN DataType\r\n", - " idt = DataType.BINARY\r\n", - " ishape_folded = (N, 49, 16)\r\n", - " # convert to folded form\r\n", - " ibuf_folded = ibuf_normal.reshape(ishape_folded)\r\n", - " # pack the input buffer, reversing both SIMD dim and endianness\r\n", - " ibuf_packed = finnpy_to_packed_bytearray(\r\n", - " ibuf_folded, idt, reverse_endian=True, reverse_inner=True\r\n", - " )\r\n", - " return ibuf_packed\r\n", + " # allocate a PYNQ buffer for the packed input and buffer\r\n", + " self.ibuf_packed_device = allocate(shape=self.ishape_packed, dtype=np.uint8)\r\n", + " self.obuf_packed_device = allocate(shape=self.oshape_packed, dtype=np.uint8)\r\n", "\r\n", - "def unpack_output(obuf_packed, N):\r\n", - " # output FINN DataType\r\n", - " odt = DataType.UINT32\r\n", - " oshape_folded = (N, 1, 10)\r\n", - " # unpack the packed output buffer from accelerator\r\n", - " obuf_folded = packed_bytearray_to_finnpy(\r\n", - " obuf_packed, odt, oshape_folded, reverse_endian=True, reverse_inner=True\r\n", - " )\r\n", - " return obuf_folded\r\n", + " def fold_input(self, ibuf_normal):\r\n", + " \"\"\"Reshapes input in desired shape.\r\n", + " Gets input data (ibuf_normal), checks if data is in expected normal shape.\r\n", + " Returns folded input.\"\"\"\r\n", + " # ensure that shape is as expected\r\n", + " assert ibuf_normal.shape == self.ishape_normal\r\n", + " # convert to folded form\r\n", + " ibuf_folded = ibuf_normal.reshape(self.ishape_folded)\r\n", + " return ibuf_folded\r\n", "\r\n", - "def save_output(obuf_folded, N):\r\n", - " # convert to normal reshape and save\r\n", - " oshape_normal = (N, 10)\r\n", - " obuf_normal = obuf_folded.reshape(oshape_normal)\r\n", - " np.save(\"output.npy\", obuf_normal)\r\n", + " def pack_input(self, ibuf_folded):\r\n", + " \"\"\"Packs folded input and reverses both SIMD dim and endianness.\r\n", + " Gets input data in folded shape and returns packed input data.\"\"\"\r\n", + " ibuf_packed = finnpy_to_packed_bytearray(\r\n", + " ibuf_folded, self.idt, reverse_endian=True, reverse_inner=True\r\n", + " )\r\n", + " return ibuf_packed\r\n", "\r\n", - "if __name__ == \"__main__\":\r\n", - " parser = argparse.ArgumentParser(description='Please select functional verification (\"remote_pynq\") or throughput test (\"throughput_test\")')\r\n", - " parser.add_argument('exec_mode', help='metadata prop exec_mode as string')\r\n", - " args = parser.parse_args()\r\n", - " exec_mode = args.exec_mode\r\n", + " def unpack_output(self, obuf_packed):\r\n", + " \"\"\"Unpacks the packed output buffer from accelerator.\r\n", + " Gets packed output and returns output data in folded shape.\"\"\"\r\n", + " obuf_folded = packed_bytearray_to_finnpy(\r\n", + " obuf_packed, self.odt, self.oshape_folded, reverse_endian=True, reverse_inner=True\r\n", + " )\r\n", + " return obuf_folded\r\n", "\r\n", - " bitfile_path = \"resizer.bit\"\r\n", - " ol = Overlay(bitfile_path)\r\n", - " dma=ol.axi_dma_0\r\n", - " ctrl_regs=ol.resize_accel_0\r\n", - " # AXI lite register offset for number of iterations\r\n", - " # used by TLastMarker to signal end of transmission for AXI CDMA\r\n", - " REG_OFFSET_NUM_ITERS = 0x10\r\n", + " def unfold_output(self, obuf_folded):\r\n", + " \"\"\"Unfolds output data to normal shape.\r\n", + " Gets folded output data and returns output data in normal shape.\"\"\"\r\n", + " obuf_normal = 
obuf_folded.reshape(self.oshape_normal)\r\n", + " return obuf_normal\r\n", "\r\n", - " # number of samples for inference\r\n", - " if exec_mode == \"remote_pynq\":\r\n", - " N = 1\r\n", - " elif exec_mode == \"throughput_test\":\r\n", - " res={}\r\n", - " N = 1000\r\n", - " else:\r\n", - " raise Exception(\"Exec mode has to be set to remote_pynq or throughput_test\")\r\n", + " def copy_input_data_to_device(self, data):\r\n", + " \"\"\"Copies given input data to PYNQ buffer.\"\"\"\r\n", + " np.copyto(self.ibuf_packed_device, data)\r\n", "\r\n", - " # declare input/output types and shapes for the accelerator\r\n", - " ishape_packed = (N, 49, 2)\r\n", - " oshape_packed = (N, 1, 40)\r\n", - " \r\n", - " if exec_mode == \"remote_pynq\":\r\n", - " ibuf_normal = load_input(N)\r\n", - " ibuf_packed = pack_input(ibuf_normal, N)\r\n", - " elif exec_mode == \"throughput_test\":\r\n", - " ibuf_packed = np.asarray(np.random.uniform(low=0, high=1, size=tuple(ishape_packed)), dtype=np.uint8)\r\n", + " def execute(self):\r\n", + " \"\"\"Executes accelerator by setting up the DMA and\r\n", + " waiting until all transfers complete. Uses only member variables and\r\n", + " returns nothing.\"\"\"\r\n", + " dma = self.dma\r\n", + " dma.sendchannel.transfer(self.ibuf_packed_device)\r\n", + " dma.recvchannel.transfer(self.obuf_packed_device)\r\n", + " dma.sendchannel.wait()\r\n", + " dma.recvchannel.wait()\r\n", "\r\n", - " # set up TLastMarker with correct num. samples\r\n", - " ctrl_regs.write(REG_OFFSET_NUM_ITERS, N)\r\n", "\r\n", - " # allocate a PYNQ buffer for the packed input buffer\r\n", - " ibuf_packed_device = allocate(shape=ishape_packed, dtype=np.uint8)\r\n", - " # copy the packed data into the PYNQ buffer\r\n", - " # TODO optimization: pack directly into the PYNQ buffer?\r\n", - " np.copyto(ibuf_packed_device, ibuf_packed)\r\n", + "if __name__ == \"__main__\":\r\n", + " parser = argparse.ArgumentParser(description='Set exec mode, batchsize N, bitfile name, inputfile name and outputfile name')\r\n", + " parser.add_argument('--exec_mode', help='Please select functional verification (\"execute\") or throughput test (\"throughput_test\")', default=\"execute\")\r\n", + " parser.add_argument('--batchsize', help='number of samples for inference', type=int, default=1)\r\n", + " parser.add_argument('--bitfile', help='name of bitfile (i.e. \"resizer.bit\")', default=\"resizer.bit\")\r\n", + " parser.add_argument('--inputfile', help='name of input npy file (i.e. \"input.npy\")', default=\"input.npy\")\r\n", + " parser.add_argument('--outputfile', help='name of output npy file (i.e. 
\"output.npy\")', default=\"output.npy\")\r\n", + " # parse arguments\r\n", + " args = parser.parse_args()\r\n", + " exec_mode = args.exec_mode\r\n", + " N = args.batchsize\r\n", + " bitfile = args.bitfile\r\n", + " inputfile = args.inputfile\r\n", + " outputfile = args.outputfile\r\n", "\r\n", - " # allocate a PYNQ buffer for the returned packed output buffer\r\n", - " obuf_packed = allocate(shape=oshape_packed, dtype=np.uint8)\r\n", + " # instantiate FINN accelerator driver and pass batchsize and bitfile\r\n", + " finnDriver = FINNAccelDriver(N, bitfile)\r\n", + "\r\n", + " # for the remote execution the data from the input npy file has to be loaded,\r\n", + " # packed and copied to the PYNQ buffer\r\n", + " if exec_mode == \"execute\":\r\n", + " # load desired input .npy file\r\n", + " ibuf_normal = np.load(inputfile)\r\n", + " ibuf_folded = finnDriver.fold_input(ibuf_normal)\r\n", + " ibuf_packed = finnDriver.pack_input(ibuf_folded)\r\n", + " finnDriver.copy_input_data_to_device(ibuf_packed)\r\n", + " elif exec_mode != \"throughput_test\":\r\n", + " raise Exception(\"Exec mode has to be set to remote_pynq or throughput_test\")\r\n", "\r\n", + " # for the throughput test the runtime of the network has to be measured\r\n", " if exec_mode == \"throughput_test\":\r\n", " # measure runtime of network\r\n", " start = time.time()\r\n", + " # dictionary for results of throughput test\r\n", + " res={}\r\n", "\r\n", - " # set up the DMA and wait until all transfers complete\r\n", - " dma.sendchannel.transfer(ibuf_packed_device)\r\n", - " dma.recvchannel.transfer(obuf_packed)\r\n", - " dma.sendchannel.wait()\r\n", - " dma.recvchannel.wait()\r\n", - "\r\n", + " # execute accelerator\r\n", + " finnDriver.execute()\r\n", "\r\n", + " # measure run time and fill dictionary with results of the throughput test\r\n", " if exec_mode == \"throughput_test\":\r\n", " end = time.time()\r\n", " runtime = end - start\r\n", " res[\"runtime[ms]\"] = runtime*1000\r\n", " res[\"throughput[images/s]\"] = N / runtime\r\n", + " res[\"DRAM_in_bandwidth[Mb/s]\"] = np.prod(finnDriver.ishape_packed)*0.000001 / runtime\r\n", + " res[\"DRAM_out_bandwidth[Mb/s]\"] = np.prod(finnDriver.oshape_packed)*0.000001 / runtime\r\n", " file = open(\"nw_metrics.txt\", \"w\")\r\n", " file.write(str(res))\r\n", " file.close()\r\n", "\r\n", + " # if execution is selected unpack, unfold and save output to output npy file\r\n", " else:\r\n", - " obuf_folded = unpack_output(obuf_packed, N)\r\n", - " save_output(obuf_folded, N)\r\n", + " obuf_folded = finnDriver.unpack_output(finnDriver.obuf_packed_device)\r\n", + " obuf_normal = finnDriver.unfold_output(obuf_folded)\r\n", + " np.save(outputfile, obuf_normal)\r\n", + "\r\n", "\r\n" ] } @@ -1579,7 +1601,21 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We can see that the generated driver contains the expected input/output shapes, expecting a file called `input.npy` to be provided prior to execution, which will be read in, packed into the format that the accelerator expects, running it and generating an `output.npy` file with the results. You can build your own applications around the accelerator by modifying the driver, or use the remote execution capabilities that FINN provides just to check if it is working, which will be our next step." + "We can see that in the generated driver a class is implemented which implements the FINN accelerator. The constructor gets the batchsize (N) as integer and the bitfile as string. 
It also contains the expected input/output shapes, and takes care of the instantiation of the accelerator by loading the bitfile and setting up dma and buffer. Several member functions take care of the data folding and packing. The function `copy_input_data_to_device` copies the input data into the PYNQ buffer and `execute` sets up the dma channels and waits until the transfer is completed. This class is used in the main function. But first the arguments are parsed, which are passed to the script. The driver can be used in two modes: \"execute\" and \"throughput_test\". By default all arguments are set to \"execute\" mode. In this mode the batch size is 1, and the passed files are set to the names used by the FINN transformations.\n", + "\n", + "In the \"execute\" mode works as follows:\n", + "1. the data is loaded from the \"inputfile\"\n", + "2. the data is folded using `fold_input`\n", + "3. the data is packed using `pack_input`\n", + "4. the data is copied to the device using `copy_input_data_to_device`\n", + "5. FINNAccelDriver is executed using `execute`\n", + "6. the data is unpacked using `unpack_output`\n", + "7. the data is unfolded using `unfold_output`\n", + "8. the data is stored in the \"outputfile\"\n", + "\n", + "If \"throughput_test\" is selected as `exec_mode`, no actual data needs to be loaded. The batchsize N should be set to a high value (i.e. 1000) and a time measurement is implemented in python. An empty dictionary (`res`) is created and after running the accelerator with the measured runtime it is filled with the metrics and saved in a .txt file.\n", + "\n", + "You can build your own applications around the accelerator by modifying the driver, or use the remote execution capabilities that FINN provides just to check if it is working, which will be our next step." 
] }, { @@ -1593,15 +1629,15 @@ }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "from finn.transformation.fpgadataflow.make_deployment import DeployToPYNQ\n", - "ip = \"51.37.26.64\"\n", - "port = \"23\"\n", + "ip = \"192.168.3.1\"\n", + "port = \"22\"\n", "username = \"xilinx\"\n", - "password = \"x1l1nx_f1nn\"\n", + "password = \"xilinx\"\n", "target_dir = \"/home/xilinx/finn_tfc_end2end_example\"\n", "model = model.transform(DeployToPYNQ(ip, port, username, password, target_dir))\n", "model.save(build_dir + \"/tfc_w1_a1_pynq_deploy.onnx\")" @@ -1616,46 +1652,46 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[key: \"vivado_stitch_proj\"\n", - "value: \"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl\"\n", ", key: \"vivado_stitch_vlnv\"\n", "value: \"xilinx_finn:finn:finn_design:1.0\"\n", ", key: \"wrapper_filename\"\n", - "value: \"/tmp/finn_jakobap/vivado_stitch_proj_tqp4ib4j/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_stitch_proj_oa43bqzl/finn_vivado_stitch_proj.srcs/sources_1/bd/finn_design/hdl/finn_design_wrapper.v\"\n", ", key: \"vivado_pynq_proj\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs\"\n", ", key: \"vivado_synth_rpt\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j/synth_report.xml\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs/synth_report.xml\"\n", ", key: \"vivado_pynq_bitfile\"\n", - "value: \"/tmp/finn_jakobap/vivado_pynq_proj_gkwfg31j/resizer.bit\"\n", + "value: \"/tmp/finn_dev_jakobap/vivado_pynq_proj_ljn53hfs/resizer.bit\"\n", ", key: \"pynq_driver_dir\"\n", - "value: \"/tmp/finn_jakobap/pynq_driver_1r1_0kz6\"\n", + "value: \"/tmp/finn_dev_jakobap/pynq_driver_j_9suyqm\"\n", ", key: \"pynq_ip\"\n", - "value: \"51.37.26.64\"\n", + "value: \"51.37.47.42\"\n", ", key: \"pynq_port\"\n", "value: \"23\"\n", ", key: \"pynq_username\"\n", "value: \"xilinx\"\n", ", key: \"pynq_password\"\n", - "value: \"x1l1nx_f1nn\"\n", + "value: \"x1l1nx_f!nn\"\n", ", key: \"pynq_target_dir\"\n", "value: \"/home/xilinx/finn_tfc_end2end_example\"\n", ", key: \"pynq_deployment_dir\"\n", - "value: \"/tmp/finn_jakobap/pynq_deployment_kvurnk0c\"\n", + "value: \"/tmp/finn_dev_jakobap/pynq_deployment_962qxwkv\"\n", ", key: \"pynq_deploy_dir\"\n", - "value: \"/tmp/finn_jakobap/pynq_deployment_kvurnk0c\"\n", + "value: \"/tmp/finn_dev_jakobap/pynq_deployment_962qxwkv\"\n", ", key: \"exec_mode\"\n", "value: \"remote_pynq\"\n", "]" ] }, - "execution_count": 58, + "execution_count": 38, "metadata": {}, "output_type": "execute_result" } @@ -1666,18 +1702,63 @@ }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "total 4284\r\n", + "/home/xilinx/finn_tfc_end2end_example/pynq_deployment_26e8h5jo:\r\n", + "total 4276\r\n", + "-rw-r--r-- 1 xilinx xilinx 6363 May 7 10:35 driver.py\r\n", + "drwxr-xr-x 4 xilinx xilinx 4096 May 7 10:35 finn\r\n", + "-rw-r--r-- 1 xilinx xilinx 3264 May 7 10:55 input.npy\r\n", + "-rw-r--r-- 1 root root 172 May 7 10:37 nw_metrics.txt\r\n", + "-rw-r--r-- 1 root root 120 May 7 10:55 output.npy\r\n", + "-rw-r--r-- 1 xilinx xilinx 4045675 May 7 10:35 
resizer.bit\r\n", + "-rw-r--r-- 1 xilinx xilinx 302015 May 7 10:35 resizer.hwh\r\n", + "-rw-r--r-- 1 root root 32 May 7 10:55 sds_trace_data.dat\r\n", + "\r\n", + "/home/xilinx/finn_tfc_end2end_example/pynq_deployment_962qxwkv:\r\n", + "total 4260\r\n", + "-rw-r--r-- 1 xilinx xilinx 6363 May 7 17:44 driver.py\r\n", + "drwxr-xr-x 4 xilinx xilinx 4096 May 7 17:44 finn\r\n", + "-rw-r--r-- 1 xilinx xilinx 4045675 May 7 17:44 resizer.bit\r\n", + "-rw-r--r-- 1 xilinx xilinx 302015 May 7 17:44 resizer.hwh\r\n", + "\r\n", + "/home/xilinx/finn_tfc_end2end_example/pynq_deployment_kvurnk0c:\r\n", + "total 4300\r\n", "-rw-r--r-- 1 xilinx xilinx 3861 Apr 27 12:36 driver.py\r\n", "drwxr-xr-x 4 xilinx xilinx 4096 Apr 27 12:37 finn\r\n", + "-rw-r--r-- 1 xilinx xilinx 3264 Apr 27 12:37 input.npy\r\n", + "-rw-r--r-- 1 root root 78 Apr 27 12:38 nw_metrics.txt\r\n", + "-rw-r--r-- 1 root root 120 Apr 27 12:37 output.npy\r\n", "-rw-r--r-- 1 xilinx xilinx 4045675 Apr 27 12:36 resizer.bit\r\n", - "-rw-r--r-- 1 xilinx xilinx 329531 Apr 27 12:36 resizer.hwh\r\n" + "-rw-r--r-- 1 xilinx xilinx 329531 Apr 27 12:36 resizer.hwh\r\n", + "-rw-r--r-- 1 root root 32 Apr 27 12:38 sds_trace_data.dat\r\n", + "\r\n", + "/home/xilinx/finn_tfc_end2end_example/pynq_deployment__tnbutz_:\r\n", + "total 4276\r\n", + "-rw-r--r-- 1 xilinx xilinx 6363 May 6 17:34 driver.py\r\n", + "drwxr-xr-x 4 xilinx xilinx 4096 May 6 17:34 finn\r\n", + "-rw-r--r-- 1 xilinx xilinx 3264 May 6 17:34 input.npy\r\n", + "-rw-r--r-- 1 root root 173 May 6 17:35 nw_metrics.txt\r\n", + "-rw-r--r-- 1 root root 120 May 6 17:34 output.npy\r\n", + "-rw-r--r-- 1 xilinx xilinx 4045675 May 6 17:34 resizer.bit\r\n", + "-rw-r--r-- 1 xilinx xilinx 302015 May 6 17:34 resizer.hwh\r\n", + "-rw-r--r-- 1 root root 32 May 6 17:35 sds_trace_data.dat\r\n", + "\r\n", + "/home/xilinx/finn_tfc_end2end_example/pynq_deployment_w4aa1r9k:\r\n", + "total 4276\r\n", + "-rw-r--r-- 1 xilinx xilinx 6363 May 7 15:05 driver.py\r\n", + "drwxr-xr-x 4 xilinx xilinx 4096 May 7 15:05 finn\r\n", + "-rw-r--r-- 1 xilinx xilinx 3264 May 7 15:06 input.npy\r\n", + "-rw-r--r-- 1 root root 172 May 7 15:11 nw_metrics.txt\r\n", + "-rw-r--r-- 1 root root 120 May 7 15:06 output.npy\r\n", + "-rw-r--r-- 1 xilinx xilinx 4045675 May 7 15:05 resizer.bit\r\n", + "-rw-r--r-- 1 xilinx xilinx 302015 May 7 15:05 resizer.hwh\r\n", + "-rw-r--r-- 1 root root 32 May 7 15:11 sds_trace_data.dat\r\n" ] } ], @@ -1694,18 +1775,30 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "<matplotlib.image.AxesImage at 0x7f4277550ef0>" + "<matplotlib.image.AxesImage at 0x7fe11dda48d0>" ] }, - "execution_count": 60, + "execution_count": 41, "metadata": {}, "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAARX0lEQVR4nO3dfYyVZXrH8d/FoDAw8iYRCaisG/5QqmUbgk1KyOKmxlUMbKJm/aPauAmarMmqTVqz/UOSaqJVa/pH3YStL9CsmiWoq0a7a82mWo1GNFQQW1CULGR4E5H3t+HqH/NgZ3We6549z3nOc9z7+0kmM3Ouec65OTM/zsv13Pdt7i4Af/xGNT0AAJ1B2IFMEHYgE4QdyARhBzIxupM3Zma89Z+ZUaPKH09OnTpV23VXvf6enp6wPjAw0PJ1183dbbjLK4XdzK6U9M+SeiT9q7vfV+X6cmU27O/mS6k/6ip/eKNHx38CqcCk6r29vaW1Q4cOhcem9PX1hfUDBw6U1lIt50mTJoX1zz77LKx3o5afxptZj6R/kfR9SRdLusHMLm7XwAC0V5XX7PMlfeTuW9z9uKSnJS1pz7AAtFuVsM+Q9Lsh328rLvs9ZrbMzNaa2doKtwWgotrfoHP3FZJWSLxBBzSpyiP7dknnDfl+ZnEZgC5UJezvSJptZt8yszMl/VDS8+0ZFoB2a/lpvLufNLPbJP1ag623x9z9g7aNLCPjx48P6wcPHmz5useMGRPWjx07FtZTbcFx48aF9ai9lmoppqSOj9prqT76vn37WhpTN6v0mt3dX5L0UpvGAqBGnC4LZIKwA5kg7EAmCDuQCcIOZIKwA5mwTq4um+vpsqled6qXffTo0bA+duzYlo9Nia676vWfffbZYb3qNNLofp06dWp47O7du8N6amrwyZMnw3qdyuaz88gOZIKwA5kg7EAmCDuQCcIOZIKwA5mg9fYNkGrNVfkd1nnddUtNDa6yem1q6m5qanCTS03TegMyR9iBTBB2IBOEHcgEYQcyQdiBTBB2IBP02TvgrLPOCuvRbqOSNHHixLB+4sSJ0lpqN9LUFNbPP/88rC9YsCCs33rrraW1VC/6jjvuCOtbt24N601OM20SfXYgc4QdyARhBzJB2IFMEHYgE4QdyARhBzJBn/0b4JFHHgnrUS871Wuuuox1b29vWI+ktk2+5JJLwvqmTZvC+vHjx0trZ5xxRnhsdO6ClP53HzlyJKzXqazPXmnLZjP7VNIBSQOSTrr7vCrXB6A+lcJeWOTue9pwPQBqxGt2IBNVw+6SfmNm75rZsuF+wMyWmdlaM1tb8bYAVFD1afwCd99uZudIesXM/sfdXxv6A+6+QtIKiTfogCZVemR39+3F512SnpU0vx2DAtB+LYfdzMab2Vmnv5Z0haQN7RoYgPaq8jR+mqRniz7taElPuvu/t2VUf2RSWzYvWrQorF922WVhPeqVHzx4MDw21W/u6+sL66nzNKI566m11x999NGWr1uS7rzzztLaW2+9FR5b93bSTWg57O6+RdKftnEsAGpE6w3IBGEHMkHYgUwQdiAThB3IBFNcu0Bqqubs2bPD+v79+0trEyZMCI+NpoFK6SmwVbZ8TrX9UlJLcO/du7e0tnTp0vDYdevWhfVUSzLV8qwTS0kDmSPsQCYIO5AJwg5kgrADmSDsQCYIO5CJdiw42TFRT7fOfnBK6thU/ZZbbgnrq1atCuszZ85s+bZTffZ77rknrK9evTqsn3nmmaW1K664Ijz2wQcfDOuprbCj2168eHF47LZt28L6nj3fvDVWeWQHMkHYgUwQdiAThB3IBGEHMkHYgUwQdiATHZ/Pnup3Rzo51naqOvd54cKFYf2iiy4qrY0bNy48dvTo+FSLNWvWhPUtW7aE9SpSyz3PmTMnrKfu90jq75T57AC6FmEHMkHYgUwQdiAThB3IBGEHMkHYgUx0vM8+alT5/y9V54XXqcpc+lOnTlW67eg+S9VPnjwZHjt+/PiwfujQobCe2o46+p2l5tJfffXVYf3pp58O61X67Kk17VP3a5Na7rOb2WNmtsvMNgy5bIqZvWJmm4vPk9s5WADtN5Kn8U9IuvIrl90l6VV3ny3p1eJ7AF0sGXZ3f03SV/fRWSJpZfH1SknxXjoAGtfqGnTT3L2/+HqHpGllP2hmyyQta/F2ALRJ5QUn3d2jDRvdfYWkFRIbOwJNarX1ttPMpktS8XlX+4YEoA6thv15STcVX98k6VftGQ6AuiT77Gb2lKTvSpoqaaekuyU9J+mXks6XtFXS9e5evhn2/19XbU/jq64bX7UeSfVkU3uoR/uvV9Xb2xvWjxw5EtZT5wBUOcfgwgsvDOsff/xxy9edGldqTfqUw4cPVzq+irI+e/I1u7vfUFL6XqURAegoTpcFMkHYgUwQdiAThB3IBGEHMsGWzYVUC3JgYCCsR3p6esJ61WWHozZRqsWUmsKakrr+aNvkqCZJixYtamlMp0W/0xMnToTHpqa4Vvl7aAqP7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZKKr+ux1budcdTnnKuq+7QMHDpTWUv3iVK87dXyqTx8tF51axvq6664L60ePHg3rY8eOLa2l+uyp31mTWzK3ikd2IBOEHcgEYQcyQdiBTBB2IBOEHcgEYQcy0fE+ezS3u5t75dGSyanllFPq3Fb50ksvDY+dM2dOWE8tJf3cc8+F9UjUB5ekhQsXhvUqW3inlqGOzl2Qqi/B3QQe2YFMEHYgE4QdyARhBzJB2IFMEHYgE4QdyETH++zRnPU6++ipufKped1RT3j06PhuXLp0aVhPHb9kyZKwPmbMmNLa3Llzw2MnTZoU1lO97Ndff73l42fPnh0em1qbPdXrXr9+fWnt8ssvD4+N7lOpO/voKclHdjN7zMx2mdmGIZctN7PtZrau+Liq3mECqGokT+OfkHTlMJc/7O5zi4+X2jssAO2WDLu7vyZpbwfGAqBGVd6gu83M3i+e5k8u+yEzW2Zma81sbYXbAlBRq2H/maRvS5orqV/SQ2U/6O4r3H2eu89r8bYAtEFLYXf3ne4+4O6nJP1c0vz2DgtAu7UUdjObPuTbH0jaUPazALqDpfqoZvaUpO9Kmippp6S7i+/nSnJJn0q6xd37kzdmFt5Yqt+cmvcdmTVrVli/5pprwvrixYtLa6l516l526m509H+61K8hnlfX194bErVed3R7/SLL74Ij504cWJYT9m8eXNpbdWqVeGxDz1U+spUUnf32d192JNKkifVuPsNw1z8aOURAegoTpcFMkHYgUwQdiAThB3IBGEHMpFsvbX1xsw8Wna5zimud999d1hfvnx5WN+zZ09pberUqa0M6UuprYf37o2nJkT1Cy64IDw21RZMbdmccuzYsdJaahpp6u8h1YqNpi2ntlx++eWXw/rNN98c1pvc0rms9cYjO5AJwg5kgrADmSDsQCYIO5AJwg5kgrADmeh4nz2qV9maODXVMtX3rLLt8q5du8L61q1bw/oDDzwQ1levXh
3W580rXwTo4YcfDo9Nbdk8eXLpimOSpG3btoX16Hf6xBNPhMd+8sknYf3aa68N69HU46rTa1988cWwnpoyXSf67EDmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZKKjffZRo0Z5ND/6+PHj4fHnnHNOaW337t3hsak+e2rudNQvTm0HvWnTprA+ZcqUsJ5atjha7vn8888Pj03NZ08t771v376wfuONN5bWXnjhhfDYlNQ6AtFy0YsWLQqPTa0xkLpfUst/14k+O5A5wg5kgrADmSDsQCYIO5AJwg5kgrADmeiq+exVpPqeK1euDOvXX399y9d/+PDh8Nhx48aF9dS2yKl5/gMDA6W11Lrvb775Zlh/8sknw/q6devC+htvvFFaS51fkOrhp37n0Xkb8+fPD499++23w/rjjz8e1lPrytep5T67mZ1nZr81s41m9oGZ/aS4fIqZvWJmm4vP8SoHABo1kqfxJyX9jbtfLOnPJf3YzC6WdJekV919tqRXi+8BdKlk2N29393fK74+IOlDSTMkLZF0+rnxSklL6xokgOriFz1fYWazJH1H0tuSprl7f1HaIWlayTHLJC1rfYgA2mHE78abWZ+kNZJud/f9Q2s++C7fsG++ufsKd5/n7uWrIgKo3YjCbmZnaDDov3D3Z4qLd5rZ9KI+XVK8xCqARiVbbzY4f3OlpL3ufvuQyx+Q9Jm732dmd0ma4u5/m7iu8MbOPffccCw7duwI65Fo+15JmjlzZli/9957S2szZswIj01tuZzaujjaLlqS7r///tLaxo0bw2NTU1xT2yKnpKYtR1JtwxMnToT1aOpx6u9+woQJYb3qlOk6lbXeRvKa/S8k/ZWk9WZ2uqn6U0n3Sfqlmf1I0lZJcaMaQKOSYXf3/5JU9l/k99o7HAB14XRZIBOEHcgEYQcyQdiBTBB2IBMdneLa09PjUV83NVU06n3u37+/tCZJfX19YT3VN416vlX6vVK655s6RyDqZad6+MeOHQvrVUW/79Ryzampwam/lyq/s5SqY6sTS0kDmSPsQCYIO5AJwg5kgrADmSDsQCYIO5CJrlpKOjWHOOqlp5YVrjove/r06aW1/v7+0tpI9Pb2hvXUls11XndqGetDhw6F9SpzylNGjYofq6rMKW/6/IQq6LMDmSPsQCYIO5AJwg5kgrADmSDsQCYIO5CJruqzA6iOPjuQOcIOZIKwA5kg7EAmCDuQCcIOZIKwA5lIht3MzjOz35rZRjP7wMx+Uly+3My2m9m64uOq+ocLoFXJk2rMbLqk6e7+npmdJeldSUs1uB/7QXd/cMQ3xkk1QO3KTqoZyf7s/ZL6i68PmNmHkma0d3gA6vYHvWY3s1mSviPp7eKi28zsfTN7zMwmlxyzzMzWmtnaSiMFUMmIz403sz5J/ynpXnd/xsymSdojySX9gwaf6t+cuA6exgM1K3saP6Kwm9kZkl6U9Gt3/6dh6rMkvejuf5K4HsIO1KzliTA2uDzoo5I+HBr04o27034gaUPVQQKoz0jejV8g6XVJ6yWdXpv3p5JukDRXg0/jP5V0S/FmXnRdPLIDNav0NL5dCDtQP+azA5kj7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmkgtOttkeSVuHfD+1uKwbdevYunVcEmNrVTvHdkFZoaPz2b9242Zr3X1eYwMIdOvYunVcEmNrVafGxtN4IBOEHchE02Ff0fDtR7p1bN06LomxtaojY2v0NTuAzmn6kR1AhxB2IBONhN3MrjSz/zWzj8zsribGUMbMPjWz9cU21I3uT1fsobfLzDYMuWyKmb1iZpuLz8PusdfQ2LpiG+9gm/FG77umtz/v+Gt2M+uRtEnSX0raJukdSTe4+8aODqSEmX0qaZ67N34ChpktlHRQ0qrTW2uZ2T9K2uvu9xX/UU5297/rkrEt1x+4jXdNYyvbZvyv1eB9187tz1vRxCP7fEkfufsWdz8u6WlJSxoYR9dz99ck7f3KxUskrSy+XqnBP5aOKxlbV3D3fnd/r/j6gKTT24w3et8F4+qIJsI+Q9Lvhny/Td2137tL+o2ZvWtmy5oezDCmDdlma4ekaU0OZhjJbbw76SvbjHfNfdfK9udV8Qbd1y1w9z+T9H1JPy6ernYlH3wN1k29059J+rYG9wDsl/RQk4MpthlfI+l2d98/tNbkfTfMuDpyvzUR9u2Szhvy/czisq7g7tuLz7skPavBlx3dZOfpHXSLz7saHs+X3H2nuw+4+ylJP1eD912xzfgaSb9w92eKixu/74YbV6futybC/o6k2Wb2LTM7U9IPJT3fwDi+xszGF2+cyMzGS7pC3bcV9fOSbiq+vknSrxocy+/plm28y7YZV8P3XePbn7t7xz8kXaXBd+Q/lvT3TYyhZFwXSvrv4uODpscm6SkNPq07ocH3Nn4k6WxJr0raLOk/JE3porH9mwa39n5fg8Ga3tDYFmjwKfr7ktYVH1c1fd8F4+rI/cbpskAmeIMOyARhBzJB2IFMEHYgE4QdyARhBzJB2IFM/B+tIjCppYWKvAAAAABJRU5ErkJggg==\n", + "text/plain": [ + "<Figure size 432x288 with 1 Axes>" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" } ], "source": [ @@ -1727,7 +1820,7 @@ }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 42, "metadata": {}, "outputs": [], "source": [ @@ -1747,7 +1840,7 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 43, "metadata": {}, "outputs": [], "source": [ @@ -1769,7 +1862,7 @@ }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 44, "metadata": {}, "outputs": [ { @@ -1778,13 +1871,13 @@ "<BarContainer object of 10 artists>" ] }, - "execution_count": 63, + "execution_count": 44, "metadata": {}, "output_type": "execute_result" }, { "data": { - "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAMp0lEQVR4nO3cf6zdd13H8eeL1qoMgia7f2jbcRttMA2iI9cyJUHDZtJlpjVhJl0CYQbSmFCZQqKdmv1R/4Fhpv7RGJoxQxQsOPnj4qrVCP7hHyy9+xGgq43XOtdWDHeAYDRaGt7+0VNyvLvt/XY79572fZ+PZMn5fr+f3O/7bN0z336/95xUFZKkm9+rpj2AJGkyDLokNWHQJakJgy5JTRh0SWpi87ROfOutt9bs7Oy0Ti9JN6WnnnrqxaqaWenY1II+OzvLwsLCtE4vSTelJP96tWPecpGkJgy6JDVh0CWpCYMuSU0MCnqSPUnOJFlMcmiF4/cnWUry7Oif905+VEnStaz6Wy5JNgFHgJ8HzgMnk8xX1XPLln6qqg6uwYySpAGGXKHvBhar6mxVXQSOAfvWdixJ0vUaEvStwLmx7fOjfcu9I8kXkzyeZPtKPyjJgSQLSRaWlpZexriSpKuZ1EPRzwKzVfUm4G+Bj6+0qKqOVtVcVc3NzKz4QSdJ0ss05JOiF4DxK+5to33fVVVfG9t8FHj4lY+m5WYPPbHm53j+Q/es+TkkrY0hV+gngZ1JdiTZAuwH5scXJPmhsc29wOnJjShJGmLVK/SqupTkIHAC2AQ8VlWnkhwGFqpqHnh/kr3AJeDrwP1rOLMkaQWDvpyrqo4Dx5fte2js9YPAg5MdTZJ0PfykqCQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgYFPcmeJGeSLCY5dI1170hSSeYmN6IkaYhVg55kE3AEuBvYBdyXZNcK614LPAA8OekhJUmrG3KFvhtYrKqzVXUROAbsW2Hd7wIfBv5ngvNJkgYaEvStwLmx7fOjfd+V5M3A9qp6YoKzSZKuwyt+KJrkVcAjwAcHrD2QZCHJwtLS0is9tSRpzJCgXwC2j21vG+274rXAG4G/T/I8cAcwv9KD0ao6WlVzVTU3MzPz8qeWJL3EkKCfBHYm2ZFkC7AfmL9ysKq+WVW3VtVsVc0CXwD2VtXCmkwsSVrRqkGvqkvAQeAEcBr4dFWdSnI4yd61HlCSNMzmIYuq6jhwfNm+h66y9ude+ViSpOvlJ0UlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpiUFBT7InyZkki0kOrXD8V5J8KcmzSf4hya7JjypJupZVg55kE3AEuBvYBdy3QrA/WVU/XlU/CTwMPDLxSSVJ1zTkCn03sFhVZ6vqInAM2De+oKq+NbZ5C1CTG1GSNMTmAWu2AufGts8Db1m+KMn7gA8AW4C3r/SDkhwADgDcdttt1zurJOkaJvZQtKqOVNWPAL8J/M5V1hytqrmqmpuZmZnUqSVJDAv6BWD72Pa20b6rOQb84isZSpJ0/YYE/SSwM8mOJFuA/cD8+IIkO8c27wH+aXIjSpKGWPUeelVdSnIQOAFsAh6rqlNJDgMLVTUPHExyF/Bt4BvAu9dyaEnSSw15KEpVHQeOL9v30NjrByY8lyTpOvlJUUlqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWpiUNCT7ElyJslikkMrHP9AkueSfDHJ3yV5/eRHlSRdy6pBT7IJOALcDewC7kuya9myZ4C5qnoT8Djw8KQHlSRd25Ar9N3AYlWdraqLwDFg3/iCqvp8Vf33aPMLwLbJjilJWs2QoG8Fzo1tnx/tu5r3AH+10oEkB5IsJFlYWloaPqUkaVUTfSia5J3AHPCRlY5X1dGqmququZmZmUmeWpI2vM0D1lwAto9tbxvt+3+S3AX8NvCzVfW/kxlPkjTUkCv0k8DOJDuSbAH2A/PjC5LcDnwU2FtVX538mJKk1awa9Kq6BBwETgCngU9X1akkh5PsHS37CPAa4M+TPJtk/io/TpK0RobccqGqjgPHl+17aOz1XROeS5J0nfykqCQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDUxKOhJ9iQ5k2QxyaEVjr8tydNJLiW5d/JjSpJWs2rQk2wCjgB3A7uA+5LsWrbsBeB+4JOTHlCSNMzmAWt2A4tVdRYgyTFgH/DclQVV9fzo2HfWYEZJ0gBDbrlsBc6NbZ8f7btuSQ4kWUiysLS09HJ+hCTpKtb1oWhVHa2quaqam5mZWc9TS1J7Q4J+Adg+tr1ttE+SdAMZEvSTwM4kO5JsAfYD82s7liTpeq0a9Kq6BBwETgCngU9X1akkh5PsBUjyU0nOA78EfDTJqbUcWpL0UkN+y4WqOg4cX7bvobHXJ7l8K0aSNCV+UlSSmjDoktSEQZekJgy6JDUx6KGoJK2n2UNPrPk5nv/QPWt+jvVm0DWI/4NJNz5vuUhSEzflFbpXi5L0Ul6hS1ITBl2SmjDoktTETXkPXdLa81nVzceg66aw1nExLOrAWy6S1IRBl6QmvOUi3cC81aTrYdClVRhV3Sy85SJJTRh0SWrCoEtSE95Dv05+2ELSjcqgS9KYm/mizVsuktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNTEo6En2JDmTZDHJoRWOf2+ST42OP5lkdtKDSpKubdWgJ9kEHAHuBnYB9yXZtWzZe4BvVNWPAr8PfHjSg0qSrm3IFfpuYLGqzlbVReAYsG/Zmn3Ax0evHwfuTJLJjSlJWk2q6toLknuBPVX13tH2u4C3VNXBsTVfHq05P9r+59GaF5f9rAPAgdHmG4Azk3ojA9wKvLjqqn583xuL77u/11fVzEoH1vX70KvqKHB0Pc
95RZKFqpqbxrmnyfe9sfi+N7Yht1wuANvHtreN9q24Jslm4HXA1yYxoCRpmCFBPwnsTLIjyRZgPzC/bM088O7R63uBz9Vq93IkSRO16i2XqrqU5CBwAtgEPFZVp5IcBhaqah74GPAnSRaBr3M5+jeaqdzquQH4vjcW3/cGtupDUUnSzcFPikpSEwZdkppoH/TVvragoyTbk3w+yXNJTiV5YNozrackm5I8k+Qvpz3LekryA0keT/KPSU4n+elpz7Qekvz66M/5l5P8WZLvm/ZM09I66AO/tqCjS8AHq2oXcAfwvg3yvq94ADg97SGm4A+Bv66qHwN+gg3w7yDJVuD9wFxVvZHLv7hxI/5SxrpoHXSGfW1BO1X1lap6evT6P7n8P/bW6U61PpJsA+4BHp32LOspyeuAt3H5N86oqotV9R/TnWrdbAa+f/QZmFcD/zbleaame9C3AufGts+zQcJ2xeibL28HnpzuJOvmD4DfAL4z7UHW2Q5gCfjj0e2mR5PcMu2h1lpVXQB+D3gB+Arwzar6m+lONT3dg76hJXkN8BfAr1XVt6Y9z1pL8gvAV6vqqWnPMgWbgTcDf1RVtwP/BbR/ZpTkB7n8t+4dwA8DtyR553Snmp7uQR/ytQUtJfkeLsf8E1X1mWnPs07eCuxN8jyXb6+9PcmfTnekdXMeOF9VV/4m9jiXA9/dXcC/VNVSVX0b+AzwM1OeaWq6B33I1xa0M/rq4o8Bp6vqkWnPs16q6sGq2lZVs1z+b/25qtoQV2tV9e/AuSRvGO26E3huiiOtlxeAO5K8evTn/k42wMPgq1nXb1tcb1f72oIpj7Ue3gq8C/hSkmdH+36rqo5PcSatvV8FPjG6eDkL/PKU51lzVfVkkseBp7n8213PsIG/BsCP/ktSE91vuUjShmHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUxP8B9uoCk0KMtNwAAAAASUVORK5CYII=\n", + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAMp0lEQVR4nO3cf6zdd13H8eeL1qoMgia7f2jbcRttMA2iI9cyJUHDZtJlpjVhJl0CYQbSmFCZQqKdmv1R/4Fhpv7RGJoxQxQsOPnj4qrVCP7hHyy9+xGgq43XOtdWDHeAYDRaGt7+0VNyvLvt/XY79572fZ+PZMn5fr+f3O/7bN0z336/95xUFZKkm9+rpj2AJGkyDLokNWHQJakJgy5JTRh0SWpi87ROfOutt9bs7Oy0Ti9JN6WnnnrqxaqaWenY1II+OzvLwsLCtE4vSTelJP96tWPecpGkJgy6JDVh0CWpCYMuSU0MCnqSPUnOJFlMcmiF4/cnWUry7Oif905+VEnStaz6Wy5JNgFHgJ8HzgMnk8xX1XPLln6qqg6uwYySpAGGXKHvBhar6mxVXQSOAfvWdixJ0vUaEvStwLmx7fOjfcu9I8kXkzyeZPtKPyjJgSQLSRaWlpZexriSpKuZ1EPRzwKzVfUm4G+Bj6+0qKqOVtVcVc3NzKz4QSdJ0ss05JOiF4DxK+5to33fVVVfG9t8FHj4lY+m5WYPPbHm53j+Q/es+TkkrY0hV+gngZ1JdiTZAuwH5scXJPmhsc29wOnJjShJGmLVK/SqupTkIHAC2AQ8VlWnkhwGFqpqHnh/kr3AJeDrwP1rOLMkaQWDvpyrqo4Dx5fte2js9YPAg5MdTZJ0PfykqCQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgYFPcmeJGeSLCY5dI1170hSSeYmN6IkaYhVg55kE3AEuBvYBdyXZNcK614LPAA8OekhJUmrG3KFvhtYrKqzVXUROAbsW2Hd7wIfBv5ngvNJkgYaEvStwLmx7fOjfd+V5M3A9qp6YoKzSZKuwyt+KJrkVcAjwAcHrD2QZCHJwtLS0is9tSRpzJCgXwC2j21vG+274rXAG4G/T/I8cAcwv9KD0ao6WlVzVTU3MzPz8qeWJL3EkKCfBHYm2ZFkC7AfmL9ysKq+WVW3VtVsVc0CXwD2VtXCmkwsSVrRqkGvqkvAQeAEcBr4dFWdSnI4yd61HlCSNMzmIYuq6jhwfNm+h66y9ude+ViSpOvlJ0UlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpiUFBT7InyZkki0kOrXD8V5J8KcmzSf4hya7JjypJupZVg55kE3AEuBvYBdy3QrA/WVU/XlU/CTwMPDLxSSVJ1zTkCn03sFhVZ6vqInAM2De+oKq+NbZ5C1CTG1GSNMTmAWu2AufGts8Db1m+KMn7gA8AW4C3r/SDkhwADgDcdttt1zurJOkaJvZQtKqOVNWPAL8J/M5V1hytqrmqmpuZmZnUqSVJDAv6BWD72Pa20b6rOQb84isZSpJ0/YYE/SSwM8mOJFuA/cD8+IIkO8c27wH+aXIjSpKGWPUeelVdSnIQOAFsAh6rqlNJDgMLVTUPHExyF/Bt4BvAu9dyaEnSSw15KEpVHQeOL9v30NjrByY8lyTpOvlJUUlqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWpiUNCT7ElyJslikkMrHP9AkueSfDHJ3yV5/eRHlSRdy6pBT7IJOALcDewC7kuya9myZ4C5qnoT8Djw8KQHlSRd25Ar9N3AYlWdraqLwDFg3/iCqvp8Vf33aPMLwLbJjilJWs2QoG8Fzo1tnx/tu5r3AH+10oEkB5IsJFlYWloaPqUkaVUTfSia5J3AHPCRlY5X1dGqmququZmZmUmeWpI2vM0D1lwAto9tbxvt+3+S3AX8NvCzVfW/kxlPkjTUkCv0k8DOJDuSbAH2A/PjC5LcDnwU2FtVX538mJKk1awa9Kq6BBwETgCngU9X1akkh5PsHS37CPAa4M+TPJtk/io/TpK0RobccqGqjgPHl+17aOz1XROeS5J0nfyk
qCQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDUxKOhJ9iQ5k2QxyaEVjr8tydNJLiW5d/JjSpJWs2rQk2wCjgB3A7uA+5LsWrbsBeB+4JOTHlCSNMzmAWt2A4tVdRYgyTFgH/DclQVV9fzo2HfWYEZJ0gBDbrlsBc6NbZ8f7btuSQ4kWUiysLS09HJ+hCTpKtb1oWhVHa2quaqam5mZWc9TS1J7Q4J+Adg+tr1ttE+SdAMZEvSTwM4kO5JsAfYD82s7liTpeq0a9Kq6BBwETgCngU9X1akkh5PsBUjyU0nOA78EfDTJqbUcWpL0UkN+y4WqOg4cX7bvobHXJ7l8K0aSNCV+UlSSmjDoktSEQZekJgy6JDUx6KGoJK2n2UNPrPk5nv/QPWt+jvVm0DWI/4NJNz5vuUhSEzflFbpXi5L0Ul6hS1ITBl2SmjDoktTETXkPXdLa81nVzceg66aw1nExLOrAWy6S1IRBl6QmvOUi3cC81aTrYdClVRhV3Sy85SJJTRh0SWrCoEtSE95Dv05+2ELSjcqgS9KYm/mizVsuktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNTEo6En2JDmTZDHJoRWOf2+ST42OP5lkdtKDSpKubdWgJ9kEHAHuBnYB9yXZtWzZe4BvVNWPAr8PfHjSg0qSrm3IFfpuYLGqzlbVReAYsG/Zmn3Ax0evHwfuTJLJjSlJWk2q6toLknuBPVX13tH2u4C3VNXBsTVfHq05P9r+59GaF5f9rAPAgdHmG4Azk3ojA9wKvLjqqn583xuL77u/11fVzEoH1vX70KvqKHB0Pc95RZKFqpqbxrmnyfe9sfi+N7Yht1wuANvHtreN9q24Jslm4HXA1yYxoCRpmCFBPwnsTLIjyRZgPzC/bM088O7R63uBz9Vq93IkSRO16i2XqrqU5CBwAtgEPFZVp5IcBhaqah74GPAnSRaBr3M5+jeaqdzquQH4vjcW3/cGtupDUUnSzcFPikpSEwZdkppoH/TVvragoyTbk3w+yXNJTiV5YNozrackm5I8k+Qvpz3LekryA0keT/KPSU4n+elpz7Qekvz66M/5l5P8WZLvm/ZM09I66AO/tqCjS8AHq2oXcAfwvg3yvq94ADg97SGm4A+Bv66qHwN+gg3w7yDJVuD9wFxVvZHLv7hxI/5SxrpoHXSGfW1BO1X1lap6evT6P7n8P/bW6U61PpJsA+4BHp32LOspyeuAt3H5N86oqotV9R/TnWrdbAa+f/QZmFcD/zbleaame9C3AufGts+zQcJ2xeibL28HnpzuJOvmD4DfAL4z7UHW2Q5gCfjj0e2mR5PcMu2h1lpVXQB+D3gB+Arwzar6m+lONT3dg76hJXkN8BfAr1XVt6Y9z1pL8gvAV6vqqWnPMgWbgTcDf1RVtwP/BbR/ZpTkB7n8t+4dwA8DtyR553Snmp7uQR/ytQUtJfkeLsf8E1X1mWnPs07eCuxN8jyXb6+9PcmfTnekdXMeOF9VV/4m9jiXA9/dXcC/VNVSVX0b+AzwM1OeaWq6B33I1xa0M/rq4o8Bp6vqkWnPs16q6sGq2lZVs1z+b/25qtoQV2tV9e/AuSRvGO26E3huiiOtlxeAO5K8evTn/k42wMPgq1nXb1tcb1f72oIpj7Ue3gq8C/hSkmdH+36rqo5PcSatvV8FPjG6eDkL/PKU51lzVfVkkseBp7n8213PsIG/BsCP/ktSE91vuUjShmHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUxP8B9uoCk0KMtNwAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 432x288 with 1 Axes>" ] @@ -1820,20 +1913,23 @@ "source": [ "### Throughput Test on PYNQ Board <a id='throughput'></a>\n", "In addition to the functional verification, FINN also offers the possibility to measure the network performance directly on the PYNQ board. This can be done using the core function `throughput_test`. In the next section we import the function and execute it.\n", - "First we extract the `remote_exec_model` again and pass it to the function. The function returns the metrics of the network as dictionary." + "First we extract the `remote_exec_model` again and pass it to the function. The function returns the metrics of the network as dictionary. 
" ] }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 45, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Network metrics: \n", - "{'runtime[ms]': 3.5953521728515625, 'throughput[images/s]': 278136.8700265252}\n" + "Network metrics:\n", + "runtime[ms]: 1.4772415161132812\n", + "throughput[images/s]: 676937.378954164\n", + "DRAM_in_bandwidth[Mb/s]: 75.81698644286635\n", + "DRAM_out_bandwidth[Mb/s]: 27.07749515816656\n" ] } ], @@ -1842,15 +1938,49 @@ "\n", "child_model = ModelWrapper(getCustomOp(sdp_node).get_nodeattr(\"model\"))\n", "res = throughput_test(child_model)\n", - "print(\"Network metrics: \\n\" + str(res))" + "print(\"Network metrics:\")\n", + "for key in res:\n", + " print(str(key) + \": \" + str(res[key]))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Together with the values for folding we can evaluate the performance of our accelerator. Each layer has a total folding factor of 64 and because the network is fully pipelined, it follows: `II = 64`. II is the initiation interval and indicates how many cycles are needed for one input to be processed. " ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 46, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "We reach approximately 43% of the ideal performance.\n" + ] + } + ], + "source": [ + "II = 64\n", + "# frequency in MHz\n", + "f_MHz = 100\n", + "# expected throughput in MFPS\n", + "expected_throughput = f_MHz / II\n", + "# measured throughput (FPS) from throughput test, converted to MFPS\n", + "measured_throughput = res[\"throughput[images/s]\"] * 0.000001\n", + "# peformance\n", + "print(\"We reach approximately \" + str(round((measured_throughput / expected_throughput)*100)) + \"% of the ideal performance.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The measured values were recorded with a batch size of 1000 and at a frequency of 100 MHz. We will be improving the efficiency of the generated accelerator examples in the coming FINN releases." + ] } ], "metadata": { diff --git a/notebooks/end2end_example/tfc_end2end_verification.ipynb b/notebooks/end2end_example/tfc_end2end_verification.ipynb index f03add2da..09b115fa4 100644 --- a/notebooks/end2end_example/tfc_end2end_verification.ipynb +++ b/notebooks/end2end_example/tfc_end2end_verification.ipynb @@ -9,7 +9,7 @@ "\n", "**Important: This notebook depends on the tfc_end2end_example notebook, because we are using models that are available at intermediate steps in the end-to-end flow. So please make sure the needed .onnx files are generated to run this notebook.**\n", "\n", - "In this notebook, we will show how to take the intermediate results of the end-to-end tfc example and verify their functionality with different methods. In the following picture you can see the block in the end-to-end flow about the *Simulation & Emulation Flows*. Besides the methods in this notebook, there is another one that is covered in the Jupyter notebook [tfc_end2end_example](tfc_end2end_example.ipynb): remote execution. The remote execution allows functional verification directly on the PYNQ board, for details please have a look at the mentioned Jupyter notebook." + "In this notebook, we will show how to take the intermediate results of the end-to-end tfc example and verify their functionality with different methods. 
In the following picture you can see the section in the end-to-end flow about the *Simulation & Emulation Flows*. Besides the methods in this notebook, there is another one that is covered in the Jupyter notebook [tfc_end2end_example](tfc_end2end_example.ipynb): remote execution. The remote execution allows functional verification directly on the PYNQ board, for details please have a look at the mentioned Jupyter notebook." ] }, { @@ -32,18 +32,9 @@ "metadata": {}, "outputs": [], "source": [ - "import inspect\n", - "import netron\n", "from finn.util.basic import make_build_dir\n", - "from IPython.display import IFrame\n", - "\n", - "def showSrc(what):\n", - " print(\"\".join(inspect.getsourcelines(what)[0]))\n", - " \n", - "def showInNetron(model_filename):\n", - " netron.start(model_filename, port=8081, host=\"0.0.0.0\")\n", - " return IFrame(src=\"http://0.0.0.0:8081/\", width=\"100%\", height=400)\n", - " \n", + "from finn.util.visualization import showSrc, showInNetron\n", + " \n", "build_dir = \"/workspace/finn\"" ] }, @@ -51,7 +42,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To verify the simulations a \"golden\" output is calculated as a reference. This is calculated directly from the Brevitas model using PyTorch, by running some example data from the MNIST dataset through the trained model." + "To verify the simulations, a \"golden\" output is calculated as a reference. This is calculated directly from the Brevitas model using PyTorch, by running some example data from the MNIST dataset through the trained model." ] }, { @@ -62,8 +53,8 @@ { "data": { "text/plain": [ - "array([[-0.4992097 , -0.24960485, 6.489726 , 0.99841946, -0.24960482,\n", - " -2.2464437 , 0.7488146 , -1.4976292 , -0.49920973, -2.7456534 ]],\n", + "array([[-1.119972 , -1.7596636, 0.8423852, -1.0705007, -1.3218282,\n", + " -1.5030646, -1.4598225, -1.2803943, -1.0334575, -1.7878995]],\n", " dtype=float32)" ] }, @@ -91,9 +82,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Simulation using Python \n", + "## Simulation using Python <a id='simpy'></a>\n", "\n", - "If an ONNX model consists of [standard ONNX](https://github.com/onnx/onnx/blob/master/docs/Operators.md) nodes and/or FINN custom operations that do not belong to the fpgadataflow (backend $\\neq$ \"fpgadataflow\") this model can be checked for functionality using Python. General information about FINN custom op nodes can be found in Jupyter notebook [2_custom_op.ipynb](../internals/2_custom_op.ipynb).\n", + "If an ONNX model consists of [standard ONNX](https://github.com/onnx/onnx/blob/master/docs/Operators.md) nodes and/or FINN custom operations that do not belong to the fpgadataflow (backend $\\neq$ \"fpgadataflow\") this model can be checked for functionality using Python.\n", "\n", "To simulate a standard ONNX node [onnxruntime](https://github.com/microsoft/onnxruntime) is used. onnxruntime is an open source tool developed by Microsoft to run standard ONNX nodes. For the FINN custom op nodes execution functions are defined. 
The following is an example of the execution function of a XNOR popcount node.\n" ] @@ -111,8 +102,10 @@ " \"\"\"Simulates XNOR-popcount matrix multiplication as a regular bipolar\n", " matrix multiplication followed by some post processing.\"\"\"\n", " # extract the operand shapes\n", - " (M, K0) = inp0.shape\n", - " (K1, N) = inp1.shape\n", + " # (M, K0) = inp0.shape\n", + " # (K1, N) = inp1.shape\n", + " K0 = inp0.shape[-1]\n", + " K1 = inp1.shape[0]\n", " # make sure shapes are compatible with matmul\n", " assert K0 == K1, \"Matrix shapes are not compatible with matmul.\"\n", " K = K0\n", @@ -202,7 +195,7 @@ "source": [ "## Simulation (npysim) using C++\n", "\n", - "When dealing with HLS custom op nodes in FINN the simulation using Python is no longer sufficient. After the nodes have been converted to HLS layers, the simulation using C++ can be used. To do this, the input tensor is stored in an .npy file and C++ code is generated that reads the values from the .npy array, streams them to the corresponding finn-hlslib function and writes the result to a new .npy file. This in turn can be read in Python and processed in the FINN flow. For this example the model after the conversion to HLS layers is used." + "When dealing with HLS custom op nodes in FINN the simulation using Python is no longer sufficient. After the nodes have been converted to HLS layers, the simulation using C++ can be used. To do this, the input tensor is stored in an .npy file and C++ code is generated that reads the values from the .npy array, streams them to the corresponding finn-hlslib function and writes the result to a new .npy file. This in turn can be read in Python and processed in the FINN flow. For this example the model after setting the folding factors in the HLS layers is used, please be aware that this is not the full model, but the dataflow partition, so before executing at the end of this section we have to integrate the model back into the parent model." 
] }, { @@ -211,7 +204,7 @@ "metadata": {}, "outputs": [], "source": [ - "model_for_npysim = ModelWrapper(build_dir+\"/tfc_w1_a1_hls_layers.onnx\")" + "model_for_npysim = ModelWrapper(build_dir+\"/tfc_w1_a1_set_folding_factors.onnx\")" ] }, { @@ -231,7 +224,9 @@ "source": [ "from finn.transformation.fpgadataflow.codegen_npysim import CodeGen_npysim\n", "from finn.transformation.fpgadataflow.compile import Compile\n", + "from finn.transformation.general import GiveUniqueNodeNames\n", "\n", + "model_for_npysim = model_for_npysim.transform(GiveUniqueNodeNames())\n", "model_for_npysim = model_for_npysim.transform(CodeGen_npysim())\n", "model_for_npysim = model_for_npysim.transform(Compile())" ] @@ -269,7 +264,7 @@ " " ], "text/plain": [ - "<IPython.lib.display.IFrame at 0x7fb461dd6710>" + "<IPython.lib.display.IFrame at 0x7f8dfdb29c18>" ] }, "execution_count": 8, @@ -302,14 +297,15 @@ "name": "stdout", "output_type": "stream", "text": [ - "compile.sh execute_StreamingFCLayer_Batch.cpp\tnode_model params.h thresh.h\r\n" + "compile.sh\t\t\t memblock_0.dat thresh.h\r\n", + "execute_StreamingFCLayer_Batch.cpp node_model\t weights.npy\r\n" ] } ], "source": [ "from finn.custom_op.registry import getCustomOp\n", "\n", - "fc0 = model_for_npysim.graph.node[2]\n", + "fc0 = model_for_npysim.graph.node[1]\n", "fc0w = getCustomOp(fc0)\n", "code_gen_dir = fc0w.get_nodeattr(\"code_gen_dir_npysim\")\n", "!ls {code_gen_dir}" @@ -337,14 +333,15 @@ "source": [ "from finn.transformation.fpgadataflow.set_exec_mode import SetExecMode\n", "\n", - "model_for_npysim = model_for_npysim.transform(SetExecMode(\"npysim\"))" + "model_for_npysim = model_for_npysim.transform(SetExecMode(\"npysim\"))\n", + "model_for_npysim.save(build_dir+\"/tfc_w1_a1_for_npysim.onnx\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Now the model can be executed using `execute_onnx`. The function reads the `exec_mode` and writes the input into the correct directory in a .npy file. To be able to read this in C++, there is an additional .hpp file ([npy2apintstream.hpp](https://github.com/Xilinx/finn/blob/master/src/finn/data/cpp/npy2apintstream.hpp)) in FINN, which uses cnpy to read .npy files and convert them into streams, or to read a stream and write it into an .npy. [cnpy](https://github.com/rogersce/cnpy) is a helper to read and write .npy and .npz formates in C++.\n", + "Before the model can be executed using `execute_onnx`, we integrate the child model in the parent model. The function reads then the `exec_mode` and writes the input into the correct directory in a .npy file. To be able to read this in C++, there is an additional .hpp file ([npy2apintstream.hpp](https://github.com/Xilinx/finn/blob/master/src/finn/data/cpp/npy2apintstream.hpp)) in FINN, which uses cnpy to read .npy files and convert them into streams, or to read a stream and write it into an .npy. [cnpy](https://github.com/rogersce/cnpy) is a helper to read and write .npy and .npz formates in C++.\n", "\n", "The result is again compared to the \"golden\" output." 
] @@ -363,7 +360,11 @@ } ], "source": [ - "output_dict = oxe.execute_onnx(model_for_npysim, input_dict)\n", + "parent_model = ModelWrapper(build_dir+\"/tfc_w1_a1_dataflow_parent.onnx\")\n", + "sdp_node = parent_model.graph.node[2]\n", + "child_model = build_dir + \"/tfc_w1_a1_for_npysim.onnx\"\n", + "getCustomOp(sdp_node).set_nodeattr(\"model\", child_model)\n", + "output_dict = oxe.execute_onnx(parent_model, input_dict)\n", "output_npysim = output_dict[list(output_dict.keys())[0]]\n", "\n", "if np.isclose(output_npysim, output_golden, atol=1e-3).all():\n", @@ -398,7 +399,7 @@ "source": [ "### Emulation of model node-by-node\n", "\n", - "The child model is loaded and the `exec_mode` for each node is set. Then it is saved in a new .onnx file so that the changed model can be referenced in the parent model." + "The child model is loaded and the `exec_mode` for each node is set. To prepare the node-by-node emulation the transformation `PrepareRTLSim` is applied to the child model. With this transformation the emulation files are created for each node and can be used directly when calling `execute_onnx()`. Each node has a new node attribute \"rtlsim_so\" after transformation, which contains the path to the corresponding emulation files. Then it is saved in a new .onnx file so that the changed model can be referenced in the parent model." ] }, { @@ -407,8 +408,10 @@ "metadata": {}, "outputs": [], "source": [ + "from finn.transformation.fpgadataflow.prepare_rtlsim import PrepareRTLSim\n", "child_model = ModelWrapper(build_dir + \"/tfc_w1_a1_ipgen.onnx\")\n", "child_model = child_model.transform(SetExecMode(\"rtlsim\"))\n", + "child_model = child_model.transform(PrepareRTLSim())\n", "child_model.save(build_dir + \"/tfc_w1_a1_dataflow_child.onnx\")" ] }, @@ -519,13 +522,6 @@ "else:\n", " print(\"The results are not the same!\")" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { -- GitLab
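Two short standalone sketches follow. They are not part of the patch above; they only make it easier to check two of the claims the updated notebook cells rely on. Both are illustrative, the variable names are arbitrary, the numeric constants are simply the values printed in the patch, and nothing here requires FINN, Vivado or a PYNQ board.

First, the efficiency figure. The updated example notebook states that each layer has a total folding factor of 64, that the accelerator runs at 100 MHz, and that the measured throughput is about 676937 images/s, from which it concludes "approximately 43% of the ideal performance". A fully pipelined dataflow accelerator with initiation interval II accepts one new input every II clock cycles, so the ideal frame rate is f_clk / II; the quoted numbers are consistent with that:

    # Sanity check of the throughput efficiency quoted in the notebook (sketch).
    f_clk_hz = 100e6                  # 100 MHz clock, as stated in the notebook
    ii = 64                           # total folding factor per layer -> II = 64
    ideal_fps = f_clk_hz / ii         # 1.5625 million images/s if fully pipelined
    measured_fps = 676937.378954164   # throughput[images/s] from throughput_test
    efficiency = measured_fps / ideal_fps
    print(f"ideal={ideal_fps:.0f} images/s, measured={measured_fps:.0f} images/s, "
          f"efficiency={efficiency:.0%}")   # prints roughly 43%

Second, the XNOR-popcount execution function that the verification notebook walks through. The patch reworks how `xnorpopcountmatmul` extracts the operand shapes, but the underlying identity is unchanged: for bipolar {-1,+1} values stored as {0,1} bits, a dot product over K positions equals 2 * popcount(XNOR(x, y)) - K. A minimal NumPy check of that identity, independent of the FINN implementation, looks like this:

    import numpy as np

    # Check: bipolar matmul == 2 * popcount(XNOR) - K for {0,1}-encoded operands.
    rng = np.random.default_rng(0)
    K = 8
    a_bits = rng.integers(0, 2, size=(3, K))    # {0,1} encoding of bipolar values
    b_bits = rng.integers(0, 2, size=(K, 2))
    a_bip = 2 * a_bits - 1                      # map {0,1} -> {-1,+1}
    b_bip = 2 * b_bits - 1
    ref = a_bip @ b_bip                         # regular bipolar matrix product
    # XNOR of two bits is 1 exactly when they are equal; summing over K gives the popcount.
    xnor_pop = (a_bits[:, :, None] == b_bits[None, :, :]).sum(axis=1)
    assert np.array_equal(2 * xnor_pop - K, ref)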