[notebook - code gen]current work status on compilation part of the notebook

Commit 502cc526 authored 5 years ago by auphelia
--- a/notebooks/FCLayer_graph.onnx
+++ b/notebooks/FCLayer_graph.onnx
--- a/notebooks/FINN-CodeGenerationAndCompilation.ipynb
+++ b/notebooks/FINN-CodeGenerationAndCompilation.ipynb
@@ -245,7 +245,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
@@ -283,7 +283,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
@@ -547,6 +547,101 @@
    "### Compilation"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">The compilation is a transformation pass like the code generation. The code of this transformation is shown below. </font>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "class Compile(Transformation):\n",
+      "    \"\"\"Compile for all nodes in model\"\"\"\n",
+      "\n",
+      "    def __init__(self):\n",
+      "        super().__init__()\n",
+      "\n",
+      "    def apply(self, model):\n",
+      "        for node in model.graph.node:\n",
+      "            op_type = node.op_type\n",
+      "            if node.domain == \"finn\":\n",
+      "                backend_attribute = util.get_by_name(node.attribute, \"backend\")\n",
+      "                if backend_attribute is None:\n",
+      "                    continue\n",
+      "                backend_value = backend_attribute.s.decode(\"UTF-8\")\n",
+      "                if backend_value == \"fpgadataflow\":\n",
+      "                    try:\n",
+      "                        # lookup op_type in registry of CustomOps\n",
+      "                        inst = registry.custom_op[op_type](node)\n",
+      "                        # ensure that code is generated\n",
+      "                        assert inst.get_nodeattr(\"code_gen_dir\") != \"\"\n",
+      "                        # call the compilation function for this node\n",
+      "                        inst.compile_singlenode_code()\n",
+      "                        # ensure that executable path is now set\n",
+      "                        assert inst.get_nodeattr(\"executable_path\") != \"\"\n",
+      "                    except KeyError:\n",
+      "                        # exception if op_type is not supported\n",
+      "                        raise Exception(\n",
+      "                            \"Custom op_type %s is currently not supported.\" % op_type\n",
+      "                        )\n",
+      "        return (model, False)\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from finn.transformation.fpgadataflow.compile import Compile\n",
+    "showSrc(Compile)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">The scheme resembles that of the code generation transformation pass. The pass iterates over all nodes in the model and triggers the compilation for every node with `domain=\"finn\"` and `backend=\"fpgadataflow\"`. First an instance of the node is created and an assertion checks that the code was generated, i.e. that the node attribute `code_gen_dir` is not empty. Then the function `compile_singlenode_code()` is executed, and a second assertion checks that the path to the executable has been set. An exception is raised if the custom op_type is not supported. \n",
+    "\n",
+    "The actual compilation is done with the function `compile_singlenode_code()`. What happens inside the function is shown below.</font>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "    def compile_singlenode_code(self):\n",
+      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
+      "        builder = CppBuilder()\n",
+      "        builder.append_includes(\"-I/workspace/finn/src/finn/data/cpp\")\n",
+      "        builder.append_includes(\"-I/workspace/cnpy/\")\n",
+      "        builder.append_includes(\"-I/workspace/finn-hlslib\")\n",
+      "        builder.append_includes(\"-I/workspace/vivado-hlslib\")\n",
+      "        builder.append_includes(\"--std=c++11\")\n",
+      "        builder.append_sources(code_gen_dir + \"/*.cpp\")\n",
+      "        builder.append_sources(\"/workspace/cnpy/cnpy.cpp\")\n",
+      "        builder.append_includes(\"-lz\")\n",
+      "        builder.set_executable_path(code_gen_dir + \"/node_model\")\n",
+      "        builder.build(code_gen_dir)\n",
+      "        self.set_nodeattr(\"executable_path\", builder.executable_path)\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "showSrc(StreamingFCLayer_Batch.compile_singlenode_code)"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": null,

 %% Cell type:markdown id: tags:

 # FINN - Code Generation and Compilation
 -----------------------------------------------------------------
 <font size="3">This notebook is about code generation and compilation to enable execution of FINN custom operation nodes.

 The following `showSrc` function is used to print the source code of functions and classes in the Jupyter notebook:</font>

 %% Cell type:code id: tags:

 ``` python
 import inspect

 def showSrc(what):
    print("".join(inspect.getsourcelines(what)[0]))
 ```

 %% Cell type:markdown id: tags:

 ## Outline
 -------------
 * <font size="3">Example model</font>
 * <font size="3">Code generation</font>
 * <font size="3">Compilation</font>

 %% Cell type:markdown id: tags:

 ### Example model
 <font size="3">To show the code generation and compilation of a node, an example model with a streaming fclayer node is first created. To learn more about FINN custom operation nodes, please take a look at notebook [FINN-CustomOps](FINN-CustomOps.ipynb).

 First, `TensorProto` and `helper` are imported from ONNX. They can be used to create tensors, nodes, graphs and models in ONNX. Additional functions from `util` and the classes `DataType` and `ModelWrapper` are needed. More information about `DataType` and `ModelWrapper` can be found in Jupyter notebook [FINN-ModelWrapper](FINN-ModelWrapper.ipynb).</font>

 %% Cell type:code id: tags:

 ``` python
 from onnx import TensorProto, helper
 import finn.core.utils as util
 from finn.core.datatype import DataType
 from finn.core.modelwrapper import ModelWrapper
 ```

 %% Cell type:markdown id: tags:

 <font size="3">Then all parameters that are needed to create a streaming fclayer are set. To keep the example clear, small values are chosen. For more information about the parameters, please take a look at the documentation of the [finn-hls library](https://finn-hlslib.readthedocs.io/en/latest/).</font>

 %% Cell type:code id: tags:

 ``` python
 idt = wdt = odt = DataType.BIPOLAR
 mw = 8
 mh = 8
 pe = 4
 simd = 4
 nf = mh // pe
 sf = mw // simd
 ```
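
<font size="3">The integer divisions in the cell above truncate silently if the values do not divide evenly. The following is a minimal sketch of the same arithmetic with explicit checks. It assumes (this is not stated in the notebook) that `simd` should divide `mw` and `pe` should divide `mh`; `folding_factors` is a hypothetical helper and not part of FINN.</font>

```python
def folding_factors(mw, mh, simd, pe):
    """Return (sf, nf): the number of SIMD-wide chunks per row (mw // simd)
    and the number of PE-wide chunks per column (mh // pe)."""
    # Assumption: uneven folding is treated as an error in this sketch.
    if mw % simd != 0:
        raise ValueError("simd=%d does not divide mw=%d" % (simd, mw))
    if mh % pe != 0:
        raise ValueError("pe=%d does not divide mh=%d" % (pe, mh))
    return mw // simd, mh // pe


# Same values as in the notebook cell above.
sf, nf = folding_factors(mw=8, mh=8, simd=4, pe=4)
print(sf, nf)  # 2 2
```

<font size="3">With these values the input tensor created later has shape `[1, sf, simd] = [1, 2, 4]` and the output tensor has shape `[1, nf, pe] = [1, 2, 4]`.</font>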

 %% Cell type:markdown id: tags:

 <font size="3">A `tensor_value_info` is created for all tensors involved. In this case there is one tensor for the weights besides the input and output tensors. Then an input list is created containing the two inputs (`"inp"` and `"weights"`).</font>

 %% Cell type:code id: tags:

 ``` python
 inp = helper.make_tensor_value_info("inp", TensorProto.FLOAT, [1, sf, simd])
 weights = helper.make_tensor_value_info("weights", TensorProto.FLOAT, [mw, mh])
 outp = helper.make_tensor_value_info("outp", TensorProto.FLOAT, [1, nf, pe])
 node_inp_list = ["inp", "weights"]
 ```

 %% Cell type:markdown id: tags:

 <font size="3">Now the node can be created. The operation type is set to `"StreamingFCLayer_Batch"` and the rest of the attributes are set appropriately. The relevant attributes for the activation of the code generation and compilation are:</font>
 * <font size="3">**`domain="finn"`**: specifies that the created node is a FINN-Custom Op</font>
 * <font size="3">**`backend="fpgadataflow"`**: specifies that it is a node that corresponds to a function in the finn-hls library</font>
 * <font size="3">**`code_gen_dir`**: specifies the path to the directory where the generated C++ files are located (is set during code generation)</font>
 * <font size="3">**`executable_path`**: specifies the path to the executable created after compilation (is set during compilation)</font>

 %% Cell type:code id: tags:

 ``` python
 FCLayer_node = helper.make_node(
        "StreamingFCLayer_Batch",
        node_inp_list,
        ["outp"],
        domain="finn",
        backend="fpgadataflow",
        code_gen_dir="",
        executable_path="",
        resType="ap_resource_lut()",
        MW=mw,
        MH=mh,
        SIMD=simd,
        PE=pe,
        noActivation=1,
        binaryXnorMode=1,
        inputDataType=idt.name,
        weightDataType=wdt.name,
        outputDataType=odt.name,
 )
 ```

 %% Cell type:markdown id: tags:

 <font size="3"> The node is packed into a graph environment and the inputs and outputs are set.</font>

 %% Cell type:code id: tags:

 ``` python
 graph = helper.make_graph(
        nodes=[FCLayer_node], name="fclayer_graph", inputs=[inp], outputs=[outp]
    )
 ```

 %% Cell type:markdown id: tags:

 <font size="3">A model is now created from the graph, which is then converted into a ModelWrapper object for further processing in FINN. Afterwards the ModelWrapper internal functions can be used to set the FINN data types and the initializer for the weights. Since this is an example, the weights are not taken from the training, but random values are generated using the utility function `gen_finn_dt_tensor()`. This function gets a FINN datatype and a shape and generates a tensor with values of this datatype in the desired shape.</font>

 %% Cell type:code id: tags:

 ``` python
 model = helper.make_model(graph, producer_name="fclayer-model")
 model = ModelWrapper(model)

 model.set_tensor_datatype("inp", idt)
 model.set_tensor_datatype("outp", odt)
 model.set_tensor_datatype("weights", wdt)
 W = util.gen_finn_dt_tensor(wdt, (mw, mh))
 model.set_initializer("weights", W)
 ```
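
<font size="3">The internals of `gen_finn_dt_tensor()` are not shown in this notebook. As a rough illustration of what a random tensor for `DataType.BIPOLAR` looks like, the sketch below draws values from {-1, +1} using only the Python standard library. `gen_bipolar_tensor` is a hypothetical helper for illustration, not the FINN implementation, and it returns nested lists instead of a numpy array.</font>

```python
import random


def gen_bipolar_tensor(shape, seed=None):
    """Return a 2-D nested list of the given shape with values in {-1, +1}.

    Hypothetical stand-in for illustration; not FINN's gen_finn_dt_tensor().
    """
    rng = random.Random(seed)
    rows, cols = shape
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]


# Same shape as the weight tensor in the notebook: (mw, mh) = (8, 8).
W_example = gen_bipolar_tensor((8, 8), seed=0)
for row in W_example:
    print(row)
```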

 %% Cell type:markdown id: tags:

 <font size="3">The model is saved and then netron is used to visualize the resulting model. </font>

 %% Cell type:code id: tags:

 ``` python
 model.save("FCLayer_graph.onnx")
 ```

 %% Cell type:code id: tags:

 ``` python
 import netron
 netron.start('FCLayer_graph.onnx', port=8081, host="0.0.0.0")
 ```

 %% Output

    Serving 'FCLayer_graph.onnx' at http://0.0.0.0:8081

 %% Cell type:code id: tags:

 ``` python
 %%html
 <iframe src="http://0.0.0.0:8081/" style="position: relative; width: 100%;" height="400"></iframe>
 ```

 %% Output


 %% Cell type:markdown id: tags:

 ### Code Generation
 <font size="3">Code generation is a transformation that can be applied to the model. For more information about transformation passes, see Jupyter Notebook [FINN-HowToTransformPass](FINN-HowToTransformPass.ipynb).

 The code generation transformation is shown below.</font>

 %% Cell type:code id: tags:

 ``` python
 from finn.transformation.fpgadataflow.codegen import CodeGen
 showSrc(CodeGen)
 ```

 %% Output

    class CodeGen(Transformation):
        """Code generation for all nodes in model"""
    
        def apply(self, model):
            for node in model.graph.node:
                if node.domain == "finn":
                    backend_attribute = get_by_name(node.attribute, "backend")
                    if backend_attribute is None:
                        continue
                    backend_value = backend_attribute.s.decode("UTF-8")
                    if backend_value == "fpgadataflow":
                        _codegen_single_node(node, model)
            return (model, False)
    

 %% Cell type:markdown id: tags:

 <font size="3">The transformation pass iterates over all nodes in the model. For every node with `domain="finn"` and `backend="fpgadataflow"`, the function `_codegen_single_node()` is executed, which is also part of the transformation pass and is shown below. </font>
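
<font size="3">The node selection used by the pass can be sketched in isolation. The example below is a simplified model, not FINN code: `Node` is a hypothetical stand-in for an ONNX node whose attributes are kept in a plain dictionary, and the op names other than `StreamingFCLayer_Batch` are placeholders.</font>

```python
from collections import namedtuple

# Hypothetical, simplified node: real ONNX nodes store attributes differently.
Node = namedtuple("Node", ["op_type", "domain", "attrs"])


def select_fpgadataflow_nodes(nodes):
    """Return the nodes that the pass would hand to code generation."""
    selected = []
    for node in nodes:
        if node.domain != "finn":
            continue  # not a FINN custom op
        backend = node.attrs.get("backend")
        if backend is None:
            continue  # FINN custom op without a backend attribute
        if backend == "fpgadataflow":
            selected.append(node)
    return selected


nodes = [
    Node("StreamingFCLayer_Batch", "finn", {"backend": "fpgadataflow"}),
    Node("SomeFinnOpWithoutBackend", "finn", {}),
    Node("SomeStandardOnnxOp", "", {}),
]
print([n.op_type for n in select_fpgadataflow_nodes(nodes)])
# ['StreamingFCLayer_Batch']
```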

 %% Cell type:code id: tags:

 ``` python
 from finn.transformation.fpgadataflow.codegen import _codegen_single_node
 showSrc(_codegen_single_node)
 ```

 %% Output

    def _codegen_single_node(node, model):
        """Call custom implementation to generate code for single custom node
        and create folder that contains all the generated files"""
        op_type = node.op_type
        try:
            # lookup op_type in registry of CustomOps
            inst = registry.custom_op[op_type](node)
            # get the path of the code generation directory
            code_gen_dir = inst.get_nodeattr("code_gen_dir")
            # ensure that there is a directory
            if code_gen_dir == "" or not os.path.isdir(code_gen_dir):
                code_gen_dir = tmp.mkdtemp(prefix="code_gen_" + str(node.op_type) + "_")
                inst.set_nodeattr("code_gen_dir", code_gen_dir)
            # ensure that there is generated code inside the dir
            inst.code_generation(model)
        except KeyError:
            # exception if op_type is not supported
            raise Exception("Custom op_type %s is currently not supported." % op_type)
    

 %% Cell type:markdown id: tags:

 <font size="3">An instance of the node is created and checked for the attribute `code_gen_dir`. If the attribute is not set, or does not point to an existing directory, a temporary directory is created and the attribute is set accordingly.

 Then the `code_generation()` function of the instance is called. If a `KeyError` occurs during this process, an exception is raised stating that the CustomOp is not yet supported, since this most likely means that the `op_type` was not found in the registry. The following description of the code generation within the CustomOp instance may overlap with the Jupyter notebook [FINN-CustomOps](FINN-CustomOps.ipynb).

 In order to clarify the individual components involved in code generation, an instance of the node is first created, as in the `_codegen_single_node` function. This is done by looking up the op_type in the [registry](https://github.com/Xilinx/finn/blob/dev/src/finn/custom_op/registry.py) of CustomOps. The instance contains a template for code generation which is shown below.</font>

 %% Cell type:code id: tags:

 ``` python
 import finn.custom_op.registry as registry
 node = FCLayer_node
 op_type = FCLayer_node.op_type
 inst = registry.custom_op[op_type](node)
 print(inst.docompute_template)
 ```

 %% Output

    
            #include "cnpy.h"
            #include "npy2apintstream.hpp"
            #include <vector>
            #include "bnn-library.h"
    
            // includes for network parameters
            $GLOBALS$
    
            // defines for network parameters
            $DEFINES$
    
            int main(){
    
            $STREAMDECLARATIONS$
    
            $READNPYDATA$
    
            $DOCOMPUTE$
    
            $DATAOUTSTREAM$
    
            $SAVEASCNPY$
    
            }
    
    

 %% Cell type:markdown id: tags:

 <font size="3">The template has some general constructs, like the inclusion of bnn-library.h, which contains the references to the finn-hls library, and of cnpy.h and npy2apintstream.hpp, which support the transfer of Python NumPy arrays to C++. The idea of this template is to replace the variables marked with `$ $` with C++ calls during code generation. Then the template can be written into a .cpp file and compiled.

 The sub-functions that are called during code generation are shown below.</font>

 %% Cell type:code id: tags:

 ``` python
 from finn.custom_op.fpgadataflow.streamingfclayer_batch import StreamingFCLayer_Batch
 showSrc(StreamingFCLayer_Batch.code_generation)
 ```

 %% Output

        def code_generation(self, model):
            node = self.onnx_node
            self.generate_params(model)
            self.global_includes()
            self.defines()
            self.read_npy_data()
            self.strm_decl()
            self.docompute()
            self.dataoutstrm()
            self.save_as_npy()
    
            template = self.docompute_template
    
            for key in self.code_gen_dict:
                # transform list into long string separated by '\n'
                code_gen_line = "\n".join(self.code_gen_dict[key])
                template = template.replace(key, code_gen_line)
            code_gen_dir = self.get_nodeattr("code_gen_dir")
            f = open(os.path.join(code_gen_dir, "execute_{}.cpp".format(node.op_type)), "w")
            f.write(template)
            f.close()
    

 %% Cell type:markdown id: tags:

 <font size="3">Except for the function `generate_params(model)`, all functions needed to fill the template correspond to the `$ $` variable names, e.g. the function `defines()` provides the part of the C++ code that replaces `$DEFINES$` in the template. The individual functions are member functions of the class HLSCustomOp and are defined in each CustomOp. The code for a StreamingFCLayer_Batch node can be found in its [source file](https://github.com/Xilinx/finn/blob/dev/src/finn/custom_op/fpgadataflow/streamingfclayer_batch.py).</font>
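
<font size="3">The substitution step at the end of `code_generation()` can be tried out on its own. The sketch below uses a shortened template and made-up replacement lines; the dictionary contents are placeholders chosen for illustration and are not what FINN actually generates.</font>

```python
# Shortened version of the template shown earlier in the notebook.
TEMPLATE = """
#include <vector>

$GLOBALS$

$DEFINES$

int main(){

$DOCOMPUTE$

}
"""


def fill_template(template, code_gen_dict):
    """Replace each $KEY$ with its list of lines joined by newlines."""
    for key, lines in code_gen_dict.items():
        template = template.replace(key, "\n".join(lines))
    return template


# Placeholder content: each key maps to a list of C++ lines.
code_gen_dict = {
    "$GLOBALS$": ['#include "params.h"'],
    "$DEFINES$": ["#define MW 8", "#define MH 8"],
    "$DOCOMPUTE$": ["// the call into the HLS library would go here"],
}

source = fill_template(TEMPLATE, code_gen_dict)
print(source)
```

<font size="3">Keeping each entry as a list of lines, as the original code does, lets every sub-function append statements independently before they are joined into one string.</font>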

 %% Cell type:markdown id: tags:

 <font size="3">A special function for code generation for the StreamingFCLayer_Batch node is the `generate_params(model)` function. Besides the normal input tensor, an fc layer has weight values as input and can get additional thresholds for activation. This function reads the values for the weights and thresholds via the `get_initializer` function of the ModelWrapper and writes them as C++ code into .h files (`params.h` and `thresh.h`), which are added to the includes.

 The `generate_params` function of the StreamingFCLayer_Batch is shown below.</font>

 %% Cell type:code id: tags:

 ``` python
 showSrc(StreamingFCLayer_Batch.generate_params)
 ```

 %% Output

        def generate_params(self, model):
            # weights
            weights = model.get_initializer(self.onnx_node.input[1])
            # convert weights into hlslib-compatible format
            weight_tensor = self.get_hls_compatible_weight_tensor(weights)
            export_wdt = self.get_weight_datatype()
            # we have converted bipolar weights to binary for export,
            # so use it as such for weight generation
            if self.get_weight_datatype() == DataType.BIPOLAR:
                export_wdt = DataType.BINARY
            weight_hls_code = numpy_to_hls_code(
                weight_tensor, export_wdt, "weights", True, True
            )
            # write weights into params.h
            code_gen_dir = self.get_nodeattr("code_gen_dir")
            f_weights = open("{}/params.h".format(code_gen_dir), "w")
    
            if export_wdt.bitwidth() != 1:
                f_weights.write(
                    "static FixedPointWeights<{},{},{},{}> weights = ".format(
                        self.get_nodeattr("SIMD"),
                        export_wdt.get_hls_datatype_str(),
                        self.get_nodeattr("PE"),
                        self.calc_wmem(),
                    )
                )
            else:
                f_weights.write(
                    "static BinaryWeights<{},{},{}> weights = ".format(
                        self.get_nodeattr("SIMD"), self.get_nodeattr("PE"), self.calc_wmem()
                    )
                )
            f_weights.write(weight_hls_code)
            f_weights.close()
            # thresholds
            if len(self.onnx_node.input) > 2:
                thresholds = model.get_initializer(self.onnx_node.input[2])
                if thresholds is not None:
                    threshold_tensor = self.get_hls_compatible_threshold_tensor(thresholds)
                    tdt = DataType.INT32
                    # use UINT32 threshold export for bipolar times bipolar
                    inp_is_bipolar = self.get_input_datatype() == DataType.BIPOLAR
                    wt_is_bipolar = self.get_weight_datatype() == DataType.BIPOLAR
                    # reinterpret inp/wt as bipolar if bin_xnor_mode is iset
                    inp_is_binary = self.get_input_datatype() == DataType.BINARY
                    wt_is_binary = self.get_weight_datatype() == DataType.BINARY
                    bin_xnor_mode = self.get_nodeattr("binaryXnorMode") == 1
                    inp_is_bipolar = inp_is_bipolar or (inp_is_binary and bin_xnor_mode)
                    wt_is_bipolar = wt_is_bipolar or (wt_is_binary and bin_xnor_mode)
                    if inp_is_bipolar and wt_is_bipolar:
                        tdt = DataType.UINT32
                    thresholds_hls_code = numpy_to_hls_code(
                        threshold_tensor, tdt, "thresholds", False, True
                    )
                    # write thresholds into thresh.h
                    code_gen_dir = self.get_nodeattr("code_gen_dir")
                    f_thresh = open("{}/thresh.h".format(code_gen_dir), "w")
                    tdt_hls = tdt.get_hls_datatype_str()
                    # use binary to export bipolar activations
                    export_odt = self.get_output_datatype()
                    if self.get_output_datatype() == DataType.BIPOLAR:
                        export_odt = DataType.BINARY
                    odt_hls = export_odt.get_hls_datatype_str()
                    f_thresh.write(
                        "static ThresholdsActivation<{},{},{},{},{},{},{}> threshs \
                         = ".format(
                            self.calc_tmem(),
                            self.get_nodeattr("PE"),
                            threshold_tensor.shape[-1],
                            tdt_hls,
                            odt_hls,
                            self.get_nodeattr("ActVal"),
                            "std::less_equal<%s>" % tdt_hls,
                        )
                    )
                    f_thresh.write(thresholds_hls_code)
                    f_thresh.close()
    

 %% Cell type:markdown id: tags:

 <font size="3">The generated code is written to the previously created temporary directory and the node attribute `code_gen_dir` is set. This completes the code generation for executing a single CustomOp and the next step is compilation. </font>
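
<font size="3">The directory handling described above can be reproduced with the standard library alone. The sketch below mirrors the check in `_codegen_single_node()` but stores the attribute in a plain dictionary; `ensure_code_gen_dir` and the `attrs` dictionary are hypothetical and only stand in for the node attribute accessors.</font>

```python
import os
import tempfile


def ensure_code_gen_dir(attrs, op_type):
    """Create a temporary directory if code_gen_dir is unset or missing."""
    code_gen_dir = attrs.get("code_gen_dir", "")
    if code_gen_dir == "" or not os.path.isdir(code_gen_dir):
        code_gen_dir = tempfile.mkdtemp(prefix="code_gen_" + op_type + "_")
        attrs["code_gen_dir"] = code_gen_dir
    return code_gen_dir


attrs = {"code_gen_dir": ""}
first = ensure_code_gen_dir(attrs, "StreamingFCLayer_Batch")
second = ensure_code_gen_dir(attrs, "StreamingFCLayer_Batch")
created = os.path.isdir(first)
print(first == second)  # True: the existing directory is reused

# Clean up the (empty) temporary directory created by this example.
os.rmdir(first)
```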

 %% Cell type:markdown id: tags:

 ### Compilation

+%% Cell type:markdown id: tags:
+
+<font size="3">The compilation is a transformation pass like the code generation. The code of this transformation is shown below. </font>
+
+%% Cell type:code id: tags:
+
+``` python
+from finn.transformation.fpgadataflow.compile import Compile
+showSrc(Compile)
+```
+
+%% Output
+
+    class Compile(Transformation):
+        """Compile for all nodes in model"""
+    
+        def __init__(self):
+            super().__init__()
+    
+        def apply(self, model):
+            for node in model.graph.node:
+                op_type = node.op_type
+                if node.domain == "finn":
+                    backend_attribute = util.get_by_name(node.attribute, "backend")
+                    if backend_attribute is None:
+                        continue
+                    backend_value = backend_attribute.s.decode("UTF-8")
+                    if backend_value == "fpgadataflow":
+                        try:
+                            # lookup op_type in registry of CustomOps
+                            inst = registry.custom_op[op_type](node)
+                            # ensure that code is generated
+                            assert inst.get_nodeattr("code_gen_dir") != ""
+                            # call the compilation function for this node
+                            inst.compile_singlenode_code()
+                            # ensure that executable path is now set
+                            assert inst.get_nodeattr("executable_path") != ""
+                        except KeyError:
+                            # exception if op_type is not supported
+                            raise Exception(
+                                "Custom op_type %s is currently not supported." % op_type
+                            )
+            return (model, False)
+    
+
+%% Cell type:markdown id: tags:
+
+<font size="3">The scheme resembles that of the code generation transformation pass. The pass iterates over all nodes in the model and triggers the compilation for every node with `domain="finn"` and `backend="fpgadataflow"`. First an instance of the node is created and an assertion checks that the code was generated, i.e. that the node attribute `code_gen_dir` is not empty. Then the function `compile_singlenode_code()` is executed, and a second assertion checks that the path to the executable has been set. An exception is raised if the custom op_type is not supported.
+
+The actual compilation is done with the function `compile_singlenode_code()`. What happens inside the function is shown below.</font>
+
+%% Cell type:code id: tags:
+
+``` python
+showSrc(StreamingFCLayer_Batch.compile_singlenode_code)
+```
+
+%% Output
+
+        def compile_singlenode_code(self):
+            code_gen_dir = self.get_nodeattr("code_gen_dir")
+            builder = CppBuilder()
+            builder.append_includes("-I/workspace/finn/src/finn/data/cpp")
+            builder.append_includes("-I/workspace/cnpy/")
+            builder.append_includes("-I/workspace/finn-hlslib")
+            builder.append_includes("-I/workspace/vivado-hlslib")
+            builder.append_includes("--std=c++11")
+            builder.append_sources(code_gen_dir + "/*.cpp")
+            builder.append_sources("/workspace/cnpy/cnpy.cpp")
+            builder.append_includes("-lz")
+            builder.set_executable_path(code_gen_dir + "/node_model")
+            builder.build(code_gen_dir)
+            self.set_nodeattr("executable_path", builder.executable_path)
+    
+
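
<font size="3">The source of `CppBuilder` is not shown in this notebook. To give an idea of what the calls above amount to, the sketch below collects the same kind of flags and sources and assembles a single compiler command line. `ToyCppBuilder` is a hypothetical stand-in, the use of `g++` and the argument order are assumptions, and the command is only printed, not executed. The directory `/tmp/code_gen_example` is a placeholder.</font>

```python
class ToyCppBuilder:
    """Hypothetical stand-in for a C++ build helper; NOT FINN's CppBuilder."""

    def __init__(self, compiler="g++"):
        # Assumption: the compiler is g++; the notebook does not state this.
        self.compiler = compiler
        self.include_flags = []
        self.sources = []
        self.executable_path = ""

    def append_includes(self, flag):
        self.include_flags.append(flag)

    def append_sources(self, path):
        self.sources.append(path)

    def set_executable_path(self, path):
        self.executable_path = path

    def command(self):
        """Assemble the command line; sources come before the flags so that
        a linker flag such as -lz follows the files that need it."""
        parts = [self.compiler, "-o", self.executable_path]
        parts += self.sources + self.include_flags
        return " ".join(parts)


code_gen_dir = "/tmp/code_gen_example"  # placeholder directory
builder = ToyCppBuilder()
builder.append_includes("-I/workspace/cnpy/")
builder.append_includes("--std=c++11")
builder.append_sources(code_gen_dir + "/*.cpp")
builder.append_sources("/workspace/cnpy/cnpy.cpp")
builder.append_includes("-lz")
builder.set_executable_path(code_gen_dir + "/node_model")

compile_command = builder.command()
print(compile_command)
```

<font size="3">The wildcard `*.cpp` is left unexpanded here; in a real build it would be expanded by the shell that runs the command.</font>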
 %% Cell type:code id: tags:

 ``` python
 ```