[Notebook] re-exec notebooks with latest StreamingFC changes

Commit f7f1ada0 authored 5 years ago by Yaman Umuroglu
--- a/notebooks/FCLayer_graph.onnx
+++ b/notebooks/FCLayer_graph.onnx
--- a/notebooks/FINN-CodeGenerationAndCompilation.ipynb
+++ b/notebooks/FINN-CodeGenerationAndCompilation.ipynb
@@ -13,7 +13,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -45,7 +45,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -64,7 +64,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -73,7 +73,6 @@
    "mh = 8\n",
    "pe = 4\n",
    "simd = 4\n",
-    "wmem = mw * mh // (pe * simd)\n",
    "nf = mh // pe\n",
    "sf = mw // simd\n"
   ]
@@ -87,7 +86,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -110,7 +109,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -127,8 +126,8 @@
    "        MH=mh,\n",
    "        SIMD=simd,\n",
    "        PE=pe,\n",
-    "        WMEM=wmem,\n",
-    "        TMEM=0,\n",
+    "        noActivation=1,\n",
+    "        binaryXnorMode=1,\n",
    "        inputDataType=idt.name,\n",
    "        weightDataType=wdt.name,\n",
    "        outputDataType=odt.name,\n",
@@ -144,7 +143,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -162,7 +161,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -185,7 +184,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -194,15 +193,13 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "\n",
-      "Stopping http://0.0.0.0:8081\n",
      "Serving 'FCLayer_graph.onnx' at http://0.0.0.0:8081\n"
     ]
    }
@@ -214,7 +211,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
@@ -247,7 +244,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
@@ -261,6 +258,8 @@
      "        for node in model.graph.node:\n",
      "            if node.domain == \"finn\":\n",
      "                backend_attribute = get_by_name(node.attribute, \"backend\")\n",
+      "                if backend_attribute is None:\n",
+      "                    continue\n",
      "                backend_value = backend_attribute.s.decode(\"UTF-8\")\n",
      "                if backend_value == \"fpgadataflow\":\n",
      "                    _codegen_single_node(node, model)\n",
@@ -283,7 +282,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {

 %% Cell type:markdown id: tags:

 # FINN - Code Generation and Compilation
 -----------------------------------------------------------------
 <font size="3">This notebook is about code generation and compilation to enable execution of FINN custom operation nodes.

 The following `showSrc` function is used to print the source code of functions and classes in the Jupyter notebook:</font>

 %% Cell type:code id: tags:

 ``` python
 import inspect

 def showSrc(what):
    print("".join(inspect.getsourcelines(what)[0]))
 ```

 %% Cell type:markdown id: tags:

 ## Outline
 -------------
 * <font size="3">Example model</font>
 * <font size="3">Code generation</font>

 %% Cell type:markdown id: tags:

 ### Example model
 <font size="3">To show the code generation and compilation of a node, an example model with a streaming fclayer node is created first. To learn more about FINN custom operation nodes, see the notebook *FINN-CustomOps*.

 First, `TensorProto` and `helper` are imported from ONNX. They can be used to create tensors, nodes, graphs and models in ONNX. Additional functions from `util` and the classes `DataType` and `ModelWrapper` are also needed. More information about `DataType` and `ModelWrapper` can be found in Jupyter notebook *FINN-ModelWrapper*.</font>

 %% Cell type:code id: tags:

 ``` python
 from onnx import TensorProto, helper
 import finn.core.utils as util
 from finn.core.datatype import DataType
 from finn.core.modelwrapper import ModelWrapper
 ```

 %% Cell type:markdown id: tags:

 <font size="3">Then all parameters needed to create a streaming fclayer are set. To keep the example clear, small values are chosen. </font>

 %% Cell type:code id: tags:

 ``` python
 idt = wdt = odt = DataType.BIPOLAR
 mw = 8
 mh = 8
 pe = 4
 simd = 4
-wmem = mw * mh // (pe * simd)
 nf = mh // pe
 sf = mw // simd
 ```
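
The folding arithmetic behind these parameters can be checked by hand. The sketch below reuses the values from the cell above and the `wmem` formula from the removed line. The diff only shows that the node now calls `calc_wmem()`, not what that method contains, so treating the removed formula as its result is an assumption.

```python
# Folding parameters, using the same values as the notebook cell above.
mw, mh = 8, 8      # matrix width and height
pe, simd = 4, 4    # processing elements and SIMD lanes

# PE must divide MH and SIMD must divide MW, otherwise the folds are uneven.
assert mh % pe == 0 and mw % simd == 0

nf = mh // pe                    # fold along the matrix height
sf = mw // simd                  # fold along the matrix width
wmem = mw * mh // (pe * simd)    # formula from the removed notebook line

print(nf, sf, wmem)
```

With these values `wmem` equals `nf * sf`, which is one reason it can be derived from `MW`, `MH`, `PE` and `SIMD` instead of being stored as a separate attribute.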

 %% Cell type:markdown id: tags:

 <font size="3">A `tensor_value_info` is created for all tensors involved. In this case there is one tensor for the weights in addition to the input and output tensors. Then an input list is created containing the two inputs (`"inp"` and `"weights"`).</font>

 %% Cell type:code id: tags:

 ``` python
 inp = helper.make_tensor_value_info("inp", TensorProto.FLOAT, [1, sf, simd])
 weights = helper.make_tensor_value_info("weights", TensorProto.FLOAT, [mw, mh])
 outp = helper.make_tensor_value_info("outp", TensorProto.FLOAT, [1, nf, pe])
 node_inp_list = ["inp", "weights"]
 ```

 %% Cell type:markdown id: tags:

 <font size="3">Now the node can be created. The operation type is set to `"StreamingFCLayer_Batch"` and the rest of the attributes are set appropriately. The relevant attributes for the activation of the code generation and compilation are:</font>
 * <font size="3">**`domain="finn"`**: specifies that the created node is a FINN-Custom Op</font>
 * <font size="3">**`backend="fpgadataflow"`**: specifies that it is a node that corresponds to a function in the finn-hls library</font>
 * <font size="3">**`code_gen_dir`**: specifies the path to the directory where the generated c++ files are stored (is set during code generation)</font>
 * <font size="3">**`executable_path`**: specifies the path to the executable created after compilation (is set during compilation)</font>

 %% Cell type:code id: tags:

 ``` python
 FCLayer_node = helper.make_node(
        "StreamingFCLayer_Batch",
        node_inp_list,
        ["outp"],
        domain="finn",
        backend="fpgadataflow",
        code_gen_dir="",
        executable_path="",
        resType="ap_resource_lut()",
        MW=mw,
        MH=mh,
        SIMD=simd,
        PE=pe,
-        WMEM=wmem,
-        TMEM=0,
+        noActivation=1,
+        binaryXnorMode=1,
        inputDataType=idt.name,
        weightDataType=wdt.name,
        outputDataType=odt.name,
 )
 ```

 %% Cell type:markdown id: tags:

 <font size="3"> The node is packed into a graph environment and the inputs and outputs are set.</font>

 %% Cell type:code id: tags:

 ``` python
 graph = helper.make_graph(
        nodes=[FCLayer_node], name="fclayer_graph", inputs=[inp], outputs=[outp]
    )
 ```

 %% Cell type:markdown id: tags:

 <font size="3">A model is now created from the graph, which is then converted into a ModelWrapper object for further processing in FINN. Afterwards the ModelWrapper internal functions can be used to set the FINN data types and the initializer for the weights. Since this is an example, the weights are not taken from the training, but random values are generated using the utility function `gen_finn_dt_tensor()`. This function gets a FINN datatype and a shape and generates a tensor with values of this datatype in the desired shape.</font>

 %% Cell type:code id: tags:

 ``` python
 model = helper.make_model(graph, producer_name="fclayer-model")
 model = ModelWrapper(model)

 model.set_tensor_datatype("inp", idt)
 model.set_tensor_datatype("outp", odt)
 model.set_tensor_datatype("weights", wdt)
 W = util.gen_finn_dt_tensor(wdt, (mw, mh))
 model.set_initializer("weights", W)
 ```
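
To illustrate what a random bipolar weight tensor looks like, here is a minimal sketch that uses only the standard library. `gen_bipolar_tensor` is a hypothetical stand-in written for this illustration; it is not the implementation of `gen_finn_dt_tensor()` and only covers the bipolar case with values -1 and +1.

```python
import random

def gen_bipolar_tensor(shape, seed=None):
    """Hypothetical stand-in: nested lists of random -1/+1 values."""
    rng = random.Random(seed)
    rows, cols = shape
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]

# same (mw, mh) = (8, 8) shape as in the notebook cell above
W_sketch = gen_bipolar_tensor((8, 8), seed=0)
for row in W_sketch:
    print(row)
```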

 %% Cell type:markdown id: tags:

 <font size="3">The model is saved and then netron is used to visualize the resulting model. </font>

 %% Cell type:code id: tags:

 ``` python
 model.save("FCLayer_graph.onnx")
 ```

 %% Cell type:code id: tags:

 ``` python
 import netron
 netron.start('FCLayer_graph.onnx', port=8081, host="0.0.0.0")
 ```

 %% Output

-    
-    Stopping http://0.0.0.0:8081
    Serving 'FCLayer_graph.onnx' at http://0.0.0.0:8081

 %% Cell type:code id: tags:

 ``` python
 %%html
 <iframe src="http://0.0.0.0:8081/" style="position: relative; width: 100%;" height="400"></iframe>
 ```

 %% Output


 %% Cell type:markdown id: tags:

 ### Code Generation
 <font size="3">Code generation is a transformation that can be applied to the model. For more information about transformation passes, see Jupyter Notebook *FINN-HowToTransformPass*.

 The code generation transformation is shown below.</font>

 %% Cell type:code id: tags:

 ``` python
 from finn.transformation.fpgadataflow.codegen import CodeGen
 showSrc(CodeGen)
 ```

 %% Output

    class CodeGen(Transformation):
        """Code generation for all nodes in model"""
    
        def apply(self, model):
            for node in model.graph.node:
                if node.domain == "finn":
                    backend_attribute = get_by_name(node.attribute, "backend")
+                    if backend_attribute is None:
+                        continue
                    backend_value = backend_attribute.s.decode("UTF-8")
                    if backend_value == "fpgadataflow":
                        _codegen_single_node(node, model)
            return (model, False)
    

 %% Cell type:markdown id: tags:

 <font size="3">The transformation pass iterates over all nodes in the model, and if `domain="finn"` and `backend="fpgadataflow"`, the function `_codegen_single_node()` is executed. It is also part of the transformation pass and is shown below. </font>
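
The node selection in `CodeGen.apply`, including the `backend_attribute is None` guard added in this commit, can be sketched without ONNX. `FakeNode` and `nodes_to_codegen` are hypothetical names used only for this illustration; a plain dict takes the place of the ONNX attribute list.

```python
from dataclasses import dataclass, field

@dataclass
class FakeNode:
    """Hypothetical stand-in for an ONNX NodeProto."""
    name: str
    domain: str = ""
    attrs: dict = field(default_factory=dict)

def nodes_to_codegen(nodes):
    """Return the names of the nodes the pass would generate code for."""
    selected = []
    for node in nodes:
        if node.domain != "finn":
            continue
        backend = node.attrs.get("backend")
        if backend is None:
            # FINN custom op without a backend attribute: skip it
            continue
        if backend == "fpgadataflow":
            selected.append(node.name)
    return selected

nodes = [
    FakeNode("fc", "finn", {"backend": "fpgadataflow"}),
    FakeNode("no_backend", "finn"),
    FakeNode("add"),
]
print(nodes_to_codegen(nodes))
```

Without the guard, the second node would fail as soon as its missing backend attribute is decoded.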

 %% Cell type:code id: tags:

 ``` python
 from finn.transformation.fpgadataflow.codegen import _codegen_single_node
 showSrc(_codegen_single_node)
 ```

 %% Output

    def _codegen_single_node(node, model):
        """Call custom implementation to generate code for single custom node
        and create folder that contains all the generated files"""
        op_type = node.op_type
        try:
            # lookup op_type in registry of CustomOps
            inst = registry.custom_op[op_type](node)
            # get the path of the code generation directory
            code_gen_dir = inst.get_nodeattr("code_gen_dir")
            # ensure that there is a directory
            if code_gen_dir == "" or not os.path.isdir(code_gen_dir):
                code_gen_dir = tmp.mkdtemp(prefix="code_gen_" + str(node.op_type) + "_")
                inst.set_nodeattr("code_gen_dir", code_gen_dir)
            # ensure that there is generated code inside the dir
            inst.code_generation(model)
        except KeyError:
            # exception if op_type is not supported
            raise Exception("Custom op_type %s is currently not supported." % op_type)
    

 %% Cell type:markdown id: tags:

 <font size="3">An instance of the node is created and checked for the attribute `code_gen_dir`. If the attribute is not set, a temporary directory is created and the attribute is set accordingly.

 Then the `code_generation()` function of the instance is called. If a `KeyError` is raised during this process, the pass reports that the `op_type` is currently not supported, which usually means the selected CustomOp is not in the registry yet.</font>
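
The directory handling described above can be tried in isolation. The sketch below mirrors the check in `_codegen_single_node()` using a plain dict in place of the node attributes; `ensure_code_gen_dir` is a hypothetical helper name.

```python
import os
import tempfile

def ensure_code_gen_dir(attrs, op_type):
    """Create a temporary directory if code_gen_dir is unset or missing."""
    code_gen_dir = attrs.get("code_gen_dir", "")
    if code_gen_dir == "" or not os.path.isdir(code_gen_dir):
        code_gen_dir = tempfile.mkdtemp(prefix="code_gen_" + op_type + "_")
        attrs["code_gen_dir"] = code_gen_dir
    return code_gen_dir

attrs = {"code_gen_dir": ""}
first = ensure_code_gen_dir(attrs, "StreamingFCLayer_Batch")
second = ensure_code_gen_dir(attrs, "StreamingFCLayer_Batch")
print(first == second)  # the existing directory is reused on the second call
```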

 %% Cell type:markdown id: tags:

 <font size="3">The following description of the code generation within the CustomOp instance may overlap with the Jupyter notebook *FINN-CustomOps*. </font>

 %% Cell type:code id: tags:

 ``` python
 ```

--- a/notebooks/FINN-CustomOps.ipynb
+++ b/notebooks/FINN-CustomOps.ipynb
@@ -18,7 +18,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -56,12 +56,12 @@
    "    MH=mh,\n",
    "    SIMD=simd,\n",
    "    PE=pe,\n",
-    "    WMEM=wmem,\n",
-    "    TMEM=tmem,\n",
-    "    inputDataType=FINN-DataType,\n",
-    "    weightDataType=FINN-DataType,\n",
-    "    outputDataType=FINN-DataType,\n",
+    "    inputDataType=<FINN DataType>,\n",
+    "    weightDataType=<FINN DataType>,\n",
+    "    outputDataType=<FINN DataType>,\n",
    "    ActVal=actval,\n",
+    "    binaryXnorMode=<0/1>,\n",
+    "    noActivation=<0/1>\n",
    ")`"
   ]
  },
@@ -76,12 +76,12 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "<font size=\"3\">Custom Ops are represented in Finn as ONNX nodes on the one hand and by a CustomOp class on the other hand. This allows easier access to the different attributes and introduces special custom op functions. See below for the standard CustomOp class.</font>"
+    "<font size=\"3\">Custom Ops are represented in FINN as ONNX nodes on the one hand and by a CustomOp class on the other hand. This allows easier access to the different attributes and introduces special custom op functions. See below for the standard CustomOp class.</font>"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
@@ -188,7 +188,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
@@ -399,7 +399,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
@@ -431,15 +431,13 @@
      "                    self.get_nodeattr(\"SIMD\"),\n",
      "                    export_wdt.get_hls_datatype_str(),\n",
      "                    self.get_nodeattr(\"PE\"),\n",
-      "                    self.get_nodeattr(\"WMEM\"),\n",
+      "                    self.calc_wmem(),\n",
      "                )\n",
      "            )\n",
      "        else:\n",
      "            f_weights.write(\n",
      "                \"static BinaryWeights<{},{},{}> weights = \".format(\n",
-      "                    self.get_nodeattr(\"SIMD\"),\n",
-      "                    self.get_nodeattr(\"PE\"),\n",
-      "                    self.get_nodeattr(\"WMEM\"),\n",
+      "                    self.get_nodeattr(\"SIMD\"), self.get_nodeattr(\"PE\"), self.calc_wmem()\n",
      "                )\n",
      "            )\n",
      "        f_weights.write(weight_hls_code)\n",
@@ -453,6 +451,12 @@
      "                # use UINT32 threshold export for bipolar times bipolar\n",
      "                inp_is_bipolar = self.get_input_datatype() == DataType.BIPOLAR\n",
      "                wt_is_bipolar = self.get_weight_datatype() == DataType.BIPOLAR\n",
+      "                # reinterpret inp/wt as bipolar if bin_xnor_mode is iset\n",
+      "                inp_is_binary = self.get_input_datatype() == DataType.BINARY\n",
+      "                wt_is_binary = self.get_weight_datatype() == DataType.BINARY\n",
+      "                bin_xnor_mode = self.get_nodeattr(\"binaryXnorMode\") == 1\n",
+      "                inp_is_bipolar = inp_is_bipolar or (inp_is_binary and bin_xnor_mode)\n",
+      "                wt_is_bipolar = wt_is_bipolar or (wt_is_binary and bin_xnor_mode)\n",
      "                if inp_is_bipolar and wt_is_bipolar:\n",
      "                    tdt = DataType.UINT32\n",
      "                thresholds_hls_code = numpy_to_hls_code(\n",
@@ -470,7 +474,7 @@
      "                f_thresh.write(\n",
      "                    \"static ThresholdsActivation<{},{},{},{},{},{},{}> threshs \\\n",
      "                     = \".format(\n",
-      "                        self.get_nodeattr(\"TMEM\"),\n",
+      "                        self.calc_tmem(),\n",
      "                        self.get_nodeattr(\"PE\"),\n",
      "                        threshold_tensor.shape[-1],\n",
      "                        tdt_hls,\n",

 %% Cell type:markdown id: tags:

 # FINN - CustomOps
 -----------------------------------------------------------------
 <font size="3">This notebook should give a more detailed insight into FINN custom operation nodes. </font>

 %% Cell type:markdown id: tags:

 <font size="3">The following `showSrc` function is used to print the source code of functions and classes in the Jupyter notebook: </font>

 %% Cell type:code id: tags:

 ``` python
 import inspect

 def showSrc(what):
    print("".join(inspect.getsourcelines(what)[0]))
 ```

 %% Cell type:markdown id: tags:

 ## FINN Custom Ops
 ---------------------------
 <font size="3">FINN uses many custom operations (`op_type` in ONNX NodeProto) that are not defined in the ONNX operator schema. These custom nodes are marked with `domain="finn"` in the protobuf to identify them as such. These nodes can represent specific operations that we need for low-bit networks, or operations that are specific to a particular hardware backend.

 A very abstract version of a custom op node representing a streaming fc layer is shown below. </font>

 %% Cell type:markdown id: tags:

 `FCLayer_node = helper.make_node(
    "StreamingFCLayer_Batch",
    node_inp_list,
    node_outp_list,
    domain="finn",
    backend="fpgadataflow",
    code_gen_dir="",
    executable_path="",
    resType="ap_resource_lut()",
    MW=mw,
    MH=mh,
    SIMD=simd,
    PE=pe,
-    WMEM=wmem,
-    TMEM=tmem,
-    inputDataType=FINN-DataType,
-    weightDataType=FINN-DataType,
-    outputDataType=FINN-DataType,
+    inputDataType=<FINN DataType>,
+    weightDataType=<FINN DataType>,
+    outputDataType=<FINN DataType>,
    ActVal=actval,
+    binaryXnorMode=<0/1>,
+    noActivation=<0/1>
 )`

 %% Cell type:markdown id: tags:

 <font size="3">Unlike standard nodes, custom op nodes have several additional attributes. The node is created using the helper function of ONNX. `"StreamingFCLayer_Batch"` is the op_type, followed by the inputs and outputs. Since this is a FINN custom op node, the attribute `domain="finn"` must be set. The streaming fc layer is a custom op from the finn-hls library; this is indicated in the node by the `backend` attribute. To execute a custom op from the finn-hls library, the corresponding c++ code must be generated and an executable must be produced. The `code_gen_dir` attribute specifies where the generated code is stored, and `executable_path` specifies the path to the produced executable. In addition to the data types of the input and output tensors, the node also contains various other attributes resulting from the parameters of the corresponding finn-hls library function. These are not discussed here.</font>

 %% Cell type:markdown id: tags:

-<font size="3">Custom Ops are represented in Finn as ONNX nodes on the one hand and by a CustomOp class on the other hand. This allows easier access to the different attributes and introduces special custom op functions. See below for the standard CustomOp class.</font>
+<font size="3">Custom Ops are represented in FINN as ONNX nodes on the one hand and by a CustomOp class on the other hand. This allows easier access to the different attributes and introduces special custom op functions. See below for the standard CustomOp class.</font>

 %% Cell type:code id: tags:

 ``` python
 from finn.custom_op import CustomOp
 showSrc(CustomOp)
 ```

 %% Output

    class CustomOp(ABC):
        def __init__(self, onnx_node):
            super().__init__()
            self.onnx_node = onnx_node
    
        def get_nodeattr(self, name):
            """Get a node attribute by name. Data is stored inside the ONNX node's
            AttributeProto container. Attribute must be part of get_nodeattr_types.
            Default value is returned if attribute is not set."""
            try:
                (dtype, req, def_val) = self.get_nodeattr_types()[name]
                attr = get_by_name(self.onnx_node.attribute, name)
                if attr is not None:
                    # dtype indicates which ONNX Attribute member to use
                    # (such as i, f, s...)
                    ret = attr.__getattribute__(dtype)
                    if dtype == "s":
                        # decode string attributes
                        ret = ret.decode("utf-8")
                    return ret
                else:
                    # not set, return default value
                    return def_val
            except KeyError:
                raise AttributeError("Op has no such attribute: " + name)
    
        def set_nodeattr(self, name, value):
            """Set a node attribute by name. Data is stored inside the ONNX node's
            AttributeProto container. Attribute must be part of get_nodeattr_types."""
            try:
                (dtype, req, def_val) = self.get_nodeattr_types()[name]
                attr = get_by_name(self.onnx_node.attribute, name)
                if attr is not None:
                    # dtype indicates which ONNX Attribute member to use
                    # (such as i, f, s...)
                    if dtype == "s":
                        # encode string attributes
                        value = value.encode("utf-8")
                    attr.__setattr__(dtype, value)
                else:
                    # not set, create and insert AttributeProto
                    attr_proto = helper.make_attribute(name, value)
                    self.onnx_node.attribute.append(attr_proto)
            except KeyError:
                raise AttributeError("Op has no such attribute: " + name)
    
        @abstractmethod
        def get_nodeattr_types(self):
            """Returns a dict of permitted attributes for node, where:
                returned_dict[attribute_name] = (dtype, require, default_value)
                - dtype indicates which member of the ONNX AttributeProto
                will be utilized
                - require indicates whether this attribute is required
                - default_val indicates the default value that will be used if the
                attribute is not set
            """
            pass
    
        @abstractmethod
        def make_shape_compatible_op(self):
            """Returns a standard ONNX op which is compatible with this CustomOp
            for performing shape inference."""
            pass
    
        @abstractmethod
        def infer_node_datatype(self, model):
            """Set the DataType annotations corresponding to the outputs of this
            node."""
            pass
    
        @abstractmethod
        def execute_node(self, context, graph):
            """Execute this CustomOp instance, given the execution context and
            ONNX graph."""
            pass
    

 %% Cell type:markdown id: tags:

 <font size="3">When the class is instantiated, the ONNX node is passed in so that all attributes of the node can be accessed within the class. Access goes through the functions `get_nodeattr()` and `set_nodeattr()`, which every instance of this class has. Furthermore, four abstract methods are declared; they are described in more detail in the docstrings in the code. </font>
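
The default-value behaviour of `get_nodeattr()` can be sketched with a dict standing in for the ONNX node. `AttrStore` and its attribute table are hypothetical and chosen only for illustration; the real class reads and writes `AttributeProto` entries as shown in the listing above.

```python
class AttrStore:
    """Hypothetical stand-in for a CustomOp; a dict replaces the ONNX node."""

    def __init__(self, node_attrs):
        self.node_attrs = node_attrs

    def get_nodeattr_types(self):
        # attribute name -> (dtype, required, default_value)
        return {"code_gen_dir": ("s", False, ""), "PE": ("i", True, 0)}

    def get_nodeattr(self, name):
        try:
            (dtype, req, def_val) = self.get_nodeattr_types()[name]
        except KeyError:
            raise AttributeError("Op has no such attribute: " + name)
        # fall back to the default if the attribute is not set on the node
        return self.node_attrs.get(name, def_val)

    def set_nodeattr(self, name, value):
        if name not in self.get_nodeattr_types():
            raise AttributeError("Op has no such attribute: " + name)
        self.node_attrs[name] = value

op = AttrStore({"PE": 4})
print(op.get_nodeattr("PE"))                  # set on the node
print(repr(op.get_nodeattr("code_gen_dir")))  # not set, default is returned
op.set_nodeattr("code_gen_dir", "/tmp/example")
```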

 %% Cell type:markdown id: tags:

 <font size="3">If it is a node from the finn-hls library another class is used which is derived from the CustomOp class:</font>

 %% Cell type:code id: tags:

 ``` python
 from finn.custom_op.fpgadataflow import HLSCustomOp
 showSrc(HLSCustomOp)
 ```

 %% Output

    class HLSCustomOp(CustomOp):
        def __init__(self, onnx_node):
            super().__init__(onnx_node)
            # template for single node execution
            self.docompute_template = """
            #include "cnpy.h"
            #include "npy2apintstream.hpp"
            #include <vector>
            #include "bnn-library.h"
    
            // includes for network parameters
            $GLOBALS$
    
            // defines for network parameters
            $DEFINES$
    
            int main(){
    
            $STREAMDECLARATIONS$
    
            $READNPYDATA$
    
            $DOCOMPUTE$
    
            $DATAOUTSTREAM$
    
            $SAVEASCNPY$
    
            }
    
            """
            self.code_gen_dict = {}
    
        def get_nodeattr_types(self):
            return {"code_gen_dir": ("s", False, ""), "executable_path": ("s", False, "")}
    
        def code_generation(self, model):
            node = self.onnx_node
            self.generate_params(model)
            self.global_includes()
            self.defines()
            self.read_npy_data()
            self.strm_decl()
            self.docompute()
            self.dataoutstrm()
            self.save_as_npy()
    
            template = self.docompute_template
    
            for key in self.code_gen_dict:
                # transform list into long string separated by '\n'
                code_gen_line = "\n".join(self.code_gen_dict[key])
                template = template.replace(key, code_gen_line)
            code_gen_dir = self.get_nodeattr("code_gen_dir")
            f = open(os.path.join(code_gen_dir, "execute_{}.cpp".format(node.op_type)), "w")
            f.write(template)
            f.close()
    
        def compile_singlenode_code(self):
            code_gen_dir = self.get_nodeattr("code_gen_dir")
            builder = CppBuilder()
            builder.append_includes("-I/workspace/finn/src/finn/data/cpp")
            builder.append_includes("-I/workspace/cnpy/")
            builder.append_includes("-I/workspace/finn-hlslib")
            builder.append_includes("-I/workspace/vivado-hlslib")
            builder.append_includes("--std=c++11")
            builder.append_sources(code_gen_dir + "/*.cpp")
            builder.append_sources("/workspace/cnpy/cnpy.cpp")
            builder.append_includes("-lz")
            builder.set_executable_path(code_gen_dir + "/node_model")
            builder.build(code_gen_dir)
            self.set_nodeattr("executable_path", builder.executable_path)
    
        def dynamic_input_to_npy(self, context, count):
            node = self.onnx_node
            code_gen_dir = self.get_nodeattr("code_gen_dir")
            if code_gen_dir == "":
                raise Exception(
                    """
    Found no codegen dir for this node, did you run the codegen transformation?
                """
                )
            # create a npy file for each input of the node (in_ind is input index)
            # assuming dynamic inputs start from 0
            for in_ind in range(count):
                current_input_name = node.input[in_ind]
                np.save(
                    os.path.join(code_gen_dir, "input_{}.npy".format(in_ind)),
                    context[current_input_name],
                )
    
        def npy_to_dynamic_output(self, context):
            # TODO support multi-output nodes as needed
            node = self.onnx_node
            code_gen_dir = self.get_nodeattr("code_gen_dir")
            output = np.load("{}/output.npy".format(code_gen_dir))
            context[node.output[0]] = output
    
        def exec_precompiled_singlenode_model(self):
            # execute precompiled executable
            executable_path = self.get_nodeattr("executable_path")
            if executable_path == "":
                raise Exception(
                    """
    Found no executable for this node, did you run the codegen and
    compilation transformations?
                """
                )
            process_execute = subprocess.Popen(executable_path, stdout=subprocess.PIPE)
            process_execute.communicate()
    
        def execute_node(self, context, graph):
            # save input(s)
            self.dynamic_input_to_npy(context, 1)
            # execute the precompiled model
            self.exec_precompiled_singlenode_model()
            # load output npy file
            self.npy_to_dynamic_output(context)
    
        def generate_params(self, model):
            pass
    
        @abstractmethod
        def global_includes(self):
            pass
    
        @abstractmethod
        def defines(self):
            pass
    
        @abstractmethod
        def read_npy_data(self):
            pass
    
        @abstractmethod
        def strm_decl(self):
            pass
    
        @abstractmethod
        def docompute(self):
            pass
    
        @abstractmethod
        def dataoutstrm(self):
            pass
    
        @abstractmethod
        def save_as_npy(self):
            pass
    

 %% Cell type:markdown id: tags:

 <font size="3">When an instance of this class is created, a template is set up that forms the layout of the c++ code used to execute the node. It contains some general constructs, such as the inclusion of bnn-library.h, which contains the references to the finn-hls library, and of cnpy.h and npy2apintstream.hpp, which support the transfer of python numpy arrays to c++. The idea of this template is to replace the variables marked with `$ $` with c++ calls during code generation. The template can then be written into a .cpp file and compiled.

 **`get_nodeattr_types()`**: each instance of the HLSCustomOp class has the attributes `code_gen_dir` and `executable_path`, since executing these nodes requires generated c++ code and the executable compiled from it.

 </font>


 %% Cell type:markdown id: tags:

 <font size="3">**`code_generation(model)`**: all functions required for code generation are called, the `$ $` variables in the template are replaced accordingly, and the result is written into a .cpp file. Almost all of these subfunctions are declared as abstract methods in the class, so they are customized completely for each custom op node. A special case is `generate_params()`. It is not an abstract method but a normal function whose default body is only `pass`. This is because some custom op nodes do not have parameters that need to be generated, and for those the function does nothing. A streaming fc layer node, for example, does require parameter generation. What such a parameter generation can look like is described in more detail later in this notebook.
 </font>
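
The placeholder replacement in `code_generation()` can be reproduced in a few lines. The sketch below uses a shortened template and made-up dictionary contents; the key format (`"$GLOBALS$"` and so on) is an assumption derived from the template and from the `template.replace(key, ...)` call in the listing.

```python
# shortened version of the docompute template
template = """
// includes for network parameters
$GLOBALS$

// defines for network parameters
$DEFINES$

int main(){
$DOCOMPUTE$
}
"""

# illustrative contents only; each custom op fills these lists itself
code_gen_dict = {
    "$GLOBALS$": ['#include "params.h"'],
    "$DEFINES$": ["#define MW 8", "#define MH 8"],
    "$DOCOMPUTE$": ["// compute call goes here"],
}

for key in code_gen_dict:
    # join the list of lines into one string, then substitute the placeholder
    template = template.replace(key, "\n".join(code_gen_dict[key]))

print(template)
```

Keeping each entry as a list of lines lets the individual subfunctions append code without worrying about line endings.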

 %% Cell type:markdown id: tags:

 <font size="3">**`compile_singlenode_code()`**: To compile the generated code, the compile command must be built, which is done in this function. It creates an instance of the `CppBuilder` class and assembles the various components of the command. The `.build` function creates the executable, and the `executable_path` attribute is then set accordingly. The class `CppBuilder` is a transformation and a more detailed description can be found in Jupyter notebook *FINN-CodeGenerationAndCompilation*.
 </font>
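
As a rough illustration of how such a compile command can be assembled, the sketch below collects flags and sources and joins them into one string. `CompileCommandSketch` is hypothetical and is not FINN's `CppBuilder`; it only borrows the method names from the listing above, and both the `g++` compiler and the ordering of the arguments are assumptions. It does not invoke a compiler.

```python
class CompileCommandSketch:
    """Hypothetical helper; NOT FINN's CppBuilder, only the same method names."""

    def __init__(self, compiler="g++"):
        self.compiler = compiler  # assumption: a g++-style compiler
        self.includes = []
        self.sources = []
        self.executable_path = ""

    def append_includes(self, flag):
        self.includes.append(flag)

    def append_sources(self, path):
        self.sources.append(path)

    def set_executable_path(self, path):
        self.executable_path = path

    def command(self):
        # sources first, then flags, so that -lz ends up after the sources
        parts = [self.compiler, "-o", self.executable_path]
        parts += self.sources + self.includes
        return " ".join(parts)

code_gen_dir = "/tmp/code_gen_example"  # placeholder path
builder = CompileCommandSketch()
builder.append_includes("-I/workspace/finn-hlslib")
builder.append_includes("--std=c++11")
builder.append_sources(code_gen_dir + "/*.cpp")
builder.append_includes("-lz")
builder.set_executable_path(code_gen_dir + "/node_model")
cmd = builder.command()
print(cmd)
```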

 %% Cell type:markdown id: tags:

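 <font size="3">**`dynamic_input_to_npy(context, count)`**: saves the first `count` inputs of the node from the execution context as .npy files (`input_0.npy`, `input_1.npy`, ...) in `code_gen_dir`, and raises an exception if `code_gen_dir` is not set.</font>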

 %% Cell type:markdown id: tags:

 #### Generate Parameter
 <font size="3">Parameters have to be generated for specific types of HLSCustomOps. For example if the node is a streaming fc layer, there are weights and activation values, which are written to separate .h files and added to the template using `#include`. For streaming fc layer the parameter generation looks like this:
 </font>
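
The `generate_params()` listing notes that bipolar weights are converted to binary for export. A minimal sketch of such a conversion is given below. It assumes the usual mapping of -1 to 0 and +1 to 1; `bipolar_to_binary` is a hypothetical helper, and the actual conversion happens inside `get_hls_compatible_weight_tensor()`, whose body is not shown here.

```python
def bipolar_to_binary(weights):
    """Map bipolar values {-1, +1} to binary values {0, 1} via (w + 1) / 2."""
    return [[(w + 1) // 2 for w in row] for row in weights]

bipolar = [[-1, 1], [1, -1]]
binary = bipolar_to_binary(bipolar)
print(binary)
```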

 %% Cell type:code id: tags:

 ``` python
 from finn.custom_op.fpgadataflow.streamingfclayer_batch import StreamingFCLayer_Batch
 showSrc(StreamingFCLayer_Batch.generate_params)
 ```

 %% Output

        def generate_params(self, model):
            # weights
            weights = model.get_initializer(self.onnx_node.input[1])
            # convert weights into hlslib-compatible format
            weight_tensor = self.get_hls_compatible_weight_tensor(weights)
            export_wdt = self.get_weight_datatype()
            # we have converted bipolar weights to binary for export,
            # so use it as such for weight generation
            if self.get_weight_datatype() == DataType.BIPOLAR:
                export_wdt = DataType.BINARY
            weight_hls_code = numpy_to_hls_code(
                weight_tensor, export_wdt, "weights", True, True
            )
            # write weights into params.h
            code_gen_dir = self.get_nodeattr("code_gen_dir")
            f_weights = open("{}/params.h".format(code_gen_dir), "w")
    
            if export_wdt.bitwidth() != 1:
                f_weights.write(
                    "static FixedPointWeights<{},{},{},{}> weights = ".format(
                        self.get_nodeattr("SIMD"),
                        export_wdt.get_hls_datatype_str(),
                        self.get_nodeattr("PE"),
-                        self.get_nodeattr("WMEM"),
+                        self.calc_wmem(),
                    )
                )
            else:
                f_weights.write(
                    "static BinaryWeights<{},{},{}> weights = ".format(
-                        self.get_nodeattr("SIMD"),
-                        self.get_nodeattr("PE"),
-                        self.get_nodeattr("WMEM"),
+                        self.get_nodeattr("SIMD"), self.get_nodeattr("PE"), self.calc_wmem()
                    )
                )
            f_weights.write(weight_hls_code)
            f_weights.close()
            # thresholds
            if len(self.onnx_node.input) > 2:
                thresholds = model.get_initializer(self.onnx_node.input[2])
                if thresholds is not None:
                    threshold_tensor = self.get_hls_compatible_threshold_tensor(thresholds)
                    tdt = DataType.INT32
                    # use UINT32 threshold export for bipolar times bipolar
                    inp_is_bipolar = self.get_input_datatype() == DataType.BIPOLAR
                    wt_is_bipolar = self.get_weight_datatype() == DataType.BIPOLAR
+                    # reinterpret inp/wt as bipolar if bin_xnor_mode is set
+                    inp_is_binary = self.get_input_datatype() == DataType.BINARY
+                    wt_is_binary = self.get_weight_datatype() == DataType.BINARY
+                    bin_xnor_mode = self.get_nodeattr("binaryXnorMode") == 1
+                    inp_is_bipolar = inp_is_bipolar or (inp_is_binary and bin_xnor_mode)
+                    wt_is_bipolar = wt_is_bipolar or (wt_is_binary and bin_xnor_mode)
                    if inp_is_bipolar and wt_is_bipolar:
                        tdt = DataType.UINT32
                    thresholds_hls_code = numpy_to_hls_code(
                        threshold_tensor, tdt, "thresholds", False, True
                    )
                    # write thresholds into thresh.h
                    code_gen_dir = self.get_nodeattr("code_gen_dir")
                    f_thresh = open("{}/thresh.h".format(code_gen_dir), "w")
                    tdt_hls = tdt.get_hls_datatype_str()
                    # use binary to export bipolar activations
                    export_odt = self.get_output_datatype()
                    if self.get_output_datatype() == DataType.BIPOLAR:
                        export_odt = DataType.BINARY
                    odt_hls = export_odt.get_hls_datatype_str()
                    f_thresh.write(
                        "static ThresholdsActivation<{},{},{},{},{},{},{}> threshs \
                         = ".format(
-                            self.get_nodeattr("TMEM"),
+                            self.calc_tmem(),
                            self.get_nodeattr("PE"),
                            threshold_tensor.shape[-1],
                            tdt_hls,
                            odt_hls,
                            self.get_nodeattr("ActVal"),
                            "std::less_equal<%s>" % tdt_hls,
                        )
                    )
                    f_thresh.write(thresholds_hls_code)
                    f_thresh.close()
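
Two small decisions inside `generate_params` can be isolated in a standalone sketch: the weight memory depth and the choice of threshold datatype. The formula in `calc_wmem` below is an assumption, taken from the expression `mw * mh // (pe * simd)` that the notebook used before `WMEM` stopped being a node attribute; the actual method may differ. Datatypes are plain strings here rather than FINN `DataType` values, and `threshold_dtype` is a hypothetical helper name.

```python
def calc_wmem(mw, mh, pe, simd):
    # Assumed formula: number of weight memory entries per PE.
    return mw * mh // (pe * simd)


def threshold_dtype(input_dt, weight_dt, binary_xnor_mode):
    """Return "UINT32" if both operands are effectively bipolar, else "INT32"."""
    # Binary operands are reinterpreted as bipolar when XNOR mode is set.
    inp_is_bipolar = input_dt == "BIPOLAR" or (
        input_dt == "BINARY" and binary_xnor_mode
    )
    wt_is_bipolar = weight_dt == "BIPOLAR" or (
        weight_dt == "BINARY" and binary_xnor_mode
    )
    return "UINT32" if inp_is_bipolar and wt_is_bipolar else "INT32"


print(calc_wmem(mw=8, mh=8, pe=4, simd=4))
print(threshold_dtype("BINARY", "BINARY", binary_xnor_mode=True))
print(threshold_dtype("BINARY", "BINARY", binary_xnor_mode=False))
```

The second function mirrors the branch in the listing above: unsigned thresholds are used only for the bipolar-times-bipolar case, including binary inputs and weights when `binaryXnorMode` is 1.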
    

 %% Cell type:code id: tags:

 ``` python
 ```