diff --git a/notebooks/FINN-CodeGenerationAndCompilation.ipynb b/notebooks/FINN-CodeGenerationAndCompilation.ipynb
index 4549e2befe2027e1f3ec171a60e8d802d8aebb55..922693c8e9e12cc799b07db4bf30400cd56f803d 100644
--- a/notebooks/FINN-CodeGenerationAndCompilation.ipynb
+++ b/notebooks/FINN-CodeGenerationAndCompilation.ipynb
@@ -6,463 +6,7 @@
    "source": [
     "# FINN - Code Generation and Compilation\n",
     "-----------------------------------------------------------------\n",
-    "<font size=\"3\">This notebook should give a more detailed insight into the code generation and compilation within FINN. </font>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "<font size=\"3\">Following showSrc function is used to print the source code of function calls in the Jupyter notebook: </font>"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import inspect\n",
-    "\n",
-    "def showSrc(what):\n",
-    "    print(\"\".join(inspect.getsourcelines(what)[0]))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Outline\n",
-    "* <font size=\"3\">FINN Custom Ops </font>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## FINN Custom Ops\n",
-    "---------------------------\n",
-    "<font size=\"3\">FINN uses many custom operations (`op_type` in ONNX NodeProto) that are not defined in the ONNX operator schema. These custom nodes are marked with `domain=\"finn\"` in the protobuf to identify them as such. These nodes can represent specific operations that we need for low-bit networks, or operations that are specific to a particular hardware backend.\n",
-    "\n",
-    "A very abstract version of a custom op node representing a streaming fc layer is shown below. </font>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "`FCLayer_node = helper.make_node(\n",
-    "    \"StreamingFCLayer_Batch\",\n",
-    "    node_inp_list,\n",
-    "    node_outp_list,\n",
-    "    domain=\"finn\",\n",
-    "    backend=\"fpgadataflow\",\n",
-    "    code_gen_dir=\"\",\n",
-    "    executable_path=\"\",\n",
-    "    resType=\"ap_resource_lut()\",\n",
-    "    MW=mw,\n",
-    "    MH=mh,\n",
-    "    SIMD=simd,\n",
-    "    PE=pe,\n",
-    "    WMEM=wmem,\n",
-    "    TMEM=tmem,\n",
-    "    inputDataType=FINN-DataType,\n",
-    "    weightDataType=FINN-DataType,\n",
-    "    outputDataType=FINN-DataType,\n",
-    "    ActVal=actval,\n",
-    ")`"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    " <font size=\"3\">Unlike standard nodes, the custom op nodes has several additional attributes. The node is created using the helper function of ONNX. `\"StreamingFCLayer_Batch\"` describes the op_type, then the inputs and outputs are declared. Since this is a custom op node of FINN, the attribute `domain=\"finn\"` must be set. The streaming fc layer is a custom op from the finn-hls library, this is set in the node using the `backend` attribute. To execute a custom op from the finn-hls library, the corresponding c++ code must be created and an executable must be produced. Where the generated code is stored is specified in the `code_gen_dir` attribute and `executable_path` specifies the path to the produced executable. In addition to the data types of the input and output tensors, the node also contains various other attributes resulting from the parameters of the corresponding finn-hls library function. This will not be discussed here.</font>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "<font size=\"3\">Custom Ops are represented in Finn as ONNX nodes on the one hand and by a CustomOp class on the other hand. This allows easier access to the different attributes and introduces special custom op functions. See below for the standard CustomOp class.</font>"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "class CustomOp(ABC):\n",
-      "    def __init__(self, onnx_node):\n",
-      "        super().__init__()\n",
-      "        self.onnx_node = onnx_node\n",
-      "\n",
-      "    def get_nodeattr(self, name):\n",
-      "        \"\"\"Get a node attribute by name. Data is stored inside the ONNX node's\n",
-      "        AttributeProto container. Attribute must be part of get_nodeattr_types.\n",
-      "        Default value is returned if attribute is not set.\"\"\"\n",
-      "        try:\n",
-      "            (dtype, req, def_val) = self.get_nodeattr_types()[name]\n",
-      "            attr = get_by_name(self.onnx_node.attribute, name)\n",
-      "            if attr is not None:\n",
-      "                # dtype indicates which ONNX Attribute member to use\n",
-      "                # (such as i, f, s...)\n",
-      "                ret = attr.__getattribute__(dtype)\n",
-      "                if dtype == \"s\":\n",
-      "                    # decode string attributes\n",
-      "                    ret = ret.decode(\"utf-8\")\n",
-      "                return ret\n",
-      "            else:\n",
-      "                # not set, return default value\n",
-      "                return def_val\n",
-      "        except KeyError:\n",
-      "            raise AttributeError(\"Op has no such attribute: \" + name)\n",
-      "\n",
-      "    def set_nodeattr(self, name, value):\n",
-      "        \"\"\"Set a node attribute by name. Data is stored inside the ONNX node's\n",
-      "        AttributeProto container. Attribute must be part of get_nodeattr_types.\"\"\"\n",
-      "        try:\n",
-      "            (dtype, req, def_val) = self.get_nodeattr_types()[name]\n",
-      "            attr = get_by_name(self.onnx_node.attribute, name)\n",
-      "            if attr is not None:\n",
-      "                # dtype indicates which ONNX Attribute member to use\n",
-      "                # (such as i, f, s...)\n",
-      "                if dtype == \"s\":\n",
-      "                    # encode string attributes\n",
-      "                    value = value.encode(\"utf-8\")\n",
-      "                attr.__setattr__(dtype, value)\n",
-      "            else:\n",
-      "                # not set, create and insert AttributeProto\n",
-      "                attr_proto = helper.make_attribute(name, value)\n",
-      "                self.onnx_node.attribute.append(attr_proto)\n",
-      "        except KeyError:\n",
-      "            raise AttributeError(\"Op has no such attribute: \" + name)\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def get_nodeattr_types(self):\n",
-      "        \"\"\"Returns a dict of permitted attributes for node, where:\n",
-      "            returned_dict[attribute_name] = (dtype, require, default_value)\n",
-      "            - dtype indicates which member of the ONNX AttributeProto\n",
-      "            will be utilized\n",
-      "            - require indicates whether this attribute is required\n",
-      "            - default_val indicates the default value that will be used if the\n",
-      "            attribute is not set\n",
-      "        \"\"\"\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def make_shape_compatible_op(self):\n",
-      "        \"\"\"Returns a standard ONNX op which is compatible with this CustomOp\n",
-      "        for performing shape inference.\"\"\"\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def infer_node_datatype(self, model):\n",
-      "        \"\"\"Set the DataType annotations corresponding to the outputs of this\n",
-      "        node.\"\"\"\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def execute_node(self, context, graph):\n",
-      "        \"\"\"Execute this CustomOp instance, given the execution context and\n",
-      "        ONNX graph.\"\"\"\n",
-      "        pass\n",
-      "\n"
-     ]
-    }
-   ],
-   "source": [
-    "from finn.custom_op import CustomOp\n",
-    "showSrc(CustomOp)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "<font size=\"3\">When instantiating the class, the ONNX node is passed to access all attributes of the node within the class. This is accompanied by the functions `get_nodeattr()`and `set_nodeattr()`, which each instance of this class has. Furthermore 4 abstract methods are implemented, which are described in more detail in the comments in the code. </font>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "<font size=\"3\">If it is a node from the finn-hls library another class is used which is derived from the CustomOp class:</font>"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "class HLSCustomOp(CustomOp):\n",
-      "    def __init__(self, onnx_node):\n",
-      "        super().__init__(onnx_node)\n",
-      "        # template for single node execution\n",
-      "        self.docompute_template = \"\"\"\n",
-      "        #include \"cnpy.h\"\n",
-      "        #include \"npy2apintstream.hpp\"\n",
-      "        #include <vector>\n",
-      "        #include \"bnn-library.h\"\n",
-      "\n",
-      "        // includes for network parameters\n",
-      "        $GLOBALS$\n",
-      "\n",
-      "        // defines for network parameters\n",
-      "        $DEFINES$\n",
-      "\n",
-      "        int main(){\n",
-      "\n",
-      "        $STREAMDECLARATIONS$\n",
-      "\n",
-      "        $READNPYDATA$\n",
-      "\n",
-      "        $DOCOMPUTE$\n",
-      "\n",
-      "        $DATAOUTSTREAM$\n",
-      "\n",
-      "        $SAVEASCNPY$\n",
-      "\n",
-      "        }\n",
-      "\n",
-      "        \"\"\"\n",
-      "        self.code_gen_dict = {}\n",
-      "\n",
-      "    def get_nodeattr_types(self):\n",
-      "        return {\"code_gen_dir\": (\"s\", False, \"\"), \"executable_path\": (\"s\", False, \"\")}\n",
-      "\n",
-      "    def code_generation(self, model):\n",
-      "        node = self.onnx_node\n",
-      "        self.generate_params(model)\n",
-      "        self.global_includes()\n",
-      "        self.defines()\n",
-      "        self.read_npy_data()\n",
-      "        self.strm_decl()\n",
-      "        self.docompute()\n",
-      "        self.dataoutstrm()\n",
-      "        self.save_as_npy()\n",
-      "\n",
-      "        template = self.docompute_template\n",
-      "\n",
-      "        for key in self.code_gen_dict:\n",
-      "            # transform list into long string separated by '\\n'\n",
-      "            code_gen_line = \"\\n\".join(self.code_gen_dict[key])\n",
-      "            template = template.replace(key, code_gen_line)\n",
-      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
-      "        f = open(os.path.join(code_gen_dir, \"execute_{}.cpp\".format(node.op_type)), \"w\")\n",
-      "        f.write(template)\n",
-      "        f.close()\n",
-      "\n",
-      "    def compile_singlenode_code(self):\n",
-      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
-      "        builder = CppBuilder()\n",
-      "        builder.append_includes(\"-I/workspace/finn/src/finn/data/cpp\")\n",
-      "        builder.append_includes(\"-I/workspace/cnpy/\")\n",
-      "        builder.append_includes(\"-I/workspace/finn-hlslib\")\n",
-      "        builder.append_includes(\"-I/workspace/vivado-hlslib\")\n",
-      "        builder.append_includes(\"--std=c++11\")\n",
-      "        builder.append_sources(code_gen_dir + \"/*.cpp\")\n",
-      "        builder.append_sources(\"/workspace/cnpy/cnpy.cpp\")\n",
-      "        builder.append_includes(\"-lz\")\n",
-      "        builder.set_executable_path(code_gen_dir + \"/node_model\")\n",
-      "        builder.build(code_gen_dir)\n",
-      "        self.set_nodeattr(\"executable_path\", builder.executable_path)\n",
-      "\n",
-      "    def dynamic_input_to_npy(self, context, count):\n",
-      "        node = self.onnx_node\n",
-      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
-      "        if code_gen_dir == \"\":\n",
-      "            raise Exception(\n",
-      "                \"\"\"\n",
-      "Found no codegen dir for this node, did you run the codegen transformation?\n",
-      "            \"\"\"\n",
-      "            )\n",
-      "        # create a npy file for each input of the node (in_ind is input index)\n",
-      "        # assuming dynamic inputs start from 0\n",
-      "        for in_ind in range(count):\n",
-      "            current_input_name = node.input[in_ind]\n",
-      "            np.save(\n",
-      "                os.path.join(code_gen_dir, \"input_{}.npy\".format(in_ind)),\n",
-      "                context[current_input_name],\n",
-      "            )\n",
-      "\n",
-      "    def npy_to_dynamic_output(self, context):\n",
-      "        # TODO support multi-output nodes as needed\n",
-      "        node = self.onnx_node\n",
-      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
-      "        output = np.load(\"{}/output.npy\".format(code_gen_dir))\n",
-      "        context[node.output[0]] = output\n",
-      "\n",
-      "    def exec_precompiled_singlenode_model(self):\n",
-      "        # execute precompiled executable\n",
-      "        executable_path = self.get_nodeattr(\"executable_path\")\n",
-      "        if executable_path == \"\":\n",
-      "            raise Exception(\n",
-      "                \"\"\"\n",
-      "Found no executable for this node, did you run the codegen and\n",
-      "compilation transformations?\n",
-      "            \"\"\"\n",
-      "            )\n",
-      "        process_execute = subprocess.Popen(executable_path, stdout=subprocess.PIPE)\n",
-      "        process_execute.communicate()\n",
-      "\n",
-      "    def execute_node(self, context, graph):\n",
-      "        # save input(s)\n",
-      "        self.dynamic_input_to_npy(context, 1)\n",
-      "        # execute the precompiled model\n",
-      "        self.exec_precompiled_singlenode_model()\n",
-      "        # load output npy file\n",
-      "        self.npy_to_dynamic_output(context)\n",
-      "\n",
-      "    def generate_params(self, model):\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def global_includes(self):\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def defines(self):\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def read_npy_data(self):\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def strm_decl(self):\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def docompute(self):\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def dataoutstrm(self):\n",
-      "        pass\n",
-      "\n",
-      "    @abstractmethod\n",
-      "    def save_as_npy(self):\n",
-      "        pass\n",
-      "\n"
-     ]
-    }
-   ],
-   "source": [
-    "from finn.custom_op.fpgadataflow import HLSCustomOp\n",
-    "showSrc(HLSCustomOp)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "<font size=\"3\">When creating an instance of this class, a template is introduced, which forms the layout for the c++ code to execute the node. It has some general constructs, like the inclusion of bnn-library.h, which contains the references to the finn-hls library, and of cnpy.h and npy2apintstream.hpp, which support the transfer of python numpy arrays in c++. The idea of this template is to replace the variables marked with `$ $` with c++ calls during code generation. Then the template can be written into a .cpp file and be compiled.\n",
-    "\n",
-    "Each instance of this class must have an attribute `code_gen_dir` and `executable_path`, since to execute these nodes c++ code must be generated and correspondingly the executables. This is specified in the `get_nodeattr_types()` function.\n",
-    "\n",
-    "In the function `code_generation()` all functions required for code generation are called and the `$ $` variables in the template are replaced accordingly and written into a .cpp file. A special function is `generate_params()`. For example, if the node is a streaming fc layer, there are weights and activation values, which are written to separate .h and added to the template using `#include`. For streaming fc layer it looks like this:\n",
-    "</font>\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "    def generate_params(self, model):\n",
-      "        # weights\n",
-      "        weights = model.get_initializer(self.onnx_node.input[1])\n",
-      "        # convert weights into hlslib-compatible format\n",
-      "        weight_tensor = self.get_hls_compatible_weight_tensor(weights)\n",
-      "        export_wdt = self.get_weight_datatype()\n",
-      "        # we have converted bipolar weights to binary for export,\n",
-      "        # so use it as such for weight generation\n",
-      "        if self.get_weight_datatype() == DataType.BIPOLAR:\n",
-      "            export_wdt = DataType.BINARY\n",
-      "        weight_hls_code = numpy_to_hls_code(\n",
-      "            weight_tensor, export_wdt, \"weights\", True, True\n",
-      "        )\n",
-      "        # write weights into params.h\n",
-      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
-      "        f_weights = open(\"{}/params.h\".format(code_gen_dir), \"w\")\n",
-      "\n",
-      "        if export_wdt.bitwidth() != 1:\n",
-      "            f_weights.write(\n",
-      "                \"static FixedPointWeights<{},{},{},{}> weights = \".format(\n",
-      "                    self.get_nodeattr(\"SIMD\"),\n",
-      "                    export_wdt.get_hls_datatype_str(),\n",
-      "                    self.get_nodeattr(\"PE\"),\n",
-      "                    self.get_nodeattr(\"WMEM\"),\n",
-      "                )\n",
-      "            )\n",
-      "        else:\n",
-      "            f_weights.write(\n",
-      "                \"static BinaryWeights<{},{},{}> weights = \".format(\n",
-      "                    self.get_nodeattr(\"SIMD\"),\n",
-      "                    self.get_nodeattr(\"PE\"),\n",
-      "                    self.get_nodeattr(\"WMEM\"),\n",
-      "                )\n",
-      "            )\n",
-      "        f_weights.write(weight_hls_code)\n",
-      "        f_weights.close()\n",
-      "        # thresholds\n",
-      "        if len(self.onnx_node.input) > 2:\n",
-      "            thresholds = model.get_initializer(self.onnx_node.input[2])\n",
-      "            if thresholds is not None:\n",
-      "                threshold_tensor = self.get_hls_compatible_threshold_tensor(thresholds)\n",
-      "                tdt = DataType.INT32\n",
-      "                # use UINT32 threshold export for bipolar times bipolar\n",
-      "                inp_is_bipolar = self.get_input_datatype() == DataType.BIPOLAR\n",
-      "                wt_is_bipolar = self.get_weight_datatype() == DataType.BIPOLAR\n",
-      "                if inp_is_bipolar and wt_is_bipolar:\n",
-      "                    tdt = DataType.UINT32\n",
-      "                thresholds_hls_code = numpy_to_hls_code(\n",
-      "                    threshold_tensor, tdt, \"thresholds\", False, True\n",
-      "                )\n",
-      "                # write thresholds into thresh.h\n",
-      "                code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
-      "                f_thresh = open(\"{}/thresh.h\".format(code_gen_dir), \"w\")\n",
-      "                tdt_hls = tdt.get_hls_datatype_str()\n",
-      "                # use binary to export bipolar activations\n",
-      "                export_odt = self.get_output_datatype()\n",
-      "                if self.get_output_datatype() == DataType.BIPOLAR:\n",
-      "                    export_odt = DataType.BINARY\n",
-      "                odt_hls = export_odt.get_hls_datatype_str()\n",
-      "                f_thresh.write(\n",
-      "                    \"static ThresholdsActivation<{},{},{},{},{},{},{}> threshs \\\n",
-      "                     = \".format(\n",
-      "                        self.get_nodeattr(\"TMEM\"),\n",
-      "                        self.get_nodeattr(\"PE\"),\n",
-      "                        threshold_tensor.shape[-1],\n",
-      "                        tdt_hls,\n",
-      "                        odt_hls,\n",
-      "                        self.get_nodeattr(\"ActVal\"),\n",
-      "                        \"std::less_equal<%s>\" % tdt_hls,\n",
-      "                    )\n",
-      "                )\n",
-      "                f_thresh.write(thresholds_hls_code)\n",
-      "                f_thresh.close()\n",
-      "\n"
-     ]
-    }
-   ],
-   "source": [
-    "from finn.custom_op.fpgadataflow.streamingfclayer_batch import StreamingFCLayer_Batch\n",
-    "showSrc(StreamingFCLayer_Batch.generate_params)"
+    "This notebook is about code generation and compilation to enable execution of FINN "
    ]
   },
   {
diff --git a/notebooks/FINN-CustomOps.ipynb b/notebooks/FINN-CustomOps.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..1f60546e342564dfbd6b3edf5472f0ba63ddbb0a
--- /dev/null
+++ b/notebooks/FINN-CustomOps.ipynb
@@ -0,0 +1,522 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# FINN - CustomOps\n",
+    "-----------------------------------------------------------------\n",
+    "<font size=\"3\">This notebook gives a more detailed insight into FINN custom operation nodes. </font>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">The following `showSrc` function is used to print the source code of classes and functions in the Jupyter notebook: </font>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import inspect\n",
+    "\n",
+    "def showSrc(what):\n",
+    "    print(\"\".join(inspect.getsourcelines(what)[0]))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## FINN Custom Ops\n",
+    "---------------------------\n",
+    "<font size=\"3\">FINN uses many custom operations (`op_type` in ONNX NodeProto) that are not defined in the ONNX operator schema. These custom nodes are marked with `domain=\"finn\"` in the protobuf to identify them as such. These nodes can represent specific operations that we need for low-bit networks, or operations that are specific to a particular hardware backend.\n",
+    "\n",
+    "A very abstract version of a custom op node representing a streaming fc layer is shown below. </font>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "`FCLayer_node = helper.make_node(\n",
+    "    \"StreamingFCLayer_Batch\",\n",
+    "    node_inp_list,\n",
+    "    node_outp_list,\n",
+    "    domain=\"finn\",\n",
+    "    backend=\"fpgadataflow\",\n",
+    "    code_gen_dir=\"\",\n",
+    "    executable_path=\"\",\n",
+    "    resType=\"ap_resource_lut()\",\n",
+    "    MW=mw,\n",
+    "    MH=mh,\n",
+    "    SIMD=simd,\n",
+    "    PE=pe,\n",
+    "    WMEM=wmem,\n",
+    "    TMEM=tmem,\n",
+    "    inputDataType=FINN-DataType,\n",
+    "    weightDataType=FINN-DataType,\n",
+    "    outputDataType=FINN-DataType,\n",
+    "    ActVal=actval,\n",
+    ")`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    " <font size=\"3\">Unlike standard nodes, custom op nodes have several additional attributes. The node is created using the ONNX helper function `helper.make_node`. `\"StreamingFCLayer_Batch\"` is the op_type, followed by the input and output lists. Since this is a FINN custom op node, the attribute `domain=\"finn\"` must be set. The streaming fc layer is a custom op from the finn-hls library, which is indicated in the node by the `backend` attribute. To execute a custom op from the finn-hls library, the corresponding C++ code must be generated and compiled into an executable. The `code_gen_dir` attribute specifies where the generated code is stored, and `executable_path` specifies the path to the resulting executable. In addition to the data types of the input, weight and output tensors, the node contains various other attributes that correspond to the parameters of the finn-hls library function. These are not discussed here.</font>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">In FINN, custom ops are represented both as ONNX nodes and by a `CustomOp` class. The class allows easier access to the node attributes and provides functions specific to custom ops. The base `CustomOp` class is shown below.</font>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "class CustomOp(ABC):\n",
+      "    def __init__(self, onnx_node):\n",
+      "        super().__init__()\n",
+      "        self.onnx_node = onnx_node\n",
+      "\n",
+      "    def get_nodeattr(self, name):\n",
+      "        \"\"\"Get a node attribute by name. Data is stored inside the ONNX node's\n",
+      "        AttributeProto container. Attribute must be part of get_nodeattr_types.\n",
+      "        Default value is returned if attribute is not set.\"\"\"\n",
+      "        try:\n",
+      "            (dtype, req, def_val) = self.get_nodeattr_types()[name]\n",
+      "            attr = get_by_name(self.onnx_node.attribute, name)\n",
+      "            if attr is not None:\n",
+      "                # dtype indicates which ONNX Attribute member to use\n",
+      "                # (such as i, f, s...)\n",
+      "                ret = attr.__getattribute__(dtype)\n",
+      "                if dtype == \"s\":\n",
+      "                    # decode string attributes\n",
+      "                    ret = ret.decode(\"utf-8\")\n",
+      "                return ret\n",
+      "            else:\n",
+      "                # not set, return default value\n",
+      "                return def_val\n",
+      "        except KeyError:\n",
+      "            raise AttributeError(\"Op has no such attribute: \" + name)\n",
+      "\n",
+      "    def set_nodeattr(self, name, value):\n",
+      "        \"\"\"Set a node attribute by name. Data is stored inside the ONNX node's\n",
+      "        AttributeProto container. Attribute must be part of get_nodeattr_types.\"\"\"\n",
+      "        try:\n",
+      "            (dtype, req, def_val) = self.get_nodeattr_types()[name]\n",
+      "            attr = get_by_name(self.onnx_node.attribute, name)\n",
+      "            if attr is not None:\n",
+      "                # dtype indicates which ONNX Attribute member to use\n",
+      "                # (such as i, f, s...)\n",
+      "                if dtype == \"s\":\n",
+      "                    # encode string attributes\n",
+      "                    value = value.encode(\"utf-8\")\n",
+      "                attr.__setattr__(dtype, value)\n",
+      "            else:\n",
+      "                # not set, create and insert AttributeProto\n",
+      "                attr_proto = helper.make_attribute(name, value)\n",
+      "                self.onnx_node.attribute.append(attr_proto)\n",
+      "        except KeyError:\n",
+      "            raise AttributeError(\"Op has no such attribute: \" + name)\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def get_nodeattr_types(self):\n",
+      "        \"\"\"Returns a dict of permitted attributes for node, where:\n",
+      "            returned_dict[attribute_name] = (dtype, require, default_value)\n",
+      "            - dtype indicates which member of the ONNX AttributeProto\n",
+      "            will be utilized\n",
+      "            - require indicates whether this attribute is required\n",
+      "            - default_val indicates the default value that will be used if the\n",
+      "            attribute is not set\n",
+      "        \"\"\"\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def make_shape_compatible_op(self):\n",
+      "        \"\"\"Returns a standard ONNX op which is compatible with this CustomOp\n",
+      "        for performing shape inference.\"\"\"\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def infer_node_datatype(self, model):\n",
+      "        \"\"\"Set the DataType annotations corresponding to the outputs of this\n",
+      "        node.\"\"\"\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def execute_node(self, context, graph):\n",
+      "        \"\"\"Execute this CustomOp instance, given the execution context and\n",
+      "        ONNX graph.\"\"\"\n",
+      "        pass\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from finn.custom_op import CustomOp\n",
+    "showSrc(CustomOp)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">When the class is instantiated, the ONNX node is passed in so that all attributes of the node can be accessed within the class. For this purpose, every instance provides the functions `get_nodeattr()` and `set_nodeattr()`. Furthermore, four abstract methods are declared, which subclasses must implement; they are described in more detail in the docstrings in the code. </font>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">For nodes from the finn-hls library, another class derived from `CustomOp` is used:</font>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "class HLSCustomOp(CustomOp):\n",
+      "    def __init__(self, onnx_node):\n",
+      "        super().__init__(onnx_node)\n",
+      "        # template for single node execution\n",
+      "        self.docompute_template = \"\"\"\n",
+      "        #include \"cnpy.h\"\n",
+      "        #include \"npy2apintstream.hpp\"\n",
+      "        #include <vector>\n",
+      "        #include \"bnn-library.h\"\n",
+      "\n",
+      "        // includes for network parameters\n",
+      "        $GLOBALS$\n",
+      "\n",
+      "        // defines for network parameters\n",
+      "        $DEFINES$\n",
+      "\n",
+      "        int main(){\n",
+      "\n",
+      "        $STREAMDECLARATIONS$\n",
+      "\n",
+      "        $READNPYDATA$\n",
+      "\n",
+      "        $DOCOMPUTE$\n",
+      "\n",
+      "        $DATAOUTSTREAM$\n",
+      "\n",
+      "        $SAVEASCNPY$\n",
+      "\n",
+      "        }\n",
+      "\n",
+      "        \"\"\"\n",
+      "        self.code_gen_dict = {}\n",
+      "\n",
+      "    def get_nodeattr_types(self):\n",
+      "        return {\"code_gen_dir\": (\"s\", False, \"\"), \"executable_path\": (\"s\", False, \"\")}\n",
+      "\n",
+      "    def code_generation(self, model):\n",
+      "        node = self.onnx_node\n",
+      "        self.generate_params(model)\n",
+      "        self.global_includes()\n",
+      "        self.defines()\n",
+      "        self.read_npy_data()\n",
+      "        self.strm_decl()\n",
+      "        self.docompute()\n",
+      "        self.dataoutstrm()\n",
+      "        self.save_as_npy()\n",
+      "\n",
+      "        template = self.docompute_template\n",
+      "\n",
+      "        for key in self.code_gen_dict:\n",
+      "            # transform list into long string separated by '\\n'\n",
+      "            code_gen_line = \"\\n\".join(self.code_gen_dict[key])\n",
+      "            template = template.replace(key, code_gen_line)\n",
+      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
+      "        f = open(os.path.join(code_gen_dir, \"execute_{}.cpp\".format(node.op_type)), \"w\")\n",
+      "        f.write(template)\n",
+      "        f.close()\n",
+      "\n",
+      "    def compile_singlenode_code(self):\n",
+      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
+      "        builder = CppBuilder()\n",
+      "        builder.append_includes(\"-I/workspace/finn/src/finn/data/cpp\")\n",
+      "        builder.append_includes(\"-I/workspace/cnpy/\")\n",
+      "        builder.append_includes(\"-I/workspace/finn-hlslib\")\n",
+      "        builder.append_includes(\"-I/workspace/vivado-hlslib\")\n",
+      "        builder.append_includes(\"--std=c++11\")\n",
+      "        builder.append_sources(code_gen_dir + \"/*.cpp\")\n",
+      "        builder.append_sources(\"/workspace/cnpy/cnpy.cpp\")\n",
+      "        builder.append_includes(\"-lz\")\n",
+      "        builder.set_executable_path(code_gen_dir + \"/node_model\")\n",
+      "        builder.build(code_gen_dir)\n",
+      "        self.set_nodeattr(\"executable_path\", builder.executable_path)\n",
+      "\n",
+      "    def dynamic_input_to_npy(self, context, count):\n",
+      "        node = self.onnx_node\n",
+      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
+      "        if code_gen_dir == \"\":\n",
+      "            raise Exception(\n",
+      "                \"\"\"\n",
+      "Found no codegen dir for this node, did you run the codegen transformation?\n",
+      "            \"\"\"\n",
+      "            )\n",
+      "        # create a npy file for each input of the node (in_ind is input index)\n",
+      "        # assuming dynamic inputs start from 0\n",
+      "        for in_ind in range(count):\n",
+      "            current_input_name = node.input[in_ind]\n",
+      "            np.save(\n",
+      "                os.path.join(code_gen_dir, \"input_{}.npy\".format(in_ind)),\n",
+      "                context[current_input_name],\n",
+      "            )\n",
+      "\n",
+      "    def npy_to_dynamic_output(self, context):\n",
+      "        # TODO support multi-output nodes as needed\n",
+      "        node = self.onnx_node\n",
+      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
+      "        output = np.load(\"{}/output.npy\".format(code_gen_dir))\n",
+      "        context[node.output[0]] = output\n",
+      "\n",
+      "    def exec_precompiled_singlenode_model(self):\n",
+      "        # execute precompiled executable\n",
+      "        executable_path = self.get_nodeattr(\"executable_path\")\n",
+      "        if executable_path == \"\":\n",
+      "            raise Exception(\n",
+      "                \"\"\"\n",
+      "Found no executable for this node, did you run the codegen and\n",
+      "compilation transformations?\n",
+      "            \"\"\"\n",
+      "            )\n",
+      "        process_execute = subprocess.Popen(executable_path, stdout=subprocess.PIPE)\n",
+      "        process_execute.communicate()\n",
+      "\n",
+      "    def execute_node(self, context, graph):\n",
+      "        # save input(s)\n",
+      "        self.dynamic_input_to_npy(context, 1)\n",
+      "        # execute the precompiled model\n",
+      "        self.exec_precompiled_singlenode_model()\n",
+      "        # load output npy file\n",
+      "        self.npy_to_dynamic_output(context)\n",
+      "\n",
+      "    def generate_params(self, model):\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def global_includes(self):\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def defines(self):\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def read_npy_data(self):\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def strm_decl(self):\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def docompute(self):\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def dataoutstrm(self):\n",
+      "        pass\n",
+      "\n",
+      "    @abstractmethod\n",
+      "    def save_as_npy(self):\n",
+      "        pass\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from finn.custom_op.fpgadataflow import HLSCustomOp\n",
+    "showSrc(HLSCustomOp)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">When an instance of this class is created, a template is set up that forms the skeleton of the C++ code used to execute the node. It contains some general constructs, such as the inclusion of bnn-library.h, which contains the references to the finn-hls library, and of cnpy.h and npy2apintstream.hpp, which support transferring Python NumPy arrays to C++. The idea of this template is to replace the variables marked with `$ $` with C++ calls during code generation. The template can then be written to a .cpp file and compiled.\n",
+    "\n",
+    "**`get_nodeattr_types()`**: each instance of the HLSCustomOp class must have the attributes `code_gen_dir` and `executable_path`, because executing these nodes requires generating C++ code and compiling it into an executable.\n",
+    "\n",
+    "</font>\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">**`code_generation(model)`**: calls all functions required for code generation, replaces the `$ $` variables in the template accordingly, and writes the result to a .cpp file. Almost all of these subfunctions are declared as abstract methods in the class, so each custom op node provides its own implementation. `generate_params()` is a special case: it is not an abstract method but a normal function whose default body is just `pass`. This is because some custom op nodes have no parameters to generate, and for them the function simply does nothing. A streaming fc layer node, for example, does require parameter generation. What such a parameter generation can look like is described in more detail later in this notebook.\n",
+    "</font>"
+   ]
+  },
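+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">The substitution step of `code_generation()` can be illustrated with a minimal, self-contained sketch. The template, the dictionary contents and the helper name `fill_template` are made up for illustration and are not part of FINN; only the join-and-replace loop mirrors the class source shown in this notebook.</font>\n",
+    "\n",
+    "```python\n",
+    "# Illustrative template; FINN's real template has more placeholders.\n",
+    "template = '''\n",
+    "$GLOBALS$\n",
+    "\n",
+    "int main(){\n",
+    "$DOCOMPUTE$\n",
+    "}\n",
+    "'''\n",
+    "\n",
+    "# Each placeholder maps to a list of generated C++ lines (made-up content).\n",
+    "code_gen_dict = {\n",
+    "    '$GLOBALS$': ['#include <iostream>'],\n",
+    "    '$DOCOMPUTE$': ['std::cout << 1 + 2 << std::endl;', 'return 0;'],\n",
+    "}\n",
+    "\n",
+    "\n",
+    "def fill_template(template, code_gen_dict):\n",
+    "    # hypothetical helper, mirrors the loop in code_generation()\n",
+    "    for key, lines in code_gen_dict.items():\n",
+    "        # transform the list into one string separated by newlines\n",
+    "        template = template.replace(key, '\\n'.join(lines))\n",
+    "    return template\n",
+    "\n",
+    "\n",
+    "generated = fill_template(template, code_gen_dict)\n",
+    "print(generated)\n",
+    "```\n",
+    "\n",
+    "<font size=\"3\">Keeping the code for each placeholder as a list of lines lets a custom op append lines one at a time and leaves the joining to a single place.</font>"
+   ]
+  },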
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">**`compile_singlenode_code()`**: builds the compile command for the generated code. It creates an instance of the `CppBuilder` class and assembles the various components of the command (include paths, compiler flags, source files and the path of the executable). The `.build()` function then creates the executable, and the `executable_path` attribute is set accordingly. The class `CppBuilder` is a transformation and a more detailed description can be found in Jupyter notebook *FINN-CodeGenerationAndCompilation*.\n",
+    "</font>"
+   ]
+  },
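+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">The idea of assembling a compile command step by step can be sketched as follows. `SimpleCppBuilder`, its `command()` method, the default compiler `g++` and the `/tmp/code_gen` paths are hypothetical and only loosely modelled on the calls in `compile_singlenode_code()`; this is not FINN's `CppBuilder`.</font>\n",
+    "\n",
+    "```python\n",
+    "class SimpleCppBuilder:\n",
+    "    # hypothetical stand-in that only collects the parts of a compile command\n",
+    "    def __init__(self, compiler='g++'):\n",
+    "        self.compiler = compiler\n",
+    "        self.include_flags = []\n",
+    "        self.sources = []\n",
+    "        self.executable_path = ''\n",
+    "\n",
+    "    def append_includes(self, flag):\n",
+    "        self.include_flags.append(flag)\n",
+    "\n",
+    "    def append_sources(self, path):\n",
+    "        self.sources.append(path)\n",
+    "\n",
+    "    def set_executable_path(self, path):\n",
+    "        self.executable_path = path\n",
+    "\n",
+    "    def command(self):\n",
+    "        # sources first, then flags, then the output path\n",
+    "        parts = [self.compiler] + self.sources + self.include_flags\n",
+    "        parts += ['-o', self.executable_path]\n",
+    "        return ' '.join(parts)\n",
+    "\n",
+    "\n",
+    "builder = SimpleCppBuilder()\n",
+    "builder.append_includes('-I/workspace/cnpy/')\n",
+    "builder.append_includes('--std=c++11')\n",
+    "builder.append_sources('/tmp/code_gen/*.cpp')  # assumed code_gen_dir\n",
+    "builder.append_includes('-lz')\n",
+    "builder.set_executable_path('/tmp/code_gen/node_model')\n",
+    "cmd = builder.command()\n",
+    "print(cmd)\n",
+    "```\n",
+    "\n",
+    "<font size=\"3\">The sketch only builds the command string. Running it, for example via `subprocess`, is left out so that the example works without a compiler.</font>"
+   ]
+  },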
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
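+    "<font size=\"3\">**`dynamic_input_to_npy(context, count)`**: saves the first `count` inputs of the node from the execution context as .npy files (`input_0.npy`, `input_1.npy`, ...) in `code_gen_dir`. It raises an exception if no `code_gen_dir` has been set for the node.</font>"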
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Generate Parameter\n",
+    "<font size=\"3\">Parameters have to be generated for specific types of HLSCustomOps. For example, if the node is a streaming fc layer, it has weights and, optionally, thresholds for the activation, which are written to separate .h files (`params.h` and `thresh.h`) and added to the template using `#include`. For a streaming fc layer, the parameter generation looks like this:\n",
+    "</font>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "    def generate_params(self, model):\n",
+      "        # weights\n",
+      "        weights = model.get_initializer(self.onnx_node.input[1])\n",
+      "        # convert weights into hlslib-compatible format\n",
+      "        weight_tensor = self.get_hls_compatible_weight_tensor(weights)\n",
+      "        export_wdt = self.get_weight_datatype()\n",
+      "        # we have converted bipolar weights to binary for export,\n",
+      "        # so use it as such for weight generation\n",
+      "        if self.get_weight_datatype() == DataType.BIPOLAR:\n",
+      "            export_wdt = DataType.BINARY\n",
+      "        weight_hls_code = numpy_to_hls_code(\n",
+      "            weight_tensor, export_wdt, \"weights\", True, True\n",
+      "        )\n",
+      "        # write weights into params.h\n",
+      "        code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
+      "        f_weights = open(\"{}/params.h\".format(code_gen_dir), \"w\")\n",
+      "\n",
+      "        if export_wdt.bitwidth() != 1:\n",
+      "            f_weights.write(\n",
+      "                \"static FixedPointWeights<{},{},{},{}> weights = \".format(\n",
+      "                    self.get_nodeattr(\"SIMD\"),\n",
+      "                    export_wdt.get_hls_datatype_str(),\n",
+      "                    self.get_nodeattr(\"PE\"),\n",
+      "                    self.get_nodeattr(\"WMEM\"),\n",
+      "                )\n",
+      "            )\n",
+      "        else:\n",
+      "            f_weights.write(\n",
+      "                \"static BinaryWeights<{},{},{}> weights = \".format(\n",
+      "                    self.get_nodeattr(\"SIMD\"),\n",
+      "                    self.get_nodeattr(\"PE\"),\n",
+      "                    self.get_nodeattr(\"WMEM\"),\n",
+      "                )\n",
+      "            )\n",
+      "        f_weights.write(weight_hls_code)\n",
+      "        f_weights.close()\n",
+      "        # thresholds\n",
+      "        if len(self.onnx_node.input) > 2:\n",
+      "            thresholds = model.get_initializer(self.onnx_node.input[2])\n",
+      "            if thresholds is not None:\n",
+      "                threshold_tensor = self.get_hls_compatible_threshold_tensor(thresholds)\n",
+      "                tdt = DataType.INT32\n",
+      "                # use UINT32 threshold export for bipolar times bipolar\n",
+      "                inp_is_bipolar = self.get_input_datatype() == DataType.BIPOLAR\n",
+      "                wt_is_bipolar = self.get_weight_datatype() == DataType.BIPOLAR\n",
+      "                if inp_is_bipolar and wt_is_bipolar:\n",
+      "                    tdt = DataType.UINT32\n",
+      "                thresholds_hls_code = numpy_to_hls_code(\n",
+      "                    threshold_tensor, tdt, \"thresholds\", False, True\n",
+      "                )\n",
+      "                # write thresholds into thresh.h\n",
+      "                code_gen_dir = self.get_nodeattr(\"code_gen_dir\")\n",
+      "                f_thresh = open(\"{}/thresh.h\".format(code_gen_dir), \"w\")\n",
+      "                tdt_hls = tdt.get_hls_datatype_str()\n",
+      "                # use binary to export bipolar activations\n",
+      "                export_odt = self.get_output_datatype()\n",
+      "                if self.get_output_datatype() == DataType.BIPOLAR:\n",
+      "                    export_odt = DataType.BINARY\n",
+      "                odt_hls = export_odt.get_hls_datatype_str()\n",
+      "                f_thresh.write(\n",
+      "                    \"static ThresholdsActivation<{},{},{},{},{},{},{}> threshs \\\n",
+      "                     = \".format(\n",
+      "                        self.get_nodeattr(\"TMEM\"),\n",
+      "                        self.get_nodeattr(\"PE\"),\n",
+      "                        threshold_tensor.shape[-1],\n",
+      "                        tdt_hls,\n",
+      "                        odt_hls,\n",
+      "                        self.get_nodeattr(\"ActVal\"),\n",
+      "                        \"std::less_equal<%s>\" % tdt_hls,\n",
+      "                    )\n",
+      "                )\n",
+      "                f_thresh.write(thresholds_hls_code)\n",
+      "                f_thresh.close()\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from finn.custom_op.fpgadataflow.streamingfclayer_batch import StreamingFCLayer_Batch\n",
+    "showSrc(StreamingFCLayer_Batch.generate_params)"
+   ]
+  },
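+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font size=\"3\">The core idea of parameter generation, writing values into a header file that the generated C++ code can `#include`, can be sketched in a few lines. The helpers `to_c_initializer` and `write_param_header` and the plain `int` array declaration are assumptions for illustration; the real function relies on `numpy_to_hls_code` and emits `BinaryWeights` or `FixedPointWeights` declarations.</font>\n",
+    "\n",
+    "```python\n",
+    "import os\n",
+    "import tempfile\n",
+    "\n",
+    "\n",
+    "def to_c_initializer(values):\n",
+    "    # hypothetical helper: nested lists become a C-style brace initializer\n",
+    "    if isinstance(values, list):\n",
+    "        return '{' + ', '.join(to_c_initializer(v) for v in values) + '}'\n",
+    "    return str(values)\n",
+    "\n",
+    "\n",
+    "def write_param_header(code_gen_dir, decl, values, filename='params.h'):\n",
+    "    # hypothetical helper: write one declaration plus initializer to a header\n",
+    "    path = os.path.join(code_gen_dir, filename)\n",
+    "    with open(path, 'w') as f:\n",
+    "        f.write('{} = {};\\n'.format(decl, to_c_initializer(values)))\n",
+    "    return path\n",
+    "\n",
+    "\n",
+    "# a temporary directory stands in for the node's code_gen_dir\n",
+    "code_gen_dir = tempfile.mkdtemp()\n",
+    "path = write_param_header(\n",
+    "    code_gen_dir, 'static int weights[2][2]', [[1, -1], [0, 1]]\n",
+    ")\n",
+    "with open(path) as f:\n",
+    "    header = f.read()\n",
+    "print(header)\n",
+    "```\n",
+    "\n",
+    "<font size=\"3\">Using a `with` block closes the file even if formatting fails, which the explicit `open()`/`close()` pairs in the original do not guarantee.</font>"
+   ]
+  },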
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}