[Notebook] Update and rerun basics Jupyter notebooks

Commit dec3aa75 authored 4 years ago by auphelia
--- a/notebooks/basics/0_how_to_work_with_onnx.ipynb
+++ b/notebooks/basics/0_how_to_work_with_onnx.ipynb
@@ -6,7 +6,7 @@
   "source": [
    "# FINN - How to work with ONNX\n",
    "\n",
-    "This notebook should give an overview of ONNX ProtoBuf, help to create and manipulate an ONNX model and use FINN functions to work with it. There may be overlaps to notebook [ModelWrapper](2_modelwrapper.ipynb), but this notebook will give an overview about the handling of ONNX models in FINN."
+    "This notebook should give an overview of ONNX ProtoBuf, help to create and manipulate an ONNX model and use FINN functions to work with it."
   ]
  },
  {
@@ -14,7 +14,7 @@
   "metadata": {},
   "source": [
    "## Outline\n",
-    "* #### How to create a simple model\n",
+    "* #### How to create a simple ONNX model\n",
    "* #### How to manipulate an ONNX model"
   ]
  },
@@ -22,7 +22,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### How to create a simple model\n",
+    "### How to create a simple ONNX model\n",
    "\n",
    "To explain how to create an ONNX model a simple example with mathematical operations is used. All nodes are from the [standard operations library of ONNX](https://github.com/onnx/onnx/blob/master/docs/Operators.md).\n",
    "\n",
@@ -93,7 +93,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "The names of the inputs and outputs of the nodes give already an idea of the structure of the resulting graph. In order to integrate the nodes into a graph environment, the inputs and outputs of the graph have to be specified first. In ONNX all data edges are processed as tensors. So with the helper function tensor value infos are created for the input and output tensors of the graph. Float from ONNX is used as data type. "
+    "The names of the inputs and outputs of the nodes give already an idea of the structure of the resulting graph. In order to integrate the nodes into a graph environment, the inputs and outputs of the graph have to be specified first. In ONNX all data edges are processed as tensors. So with onnx helper function tensors value infos are created for the input and output tensors of the graph. Float from ONNX is used as data type. "
   ]
  },
  {
@@ -159,14 +159,14 @@
   "outputs": [],
   "source": [
    "onnx_model = onnx.helper.make_model(graph, producer_name=\"simple-model\")\n",
-    "onnx.save(onnx_model, 'simple_model.onnx')"
+    "onnx.save(onnx_model, '/tmp/simple_model.onnx')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "To visualize the created model, [netron](https://github.com/lutzroeder/netron) can be used. Netron is a visualizer for neural network, deep learning and machine learning models."
+    "To visualize the created model, [netron](https://github.com/lutzroeder/netron) can be used. Netron is a visualizer for neural network, deep learning and machine learning models. FINN provides a utility function for visualization with netron, which we import and use in the following."
   ]
  },
  {
@@ -189,7 +189,7 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "Serving 'simple_model.onnx' at http://0.0.0.0:8081\n"
+      "Serving '/tmp/simple_model.onnx' at http://0.0.0.0:8081\n"
     ]
    },
    {
@@ -206,7 +206,7 @@
       "        "
      ],
      "text/plain": [
-       "<IPython.lib.display.IFrame at 0x7fb9303c7b38>"
+       "<IPython.lib.display.IFrame at 0x7fcdfc956b70>"
      ]
     },
     "execution_count": 7,
@@ -215,7 +215,7 @@
    }
   ],
   "source": [
-    "showInNetron('simple_model.onnx')"
+    "showInNetron('/tmp/simple_model.onnx')"
   ]
  },
  {
@@ -284,7 +284,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "To run the model and calculate the output, [onnxruntime](https://github.com/microsoft/onnxruntime) can be used. ONNX Runtime is a performance-focused complete scoring engine for Open Neural Network Exchange (ONNX) models from Microsoft. The `.InferenceSession` function is used to create a session of the model and `.run` is used to execute the model."
+    "To run the model and calculate the output, [onnxruntime](https://github.com/microsoft/onnxruntime) can be used. ONNX Runtime is a performance-focused complete scoring engine for ONNX models from Microsoft. The `.InferenceSession` function is used to create a session of the model and `.run` is used to execute the model."
   ]
  },
  {
@@ -316,16 +316,16 @@
     "output_type": "stream",
     "text": [
      "The output of the ONNX model is: \n",
-      "[[ 1. 16.  3. 10.]\n",
-      " [ 5. 17. 17. 13.]\n",
-      " [ 3. 11. 10. 17.]\n",
-      " [ 9.  2.  4.  8.]]\n",
+      "[[22. 13. 21.  8.]\n",
+      " [ 0.  8. 11.  1.]\n",
+      " [ 3. 12.  8.  2.]\n",
+      " [ 0.  6.  1.  4.]]\n",
      "\n",
      "The output of the reference function is: \n",
-      "[[ 1. 16.  3. 10.]\n",
-      " [ 5. 17. 17. 13.]\n",
-      " [ 3. 11. 10. 17.]\n",
-      " [ 9.  2.  4.  8.]]\n",
+      "[[22. 13. 21.  8.]\n",
+      " [ 0.  8. 11.  1.]\n",
+      " [ 3. 12.  8.  2.]\n",
+      " [ 0.  6.  1.  4.]]\n",
      "\n",
      "The results are the same!\n"
     ]
@@ -364,7 +364,7 @@
   "source": [
    "In the following we assume that we do not know the appearance of the model, so we first try to identify whether there are two consecutive adders in the graph and then convert them into a sum node. \n",
    "\n",
-    "Here we make use of FINN. FINN provides a thin wrapper around the model which provides several additional helper functions to manipulate the graph. The code can be found [here](https://github.com/Xilinx/finn/blob/master/src/finn/core/modelwrapper.py) and you can find a more detailed description in the notebook [ModelWrapper](2_modelwrapper.ipynb)."
+    "Here we make use of FINN. FINN provides a thin wrapper around the model which provides several additional helper functions to manipulate the graph. The code can be found [here](https://github.com/Xilinx/finn/blob/master/src/finn/core/modelwrapper.py)."
   ]
  },
  {
@@ -656,17 +656,17 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 24,
+   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "onnx_model1 = onnx.helper.make_model(graph, producer_name=\"simple-model1\")\n",
-    "onnx.save(onnx_model1, 'simple_model1.onnx')"
+    "onnx.save(onnx_model1, '/tmp/simple_model1.onnx')"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 25,
+   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
@@ -675,7 +675,7 @@
     "text": [
      "\n",
      "Stopping http://0.0.0.0:8081\n",
-      "Serving 'simple_model1.onnx' at http://0.0.0.0:8081\n"
+      "Serving '/tmp/simple_model1.onnx' at http://0.0.0.0:8081\n"
     ]
    },
    {
@@ -692,16 +692,16 @@
       "        "
      ],
      "text/plain": [
-       "<IPython.lib.display.IFrame at 0x7fb93018f9e8>"
+       "<IPython.lib.display.IFrame at 0x7fcdfc130cc0>"
      ]
     },
-     "execution_count": 25,
+     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "showInNetron('simple_model1.onnx')"
+    "showInNetron('/tmp/simple_model1.onnx')"
   ]
  },
  {
@@ -713,7 +713,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 26,
+   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -723,7 +723,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 27,
+   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
@@ -731,16 +731,16 @@
     "output_type": "stream",
     "text": [
      "The output of the manipulated ONNX model is: \n",
-      "[[ 1. 16.  3. 10.]\n",
-      " [ 5. 17. 17. 13.]\n",
-      " [ 3. 11. 10. 17.]\n",
-      " [ 9.  2.  4.  8.]]\n",
+      "[[22. 13. 21.  8.]\n",
+      " [ 0.  8. 11.  1.]\n",
+      " [ 3. 12.  8.  2.]\n",
+      " [ 0.  6.  1.  4.]]\n",
      "\n",
      "The output of the reference function is: \n",
-      "[[ 1. 16.  3. 10.]\n",
-      " [ 5. 17. 17. 13.]\n",
-      " [ 3. 11. 10. 17.]\n",
-      " [ 9.  2.  4.  8.]]\n",
+      "[[22. 13. 21.  8.]\n",
+      " [ 0.  8. 11.  1.]\n",
+      " [ 3. 12.  8.  2.]\n",
+      " [ 0.  6.  1.  4.]]\n",
      "\n",
      "The results are the same!\n"
     ]

 %% Cell type:markdown id: tags:

 # FINN - How to work with ONNX

-This notebook should give an overview of ONNX ProtoBuf, help to create and manipulate an ONNX model and use FINN functions to work with it. There may be overlaps to notebook [ModelWrapper](2_modelwrapper.ipynb), but this notebook will give an overview about the handling of ONNX models in FINN.
+This notebook should give an overview of ONNX ProtoBuf, help to create and manipulate an ONNX model and use FINN functions to work with it.

 %% Cell type:markdown id: tags:

 ## Outline
-* #### How to create a simple model
+* #### How to create a simple ONNX model
 * #### How to manipulate an ONNX model

 %% Cell type:markdown id: tags:

-### How to create a simple model
+### How to create a simple ONNX model

 To explain how to create an ONNX model a simple example with mathematical operations is used. All nodes are from the [standard operations library of ONNX](https://github.com/onnx/onnx/blob/master/docs/Operators.md).

 First ONNX is imported, then the helper function can be used to make a node.

 %% Cell type:code id: tags:

 ``` python
 import onnx

 Add1_node = onnx.helper.make_node(
    'Add',
    inputs=['in1', 'in2'],
    outputs=['sum1'],
    name='Add1'
 )
 ```

 %% Cell type:markdown id: tags:

 The first argument of `make_node` is the operation type. In this case it is `'Add'`, so it is an adder node. Then the names of the input and output tensors are passed, and finally the node itself is given a name.

 For this example we need two more adder nodes and one abs node. Since the output is to be rounded, a round node is needed as well.

 %% Cell type:code id: tags:

 ``` python
 Add2_node = onnx.helper.make_node(
    'Add',
    inputs=['sum1', 'in3'],
    outputs=['sum2'],
    name='Add2',
 )

 Add3_node = onnx.helper.make_node(
    'Add',
    inputs=['abs1', 'abs1'],
    outputs=['sum3'],
    name='Add3',
 )

 Abs_node = onnx.helper.make_node(
    'Abs',
    inputs=['sum2'],
    outputs=['abs1'],
    name='Abs'
 )

 Round_node = onnx.helper.make_node(
    'Round',
    inputs=['sum3'],
    outputs=['out1'],
    name='Round',
 )
 ```

 %% Cell type:markdown id: tags:

-The names of the inputs and outputs of the nodes give already an idea of the structure of the resulting graph. In order to integrate the nodes into a graph environment, the inputs and outputs of the graph have to be specified first. In ONNX all data edges are processed as tensors. So with the helper function tensor value infos are created for the input and output tensors of the graph. Float from ONNX is used as data type.
+The names of the inputs and outputs of the nodes give already an idea of the structure of the resulting graph. In order to integrate the nodes into a graph environment, the inputs and outputs of the graph have to be specified first. In ONNX all data edges are processed as tensors. So with onnx helper function tensors value infos are created for the input and output tensors of the graph. Float from ONNX is used as data type.

 %% Cell type:code id: tags:

 ``` python
 in1 = onnx.helper.make_tensor_value_info("in1", onnx.TensorProto.FLOAT, [4, 4])
 in2 = onnx.helper.make_tensor_value_info("in2", onnx.TensorProto.FLOAT, [4, 4])
 in3 = onnx.helper.make_tensor_value_info("in3", onnx.TensorProto.FLOAT, [4, 4])
 out1 = onnx.helper.make_tensor_value_info("out1", onnx.TensorProto.FLOAT, [4, 4])
 ```

 %% Cell type:markdown id: tags:

 Now the graph can be built. First all nodes are passed. Note that their order matters: the nodes must be listed in topological order, i.e. every node must appear after the nodes it depends on. This means Add2 must not be listed before Add1, because Add2 depends on the result of Add1. A name is then assigned to the graph. This is followed by the inputs and outputs.

 `value_info` of the graph contains the remaining tensors within the graph. When creating the nodes we have already defined names for the inner data edges and now these are assigned tensors of the datatype float and a certain shape.
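
The ordering requirement can be checked mechanically. The following is a minimal sketch in plain Python, assuming each node is described by a hypothetical `(name, inputs, outputs)` tuple instead of a real ONNX node object. `is_topologically_sorted` is an illustrative helper and not part of ONNX or FINN, and the sketch ignores initializers.

```python
def is_topologically_sorted(nodes, graph_inputs):
    """Return True if every node only reads tensors that are already available."""
    available = set(graph_inputs)
    for name, inputs, outputs in nodes:
        if any(tensor not in available for tensor in inputs):
            return False
        available.update(outputs)
    return True


# Toy description of the example graph (hypothetical tuples, not ONNX objects).
nodes = [
    ("Add1", ["in1", "in2"], ["sum1"]),
    ("Add2", ["sum1", "in3"], ["sum2"]),
    ("Abs", ["sum2"], ["abs1"]),
    ("Add3", ["abs1", "abs1"], ["sum3"]),
    ("Round", ["sum3"], ["out1"]),
]
graph_inputs = ["in1", "in2", "in3"]

sorted_ok = is_topologically_sorted(nodes, graph_inputs)
# Swapping Add1 and Add2 breaks the order, because Add2 reads sum1.
swapped_ok = is_topologically_sorted([nodes[1], nodes[0]] + nodes[2:], graph_inputs)
print(sorted_ok, swapped_ok)
```

The first call reports a valid order, the second one does not.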

 %% Cell type:code id: tags:

 ``` python
 graph = onnx.helper.make_graph(
        nodes=[
            Add1_node,
            Add2_node,
            Abs_node,
            Add3_node,
            Round_node,
        ],
        name="simple_graph",
        inputs=[in1, in2, in3],
        outputs=[out1],
        value_info=[
            onnx.helper.make_tensor_value_info("sum1", onnx.TensorProto.FLOAT, [4, 4]),
            onnx.helper.make_tensor_value_info("sum2", onnx.TensorProto.FLOAT, [4, 4]),
            onnx.helper.make_tensor_value_info("abs1", onnx.TensorProto.FLOAT, [4, 4]),
            onnx.helper.make_tensor_value_info("sum3", onnx.TensorProto.FLOAT, [4, 4]),
        ],
    )
 ```

 %% Cell type:markdown id: tags:

 **Important**: In our example, the shape of the tensors does not change during the calculation. This is not always the case. So you have to make sure that you specify the shape correctly.

 Now a model can be created from the graph and saved using the `.save` function. The model is saved in .onnx format and can be reloaded with `onnx.load()`. This also means that you can easily share your own model in .onnx format with others.

 %% Cell type:code id: tags:

 ``` python
 onnx_model = onnx.helper.make_model(graph, producer_name="simple-model")
-onnx.save(onnx_model, 'simple_model.onnx')
+onnx.save(onnx_model, '/tmp/simple_model.onnx')
 ```

 %% Cell type:markdown id: tags:

-To visualize the created model, [netron](https://github.com/lutzroeder/netron) can be used. Netron is a visualizer for neural network, deep learning and machine learning models.
+To visualize the created model, [netron](https://github.com/lutzroeder/netron) can be used. Netron is a visualizer for neural network, deep learning and machine learning models. FINN provides a utility function for visualization with netron, which we import and use in the following.

 %% Cell type:code id: tags:

 ``` python
 from finn.util.visualization import showInNetron
 ```

 %% Cell type:code id: tags:

 ``` python
-showInNetron('simple_model.onnx')
+showInNetron('/tmp/simple_model.onnx')
 ```

 %% Output

-    Serving 'simple_model.onnx' at http://0.0.0.0:8081
+    Serving '/tmp/simple_model.onnx' at http://0.0.0.0:8081

-    <IPython.lib.display.IFrame at 0x7fb9303c7b38>
+    <IPython.lib.display.IFrame at 0x7fcdfc956b70>

 %% Cell type:markdown id: tags:

 Netron also allows you to interactively explore the model. If you click on a node, the node attributes will be displayed.

 In order to test the resulting model, a function is first written in Python that calculates the expected output. Because numpy arrays are to be used, numpy is imported first.

 %% Cell type:code id: tags:

 ``` python
 import numpy as np

 def expected_output(in1, in2, in3):
    sum1 = np.add(in1, in2)
    sum2 = np.add(sum1, in3)
    abs1 = np.absolute(sum2)
    sum3 = np.add(abs1, abs1)
    return np.round(sum3)
 ```

 %% Cell type:markdown id: tags:

 Then the values for the three inputs are generated. Uniformly distributed random numbers between -5 and 5 are used.

 %% Cell type:code id: tags:

 ``` python
 in1_values = np.asarray(np.random.uniform(low=-5, high=5, size=(4,4)), dtype=np.float32)
 in2_values = np.asarray(np.random.uniform(low=-5, high=5, size=(4,4)), dtype=np.float32)
 in3_values = np.asarray(np.random.uniform(low=-5, high=5, size=(4,4)), dtype=np.float32)
 ```

 %% Cell type:markdown id: tags:

 We can easily pass the values to the function we just wrote to calculate the expected result. For the created model the inputs must be collected in a dictionary that maps each input tensor name to its values; this dictionary is then passed on to the model.

 %% Cell type:code id: tags:

 ``` python
 input_dict = {}
 input_dict["in1"] = in1_values
 input_dict["in2"] = in2_values
 input_dict["in3"] = in3_values
 ```

 %% Cell type:markdown id: tags:

-To run the model and calculate the output, [onnxruntime](https://github.com/microsoft/onnxruntime) can be used. ONNX Runtime is a performance-focused complete scoring engine for Open Neural Network Exchange (ONNX) models from Microsoft. The `.InferenceSession` function is used to create a session of the model and `.run` is used to execute the model.
+To run the model and calculate the output, [onnxruntime](https://github.com/microsoft/onnxruntime) can be used. ONNX Runtime is a performance-focused complete scoring engine for ONNX models from Microsoft. The `.InferenceSession` function is used to create a session of the model and `.run` is used to execute the model.

 %% Cell type:code id: tags:

 ``` python
 import onnxruntime as rt

 sess = rt.InferenceSession(onnx_model.SerializeToString())
 output = sess.run(None, input_dict)
 ```

 %% Cell type:markdown id: tags:

 The input values are also transferred to the reference function. Now the output of the execution of the model can be compared with that of the reference.
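
A note on the comparison: exact equality is reasonable here because both results are rounded to whole numbers. For unrounded floating-point results, a tolerance-based comparison is usually more robust. The sketch below illustrates the difference with the standard library only; `all_close` is a hypothetical helper written for this illustration, and with NumPy arrays `np.allclose` serves the same purpose.

```python
import math


def all_close(a, b, rel_tol=1e-6, abs_tol=1e-6):
    """Element-wise tolerance comparison of two equally shaped 2-D lists."""
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
model_out = [[0.1 + 0.2, 1.0], [2.0, 3.0]]
reference = [[0.3, 1.0], [2.0, 3.0]]

exact_match = model_out == reference
close_match = all_close(model_out, reference)
print(exact_match, close_match)
```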

 %% Cell type:code id: tags:

 ``` python
 ref_output = expected_output(in1_values, in2_values, in3_values)
 print("The output of the ONNX model is: \n{}".format(output[0]))
 print("\nThe output of the reference function is: \n{}".format(ref_output))

 if (output[0] == ref_output).all():
    print("\nThe results are the same!")
 else:
    raise Exception("Something went wrong, the output of the model doesn't match the expected output!")
 ```

 %% Output

    The output of the ONNX model is:
-    [[ 1. 16.  3. 10.]
-     [ 5. 17. 17. 13.]
-     [ 3. 11. 10. 17.]
-     [ 9.  2.  4.  8.]]
+    [[22. 13. 21.  8.]
+     [ 0.  8. 11.  1.]
+     [ 3. 12.  8.  2.]
+     [ 0.  6.  1.  4.]]
    
    The output of the reference function is:
-    [[ 1. 16.  3. 10.]
-     [ 5. 17. 17. 13.]
-     [ 3. 11. 10. 17.]
-     [ 9.  2.  4.  8.]]
+    [[22. 13. 21.  8.]
+     [ 0.  8. 11.  1.]
+     [ 3. 12.  8.  2.]
+     [ 0.  6.  1.  4.]]
    
    The results are the same!

 %% Cell type:markdown id: tags:

 Now that we have verified that the model works as we expected it to, we can continue working with the graph.

 %% Cell type:markdown id: tags:

 ### How to manipulate an ONNX model

 In the model there are two successive adder nodes. An adder node in ONNX can only add two inputs, but there is also the [**sum**](https://github.com/onnx/onnx/blob/master/docs/Operators.md#Sum) node, which can process more than two inputs. So it would be a reasonable change of the graph to combine the two successive adder nodes into one sum node.

 %% Cell type:markdown id: tags:

 In the following we assume that we do not know the structure of the model, so we first try to identify whether there are two consecutive adders in the graph and then convert them into a sum node.

-Here we make use of FINN. FINN provides a thin wrapper around the model which provides several additional helper functions to manipulate the graph. The code can be found [here](https://github.com/Xilinx/finn/blob/master/src/finn/core/modelwrapper.py) and you can find a more detailed description in the notebook [ModelWrapper](2_modelwrapper.ipynb).
+Here we make use of FINN. FINN provides a thin wrapper around the model which provides several additional helper functions to manipulate the graph. The code can be found [here](https://github.com/Xilinx/finn/blob/master/src/finn/core/modelwrapper.py).

 %% Cell type:code id: tags:

 ``` python
 from finn.core.modelwrapper import ModelWrapper
 finn_model = ModelWrapper(onnx_model)
 ```

 %% Cell type:markdown id: tags:

 As explained in the previous section, it is important that the nodes are listed in the correct order. If a new node has to be inserted or an old node has to be replaced, it is important to do that in the appropriate position. The following function serves this purpose. It returns a dictionary, which contains the node name as key and the respective node index as value.

 %% Cell type:code id: tags:

 ``` python
 def get_node_id(model):
    node_index = {}
    node_ind = 0
    for node in model.graph.node:
        node_index[node.name] = node_ind
        node_ind += 1
    return node_index
 ```

 %% Cell type:markdown id: tags:

 The function iterates over the list of nodes and, for every node name, stores a running index (`node_ind`) as the node index in the dictionary.
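
The same name-to-index mapping can also be written with `enumerate`. This is a minimal sketch on a plain list of names, not on a real model. It assumes that node names are unique and non-empty; since node names are optional in ONNX, unnamed or duplicate nodes would overwrite each other in such a dictionary.

```python
# Hypothetical list of node names in graph order.
node_names = ["Add1", "Add2", "Abs", "Add3", "Round"]

# enumerate yields (index, item) pairs, so no manual counter is needed.
node_index = {name: ind for ind, name in enumerate(node_names)}
print(node_index)
```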

 Another helper function is being implemented that searches for adder nodes in the graph and returns the found nodes. This is needed to determine if and which adder nodes are in the given model.

 %% Cell type:code id: tags:

 ``` python
 def identify_adder_nodes(model):
    add_nodes = []
    for node in model.graph.node:
        if node.op_type == "Add":
            add_nodes.append(node)
    return add_nodes
 ```

 %% Cell type:markdown id: tags:

 The function iterates over all nodes of the model and if the operation type is `"Add"` the node will be stored in `add_nodes`. At the end `add_nodes` is returned.

 If we apply this to our model, three nodes should be returned.

 %% Cell type:code id: tags:

 ``` python
 add_nodes = identify_adder_nodes(finn_model)
 for node in add_nodes:
    print("Found adder node: {}".format(node.name))
 ```

 %% Output

    Found adder node: Add1
    Found adder node: Add2
    Found adder node: Add3

 %% Cell type:markdown id: tags:

 Among other helper functions, `ModelWrapper` offers two functions that can help to determine the preceding and succeeding node of a node. However, these functions do not take a node as input; instead they determine the producer or consumer of a tensor. We write two functions that use these helper functions to determine the previous and the next node of a node.

 %% Cell type:code id: tags:

 ``` python
 def find_predecessor(model, node):
    predecessors = []
    for i in range(len(node.input)):
        producer = model.find_producer(node.input[i])
        predecessors.append(producer)
    return predecessors


 def find_successor(model, node):
    successors = []
    for i in range(len(node.output)):
        consumer = model.find_consumer(node.output[i])
        successors.append(consumer)
    return successors
 ```

 %% Cell type:markdown id: tags:

 The first function uses `find_producer` from `ModelWrapper` to create a list of the producers of the inputs of the given node. So the returned list is indirectly filled with the predecessors of the node. The second function works in a similar way, `find_consumer` from `ModelWrapper` is used to find the consumers of the output tensors of the node and so a list with the successors can be created.
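
The idea behind these tensor-based lookups can be sketched without FINN. The following plain-Python example assumes hypothetical `(name, inputs, outputs)` tuples as node descriptions and builds two lookup tables from them; it illustrates the principle and does not reproduce how `ModelWrapper` implements it.

```python
from collections import defaultdict

# Toy description of the example graph (hypothetical tuples, not ONNX objects).
nodes = [
    ("Add1", ["in1", "in2"], ["sum1"]),
    ("Add2", ["sum1", "in3"], ["sum2"]),
    ("Abs", ["sum2"], ["abs1"]),
    ("Add3", ["abs1", "abs1"], ["sum3"]),
    ("Round", ["sum3"], ["out1"]),
]

producer = {}                  # tensor name -> name of the node that writes it
consumers = defaultdict(list)  # tensor name -> names of the nodes that read it
for name, inputs, outputs in nodes:
    for tensor in outputs:
        producer[tensor] = name
    for tensor in inputs:
        if name not in consumers[tensor]:
            consumers[tensor].append(name)


def predecessors(node):
    """Producers of all inputs; None marks a graph input without producer."""
    _, inputs, _ = node
    return [producer.get(tensor) for tensor in inputs]


def successors(node):
    """Consumers of all outputs; empty for a node that feeds a graph output."""
    _, _, outputs = node
    return [c for tensor in outputs for c in consumers.get(tensor, [])]


add2_predecessors = predecessors(nodes[1])
abs_successors = successors(nodes[2])
print(add2_predecessors, abs_successors)
```

Note that a tensor can have no producer (graph input), no consumer (graph output) or several consumers, so code built on such lookups should handle all three cases.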

 %% Cell type:code id: tags:

 ``` python
 def adder_pair(model, node):
    adder_pairs = []
    node_pair = []
    successor_list = find_successor(model, node)

    for successor in successor_list:
        # find_consumer may return None, e.g. for a tensor without a consumer
        if successor is not None and successor.op_type == "Add":
            node_pair.append(node)
            node_pair.append(successor)
            adder_pairs.append((node_pair))
            node_pair = []
    return adder_pairs

 ```

 %% Cell type:markdown id: tags:

 The function takes the model and a node as input. Two empty lists are created: `adder_pairs` collects the adder node pairs that are returned as the result, and `node_pair` is a temporary list for the pair currently being assembled. The function `find_successor` is then used to get all successors of the node. If one of the successors is an adder node, the node is stored in `node_pair` together with that successor, and `node_pair` is appended to `adder_pairs`. The temporary list is then reset so that it can hold the next pair. Since an adder node can in principle have more than one succeeding adder node, the result is a list of lists, which pairs the node with each of its succeeding adder nodes.

 So now we can find out which adder node has an adder node as successor. Since the model is known, one adder pair (Add1+Add2) should be found when applying the function to the previously determined adder node list (`add_nodes`).
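
The pair search itself can be illustrated on a toy graph. This is a minimal sketch assuming hypothetical `(name, op_type, inputs, outputs)` tuples; `find_adder_pairs` is an illustrative helper and not a FINN function.

```python
# Toy description of the example graph (hypothetical tuples, not ONNX objects).
nodes = [
    ("Add1", "Add", ["in1", "in2"], ["sum1"]),
    ("Add2", "Add", ["sum1", "in3"], ["sum2"]),
    ("Abs", "Abs", ["sum2"], ["abs1"]),
    ("Add3", "Add", ["abs1", "abs1"], ["sum3"]),
    ("Round", "Round", ["sum3"], ["out1"]),
]


def find_adder_pairs(nodes):
    """Return (first, second) name pairs where an Add feeds directly into an Add."""
    pairs = []
    for name, op_type, _, outputs in nodes:
        if op_type != "Add":
            continue
        for other_name, other_op, other_inputs, _ in nodes:
            if other_op == "Add" and any(t in other_inputs for t in outputs):
                pairs.append((name, other_name))
    return pairs


adder_pairs = find_adder_pairs(nodes)
print(adder_pairs)
```

A pair is only safe to fuse if the intermediate tensor (`sum1` here) has no other consumer and is not a graph output. The sketch does not check this condition.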

 %% Cell type:code id: tags:

 ``` python
 for node in add_nodes:
    add_pairs = adder_pair(finn_model, node)
    # Remember the last pair found; it is the one replaced in the next steps.
    for substitute_pair in add_pairs:
        print("Found following pair that could be replaced by a sum node:")
        for pair_node in substitute_pair:
            print(pair_node.name)
 ```

 %% Output

    Found following pair that could be replaced by a sum node:
    Add1
    Add2

 %% Cell type:markdown id: tags:

 Now that the pair to be replaced has been identified (`substitute_pair`), a sum node can be instantiated and inserted into the graph at the correct position.

 First of all, the inputs must be determined. For this, the inputs of both adder nodes are used, except for the input that corresponds to the output of the other adder node.

 %% Cell type:code id: tags:

 ``` python
 input_list = []
 for i in range(len(substitute_pair)):
    if i == 0:
        for j in range(len(substitute_pair[i].input)):
            if substitute_pair[i].input[j] != substitute_pair[i+1].output[0]:
                input_list.append(substitute_pair[i].input[j])
    else:
        for j in range(len(substitute_pair[i].input)):
            if substitute_pair[i].input[j] != substitute_pair[i-1].output[0]:
                input_list.append(substitute_pair[i].input[j])
 print("The new node gets the following inputs: \n{}".format(input_list))
 ```

 %% Output

    The new node gets the following inputs:
    ['in1', 'in2', 'in3']

 %% Cell type:markdown id: tags:

 The output of the sum node matches the output of the second adder node and can therefore be taken over directly.
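
The selection of inputs and output for the fused node can be summarized compactly. The sketch below uses hypothetical dictionaries in place of the two ONNX adder nodes and assumes that the first node feeds the second one.

```python
# Hypothetical stand-ins for the two adder nodes of the pair.
first = {"name": "Add1", "inputs": ["in1", "in2"], "outputs": ["sum1"]}
second = {"name": "Add2", "inputs": ["sum1", "in3"], "outputs": ["sum2"]}

# The tensor that connects the pair is internal and disappears after fusion.
internal = set(first["outputs"])

fused_inputs = first["inputs"] + [t for t in second["inputs"] if t not in internal]
fused_output = second["outputs"][0]
print(fused_inputs, fused_output)
```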

 %% Cell type:code id: tags:

 ``` python
 sum_output = substitute_pair[1].output[0]
 ```

 %% Cell type:markdown id: tags:

 The sum node can be created with this information.

 %% Cell type:code id: tags:

 ``` python
 Sum_node = onnx.helper.make_node(
    'Sum',
    inputs=input_list,
    outputs=[sum_output],
    name="Sum"
 )
 ```

 %% Cell type:markdown id: tags:

 The node can now be inserted into the graph and the old nodes are removed.

 %% Cell type:code id: tags:

 ``` python
 node_ids = get_node_id(finn_model)
 node_ind = node_ids[substitute_pair[0].name]
 graph.node.insert(node_ind, Sum_node)

 for node in substitute_pair:
    graph.node.remove(node)
 ```

 %% Cell type:markdown id: tags:

 To insert the node in the right place, the index of the first node of `substitute_pair` is used as the node index for the sum node, which is embedded into the graph using `.insert`. Then the two elements in `substitute_pair` are deleted using `.remove`. `.insert` and `.remove` are list-style methods of the graph's `node` field, which ONNX exposes as a protobuf repeated field.
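
The insert-then-remove pattern behaves like the corresponding operations on a Python list. The following sketch uses a plain list of hypothetical node names to show how the positions shift; it is an analogy and does not touch a real graph.

```python
node_list = ["Add1", "Add2", "Abs", "Add3", "Round"]
pair = ["Add1", "Add2"]

insert_at = node_list.index(pair[0])
node_list.insert(insert_at, "Sum")  # existing entries shift one position right
for name in pair:
    node_list.remove(name)          # removes the first entry equal to `name`
print(node_list)
```

Using the position of the first adder works in this example because the third input (`in3`) is a graph input. If the second adder's other input were produced by a node located between the two adders, inserting the sum node at the position of the second adder would be the safer choice.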

 %% Cell type:markdown id: tags:

 The new graph is saved as ONNX model and can be visualized with Netron.

 %% Cell type:code id: tags:

 ``` python
 onnx_model1 = onnx.helper.make_model(graph, producer_name="simple-model1")
-onnx.save(onnx_model1, 'simple_model1.onnx')
+onnx.save(onnx_model1, '/tmp/simple_model1.onnx')
 ```

 %% Cell type:code id: tags:

 ``` python
-showInNetron('simple_model1.onnx')
+showInNetron('/tmp/simple_model1.onnx')
 ```

 %% Output

    
    Stopping http://0.0.0.0:8081
-    Serving 'simple_model1.onnx' at http://0.0.0.0:8081
+    Serving '/tmp/simple_model1.onnx' at http://0.0.0.0:8081

-    <IPython.lib.display.IFrame at 0x7fb93018f9e8>
+    <IPython.lib.display.IFrame at 0x7fcdfc130cc0>

 %% Cell type:markdown id: tags:

 Through the visualization it can already be seen that the insertion was successful, but it is still to be checked whether the result remains the same. Therefore the result of the reference function written in the previous section is used and the new model with the input values is simulated. At this point onnxruntime can be used again. The simulation is analogous to the one of the first model in the previous section.

 %% Cell type:code id: tags:

 ``` python
 sess = rt.InferenceSession(onnx_model1.SerializeToString())
 output = sess.run(None, input_dict)
 ```

 %% Cell type:code id: tags:

 ``` python
 print("The output of the manipulated ONNX model is: \n{}".format(output[0]))
 print("\nThe output of the reference function is: \n{}".format(ref_output))

 if (output[0] == ref_output).all():
    print("\nThe results are the same!")
 else:
    raise Exception("Something went wrong, the output of the model doesn't match the expected output!")
 ```

 %% Output

    The output of the manipulated ONNX model is:
-    [[ 1. 16.  3. 10.]
-     [ 5. 17. 17. 13.]
-     [ 3. 11. 10. 17.]
-     [ 9.  2.  4.  8.]]
+    [[22. 13. 21.  8.]
+     [ 0.  8. 11.  1.]
+     [ 3. 12.  8.  2.]
+     [ 0.  6.  1.  4.]]
    
    The output of the reference function is:
-    [[ 1. 16.  3. 10.]
-     [ 5. 17. 17. 13.]
-     [ 3. 11. 10. 17.]
-     [ 9.  2.  4.  8.]]
+    [[22. 13. 21.  8.]
+     [ 0.  8. 11.  1.]
+     [ 3. 12.  8.  2.]
+     [ 0.  6.  1.  4.]]
    
    The results are the same!

--- a/notebooks/basics/1_brevitas_network_import.ipynb
+++ b/notebooks/basics/1_brevitas_network_import.ipynb
@@ -12,7 +12,7 @@
    "2. Call Brevitas FINN-ONNX export and visualize with Netron\n",
    "3. Import into FINN and call cleanup transformations\n",
    "\n",
-    "We'll use the following showSrc function to print the source code for function calls in the Jupyter notebook:"
+    "We'll use the following utility functions to print the source code for function calls (`showSrc()`) and to visualize a network using netron (`showInNetron()`) in the Jupyter notebook:"
   ]
  },
  {
@@ -442,7 +442,7 @@
       "        "
      ],
      "text/plain": [
-       "<IPython.lib.display.IFrame at 0x7f86cdb6e5f8>"
+       "<IPython.lib.display.IFrame at 0x7f3d330b6ac8>"
      ]
     },
     "execution_count": 9,
@@ -624,7 +624,7 @@
       "        "
      ],
      "text/plain": [
-       "<IPython.lib.display.IFrame at 0x7f86cdb6ec18>"
+       "<IPython.lib.display.IFrame at 0x7f3d3380aef0>"
      ]
     },
     "execution_count": 15,

 %% Cell type:markdown id: tags:

 # Importing Brevitas networks into FINN

 In this notebook we'll go through an example of how to import a Brevitas-trained QNN into FINN. The steps will be as follows:

 1. Load up the trained PyTorch model
 2. Call Brevitas FINN-ONNX export and visualize with Netron
 3. Import into FINN and call cleanup transformations

-We'll use the following showSrc function to print the source code for function calls in the Jupyter notebook:
+We'll use the following utility functions to print the source code for function calls (`showSrc()`) and to visualize a network using netron (`showInNetron()`) in the Jupyter notebook:

 %% Cell type:code id: tags:

 ``` python
 import onnx
 from finn.util.visualization import showSrc, showInNetron
 ```

 %% Cell type:markdown id: tags:

 ## 1. Load up the trained PyTorch model

 The FINN Docker image comes with several [example Brevitas networks](https://github.com/maltanar/brevitas_cnv_lfc), and we'll use the LFC-w1a1 model as the example network here. This is a binarized fully connected network trained on the MNIST dataset. Let's start by looking at what the PyTorch network definition looks like:

 %% Cell type:code id: tags:

 ``` python
 from models.LFC import LFC
 showSrc(LFC)
 ```

 %% Output

    class LFC(Module):
    
        def __init__(self, num_classes=10, weight_bit_width=None, act_bit_width=None,
                     in_bit_width=None, in_ch=1, in_features=(28, 28), device="cpu"):
            super(LFC, self).__init__()
            self.device = device
    
            weight_quant_type = get_quant_type(weight_bit_width)
            act_quant_type = get_quant_type(act_bit_width)
            in_quant_type = get_quant_type(in_bit_width)
            stats_op = get_stats_op(weight_quant_type)
    
            self.features = ModuleList()
            self.features.append(get_act_quant(in_bit_width, in_quant_type))
            self.features.append(Dropout(p=IN_DROPOUT))
            in_features = reduce(mul, in_features)
            for out_features in FC_OUT_FEATURES:
                self.features.append(get_quant_linear(in_features=in_features,
                                                      out_features=out_features,
                                                      per_out_ch_scaling=INTERMEDIATE_FC_PER_OUT_CH_SCALING,
                                                      bit_width=weight_bit_width,
                                                      quant_type=weight_quant_type,
                                                      stats_op=stats_op))
                in_features = out_features
                self.features.append(BatchNorm1d(num_features=in_features))
                self.features.append(get_act_quant(act_bit_width, act_quant_type))
                self.features.append(Dropout(p=HIDDEN_DROPOUT))
            self.features.append(get_quant_linear(in_features=in_features,
                                       out_features=num_classes,
                                       per_out_ch_scaling=LAST_FC_PER_OUT_CH_SCALING,
                                       bit_width=weight_bit_width,
                                       quant_type=weight_quant_type,
                                       stats_op=stats_op))
            self.features.append(BatchNorm1d(num_features=num_classes))
    
            for m in self.modules():
              if isinstance(m, QuantLinear):
                torch.nn.init.uniform_(m.weight.data, -1, 1)
    
        def clip_weights(self, min_val, max_val):
            for mod in self.features:
                if isinstance(mod, QuantLinear):
                    mod.weight.data.clamp_(min_val, max_val)
    
        def forward(self, x):
            x = x.view(x.shape[0], -1)
            x = 2.0 * x - torch.tensor([1.0]).to(self.device)
            for mod in self.features:
                x = mod(x)
            return x
    

 %% Cell type:markdown id: tags:

 We can see that the network topology is constructed using a few helper functions that generate the quantized linear layers and quantized activations. The bitwidth of the layers is actually parametrized in the constructor, so let's instantiate a 1-bit weights and activations version of this network. We also have pretrained weights for this network, which we will load into the model.

 %% Cell type:code id: tags:

 ``` python
 import torch

 trained_lfc_w1a1_checkpoint = "/workspace/brevitas_cnv_lfc/pretrained_models/LFC_1W1A/checkpoints/best.tar"
 lfc = LFC(weight_bit_width=1, act_bit_width=1, in_bit_width=1).eval()
 checkpoint = torch.load(trained_lfc_w1a1_checkpoint, map_location="cpu")
 lfc.load_state_dict(checkpoint["state_dict"])
 lfc
 ```

 %% Output

    LFC(
      (features): ModuleList(
        (0): QuantHardTanh(
          (act_quant_proxy): ActivationQuantProxy(
            (fused_activation_quant_proxy): FusedActivationQuantProxy(
              (activation_impl): Identity()
              (tensor_quant): ClampedBinaryQuant(
                (scaling_impl): StandaloneScaling(
                  (restrict_value): RestrictValue(
                    (forward_impl): Sequential(
                      (0): PowerOfTwo()
                      (1): ClampMin()
                    )
                  )
                )
              )
            )
          )
        )
        (1): Dropout(p=0.2)
        (2): QuantLinear(
          in_features=784, out_features=1024, bias=False
          (weight_reg): WeightReg()
          (weight_quant): WeightQuantProxy(
            (tensor_quant): BinaryQuant(
              (scaling_impl): StandaloneScaling(
                (restrict_value): RestrictValue(
                  (forward_impl): Sequential(
                    (0): PowerOfTwo()
                    (1): Identity()
                  )
                )
              )
            )
          )
          (bias_quant): BiasQuantProxy()
        )
        (3): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4): QuantHardTanh(
          (act_quant_proxy): ActivationQuantProxy(
            (fused_activation_quant_proxy): FusedActivationQuantProxy(
              (activation_impl): Identity()
              (tensor_quant): ClampedBinaryQuant(
                (scaling_impl): StandaloneScaling(
                  (restrict_value): RestrictValue(
                    (forward_impl): Sequential(
                      (0): PowerOfTwo()
                      (1): ClampMin()
                    )
                  )
                )
              )
            )
          )
        )
        (5): Dropout(p=0.2)
        (6): QuantLinear(
          in_features=1024, out_features=1024, bias=False
          (weight_reg): WeightReg()
          (weight_quant): WeightQuantProxy(
            (tensor_quant): BinaryQuant(
              (scaling_impl): StandaloneScaling(
                (restrict_value): RestrictValue(
                  (forward_impl): Sequential(
                    (0): PowerOfTwo()
                    (1): Identity()
                  )
                )
              )
            )
          )
          (bias_quant): BiasQuantProxy()
        )
        (7): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (8): QuantHardTanh(
          (act_quant_proxy): ActivationQuantProxy(
            (fused_activation_quant_proxy): FusedActivationQuantProxy(
              (activation_impl): Identity()
              (tensor_quant): ClampedBinaryQuant(
                (scaling_impl): StandaloneScaling(
                  (restrict_value): RestrictValue(
                    (forward_impl): Sequential(
                      (0): PowerOfTwo()
                      (1): ClampMin()
                    )
                  )
                )
              )
            )
          )
        )
        (9): Dropout(p=0.2)
        (10): QuantLinear(
          in_features=1024, out_features=1024, bias=False
          (weight_reg): WeightReg()
          (weight_quant): WeightQuantProxy(
            (tensor_quant): BinaryQuant(
              (scaling_impl): StandaloneScaling(
                (restrict_value): RestrictValue(
                  (forward_impl): Sequential(
                    (0): PowerOfTwo()
                    (1): Identity()
                  )
                )
              )
            )
          )
          (bias_quant): BiasQuantProxy()
        )
        (11): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (12): QuantHardTanh(
          (act_quant_proxy): ActivationQuantProxy(
            (fused_activation_quant_proxy): FusedActivationQuantProxy(
              (activation_impl): Identity()
              (tensor_quant): ClampedBinaryQuant(
                (scaling_impl): StandaloneScaling(
                  (restrict_value): RestrictValue(
                    (forward_impl): Sequential(
                      (0): PowerOfTwo()
                      (1): ClampMin()
                    )
                  )
                )
              )
            )
          )
        )
        (13): Dropout(p=0.2)
        (14): QuantLinear(
          in_features=1024, out_features=10, bias=False
          (weight_reg): WeightReg()
          (weight_quant): WeightQuantProxy(
            (tensor_quant): BinaryQuant(
              (scaling_impl): StandaloneScaling(
                (restrict_value): RestrictValue(
                  (forward_impl): Sequential(
                    (0): PowerOfTwo()
                    (1): Identity()
                  )
                )
              )
            )
          )
          (bias_quant): BiasQuantProxy()
        )
        (15): BatchNorm1d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )

 %% Cell type:markdown id: tags:

 We have now instantiated our trained PyTorch network. Let's try to run an example MNIST image through the network using PyTorch.

 %% Cell type:code id: tags:

 ``` python
 import matplotlib.pyplot as plt
 from pkgutil import get_data
 import onnx
 import onnx.numpy_helper as nph
 raw_i = get_data("finn", "data/onnx/mnist-conv/test_data_set_0/input_0.pb")
 input_tensor = onnx.load_tensor_from_string(raw_i)
 input_tensor_npy = nph.to_array(input_tensor)
 input_tensor_pyt = torch.from_numpy(input_tensor_npy).float()
 imgplot = plt.imshow(input_tensor_npy.reshape(28,28), cmap='gray')
 ```

 %% Output



 %% Cell type:code id: tags:

 ``` python
 from torch.nn.functional import softmax
 # do forward pass in PyTorch/Brevitas
 produced = lfc.forward(input_tensor_pyt).detach()
 probabilities = softmax(produced, dim=-1).flatten()
 probabilities
 ```

 %% Output

    tensor([0.0602, 0.0147, 0.5844, 0.0445, 0.0270, 0.0185, 0.0595, 0.0082, 0.1689,
            0.0141])

 %% Cell type:code id: tags:

 ``` python
 import numpy as np
 objects = [str(x) for x in range(10)]
 y_pos = np.arange(len(objects))
 plt.bar(y_pos, probabilities, align='center', alpha=0.5)
 plt.xticks(y_pos, objects)
 plt.ylabel('Predicted Probability')
 plt.title('LFC-w1a1 Predictions for Image')
 plt.show()
 ```

 %% Output



 %% Cell type:markdown id: tags:

 ## 2. Call Brevitas FINN-ONNX export and visualize with Netron

 Brevitas comes with built-in FINN-ONNX export functionality. This is similar to the regular ONNX export capabilities of PyTorch, with a few differences:

 1. The weight quantization logic is not exported as part of the graph; rather, the quantized weights themselves are exported.
 2. Special quantization annotations are used to preserve the low-bit quantization information. ONNX (at the time of writing) supports 8-bit quantization as the minimum bitwidth, whereas FINN-ONNX quantization annotations can go down to binary/bipolar quantization.
 3. Low-bit quantized activation functions are exported as MultiThreshold operators.
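
To make the third point more concrete, here is a minimal pure-Python sketch of what a thresholding activation computes. The function name `multithreshold_sketch` and its `out_scale` / `out_bias` parameters are illustrative assumptions modeled on our understanding of the MultiThreshold operator, not FINN's actual implementation: each value is compared against a list of thresholds, the number of thresholds it reaches is counted, and that count is scaled and shifted.

```python
def multithreshold_sketch(values, thresholds, out_scale=1.0, out_bias=0.0):
    """Hypothetical helper: count reached thresholds per value, then scale and shift."""
    outputs = []
    for v in values:
        # Number of thresholds t with v >= t (assumed comparison rule)
        steps = sum(1 for t in thresholds if v >= t)
        outputs.append(out_scale * steps + out_bias)
    return outputs


# One threshold at 0 with scale 2 and bias -1 yields values in {-1, +1}
bipolar = multithreshold_sketch(
    [-0.7, -0.1, 0.0, 0.3], thresholds=[0.0], out_scale=2.0, out_bias=-1.0
)

# Three thresholds yield four levels (0..3), i.e. a 2-bit unsigned activation
two_bit = multithreshold_sketch([-0.7, -0.1, 0.0, 0.3, 0.9], thresholds=[-0.5, 0.0, 0.5])

print(bipolar)
print(two_bit)
```

With a single threshold the sketch behaves like a sign function, which is one way a bipolar activation can be expressed; more thresholds give more output levels without changing the mechanism.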

 It's actually quite straightforward to export ONNX from our Brevitas model as follows:

 %% Cell type:code id: tags:

 ``` python
 import brevitas.onnx as bo
 export_onnx_path = "/tmp/LFCW1A1.onnx"
 input_shape = (1, 1, 28, 28)
 bo.export_finn_onnx(lfc, input_shape, export_onnx_path)
 ```

 %% Output

    /workspace/brevitas_cnv_lfc/training_scripts/models/LFC.py:85: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
      x = 2.0 * x - torch.tensor([1.0]).to(self.device)

 %% Cell type:markdown id: tags:

 Let's examine what the exported ONNX model looks like. For this, we will use the Netron visualizer:

 %% Cell type:code id: tags:

 ``` python
 showInNetron('/tmp/LFCW1A1.onnx')
 ```

 %% Output

    Serving '/tmp/LFCW1A1.onnx' at http://0.0.0.0:8081

-    <IPython.lib.display.IFrame at 0x7f86cdb6e5f8>
+    <IPython.lib.display.IFrame at 0x7f3d330b6ac8>

 %% Cell type:markdown id: tags:

 When running this notebook in the FINN Docker container, you should be able to see an interactive visualization of the exported network above, and click on individual nodes to inspect their parameters. If you look at any of the MatMul nodes, you should be able to see that the weights are all {-1, +1} values, and the activations are Sign functions.

 %% Cell type:markdown id: tags:

 ## 3. Import into FINN and call cleanup transformations

 We will now import this ONNX model into FINN using the ModelWrapper, and examine some of the graph attributes from Python.

 %% Cell type:code id: tags:

 ``` python
 from finn.core.modelwrapper import ModelWrapper
 model = ModelWrapper(export_onnx_path)
 model.graph.node[9]
 ```

 %% Output

    input: "37"
    input: "38"
    output: "40"
    op_type: "MatMul"

 %% Cell type:markdown id: tags:

 The ModelWrapper exposes a range of other useful functions as well. For instance, by convention the second input of the MatMul node will be a pre-initialized weight tensor, which we can view using the following:

 %% Cell type:code id: tags:

 ``` python
 model.get_initializer(model.graph.node[9].input[1])
 ```

 %% Output

    array([[-1., -1., -1., ..., -1., -1.,  1.],
           [-1.,  1., -1., ...,  1., -1., -1.],
           [ 1., -1.,  1., ..., -1., -1., -1.],
           ...,
           [ 1.,  1., -1., ...,  1.,  1.,  1.],
           [-1., -1.,  1., ...,  1.,  1., -1.],
           [ 1.,  1., -1., ...,  1., -1., -1.]], dtype=float32)

 %% Cell type:markdown id: tags:

 We can also examine the quantization annotations and shapes of various tensors using the convenience functions provided by ModelWrapper.

 %% Cell type:code id: tags:

 ``` python
 model.get_tensor_datatype(model.graph.node[9].input[1])
 ```

 %% Output

    <DataType.BIPOLAR: 8>

 %% Cell type:code id: tags:

 ``` python
 model.get_tensor_shape(model.graph.node[9].input[1])
 ```

 %% Output

    [784, 1024]

 %% Cell type:markdown id: tags:

 If we want to operate further on this model in FINN, it is a good idea to execute certain "cleanup" transformations on this graph. Here, we will run shape inference and constant folding on this graph, and visualize the resulting graph in Netron again.

 %% Cell type:code id: tags:

 ``` python
 from finn.transformation.fold_constants import FoldConstants
 from finn.transformation.infer_shapes import InferShapes
 model = model.transform(InferShapes())
 model = model.transform(FoldConstants())
 export_onnx_path_transformed = "/tmp/LFCW1A1-clean.onnx"
 model.save(export_onnx_path_transformed)
 ```

 %% Cell type:code id: tags:

 ``` python
 showInNetron('/tmp/LFCW1A1-clean.onnx')
 ```

 %% Output

    
    Stopping http://0.0.0.0:8081
    Serving '/tmp/LFCW1A1-clean.onnx' at http://0.0.0.0:8081

-    <IPython.lib.display.IFrame at 0x7f86cdb6ec18>
+    <IPython.lib.display.IFrame at 0x7f3d3380aef0>

 %% Cell type:markdown id: tags:

 We can see that the resulting graph has become smaller and simpler. Specifically, the input reshaping is now a single Reshape node instead of the Shape -> Gather -> Unsqueeze -> Concat -> Reshape sequence. We can now use the internal ONNX execution capabilities of FINN to ensure that we still get the same output from this model as we did with PyTorch.
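
To see why the Shape -> Gather -> Unsqueeze -> Concat chain can be folded away, the following pure-Python sketch walks through it one operator at a time. It assumes the chain originates from `x.view(x.shape[0], -1)` in the `forward` method and that the input shape is the static `(1, 1, 28, 28)` used for export; the helper `resolve_reshape` is hypothetical and only mimics how a `-1` entry in a reshape target is resolved.

```python
from functools import reduce
from operator import mul

input_shape = (1, 1, 28, 28)  # static shape passed to the export call

# Each step only depends on the (known) input shape, never on the input data
shape_out = list(input_shape)       # Shape: [1, 1, 28, 28]
gather_out = shape_out[0]           # Gather at index 0: the batch size
unsqueeze_out = [gather_out]        # Unsqueeze: scalar -> 1-element list
concat_out = unsqueeze_out + [-1]   # Concat with the constant -1

# After folding, this constant is all that remains as the Reshape target
folded_target = concat_out


def resolve_reshape(in_shape, target):
    """Hypothetical helper: replace a single -1 entry by the inferred dimension."""
    total = reduce(mul, in_shape, 1)
    known = reduce(mul, (d for d in target if d != -1), 1)
    return tuple(total // known if d == -1 else d for d in target)


reshaped = resolve_reshape(input_shape, folded_target)
print(folded_target, reshaped)
```

Because every intermediate value is fixed once the input shape is known, shape inference followed by constant folding can replace the whole chain with a single constant feeding the Reshape node.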

 %% Cell type:code id: tags:

 ``` python
 import finn.core.onnx_exec as oxe
 input_dict = {"0": nph.to_array(input_tensor)}
 output_dict = oxe.execute_onnx(model, input_dict)
 produced_finn = output_dict[list(output_dict.keys())[0]]

 produced_finn
 ```

 %% Output

    array([[-1.5095654 , -2.915617  ,  0.764004  , -1.8118242 , -2.308991  ,
            -2.6900144 , -1.520713  , -3.4965858 , -0.47711682, -2.9628415 ]],
          dtype=float32)

 %% Cell type:code id: tags:

 ``` python
 np.isclose(produced, produced_finn).all()
 ```

 %% Output

    True

 %% Cell type:markdown id: tags:

 We have successfully verified that the transformed and cleaned-up FINN graph still produces the same output, and can now use this model for further processing in FINN.
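
As an additional sanity check that needs neither PyTorch nor FINN, the sketch below applies a softmax to the FINN output values printed above and compares the result with the PyTorch probabilities printed earlier. The numbers are copied from the notebook outputs; the variable names and the hand-written `softmax` helper are illustrative assumptions, and since the PyTorch values are only shown to four decimals, the comparison is limited to that precision.

```python
import math

# FINN output, copied from the execute_onnx result shown above
finn_logits = [-1.5095654, -2.915617, 0.764004, -1.8118242, -2.308991,
               -2.6900144, -1.520713, -3.4965858, -0.47711682, -2.9628415]

# PyTorch softmax probabilities as printed earlier (rounded to 4 decimals)
pytorch_probs = [0.0602, 0.0147, 0.5844, 0.0445, 0.0270,
                 0.0185, 0.0595, 0.0082, 0.1689, 0.0141]


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


finn_probs = softmax(finn_logits)

# Index of the most likely class and largest deviation from the PyTorch values
predicted_digit = max(range(len(finn_probs)), key=finn_probs.__getitem__)
max_abs_diff = max(abs(a - b) for a, b in zip(finn_probs, pytorch_probs))

print(predicted_digit, max_abs_diff)
```

Both paths agree on the most likely class, and the remaining differences are within the rounding of the printed probabilities.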