diff --git a/docs/finn/example_networks.rst b/docs/finn/example_networks.rst
index 0f919f6df8d3bbaea8e5dc266095322567aa1401..47c9a976cb14a3e175dff6800ad8a5da60b44ecb 100644
--- a/docs/finn/example_networks.rst
+++ b/docs/finn/example_networks.rst
@@ -28,7 +28,7 @@ version, this is indicated by an x mark in the table.
 +-----------------------+------------+----------+----------+----------+----------+----------+----------+
 | Hardware test         | x          | x        | x        |          | x        |          |          |
 +-----------------------+------------+----------+----------+----------+----------+----------+----------+
-| npysim                | x          | x        | x        | x        | x        |          |          |
+| cppsim                | x          | x        | x        | x        | x        |          |          |
 +-----------------------+------------+----------+----------+----------+----------+----------+----------+
 | rtlsim node-by-node   | x          | x        | x        | x        | x        |          |          |
 +-----------------------+------------+----------+----------+----------+----------+----------+----------+
diff --git a/docs/finn/nw_prep.rst b/docs/finn/nw_prep.rst
index 96dfcc184261b5e24a4e7c51eef74fff55d0d841..2ccbb8d0ff65c4d8b1476e38002cd52dc0e4fdf2 100644
--- a/docs/finn/nw_prep.rst
+++ b/docs/finn/nw_prep.rst
@@ -12,7 +12,7 @@ Network Preparation
 
 The main principle of FINN are analysis and transformation passes. If you like to have more information about these please have a look at section :ref:`analysis_pass` and :ref:`transformation_pass` or at chapter :ref:`tutorials` about the provided Jupyter notebooks.
 
-This page is about the network preparation, the flow step that comes after the :ref:`brevitas_export`. Its main idea is to optimize the network and convert the nodes to custom nodes that correspond to `finn-hlslib <https://github.com/Xilinx/finn-hlslib>`_ functions. In this way we get a network that we can bring to hardware with the help of Vivado. For that we have to apply several transformations on the ONNX model, which this flow step receives wrapped in the :ref:`modelwrapper`. 
+This page is about the network preparation, the flow step that comes after the :ref:`brevitas_export`. Its main idea is to optimize the network and convert the nodes to custom nodes that correspond to `finn-hlslib <https://github.com/Xilinx/finn-hlslib>`_ functions. In this way we get a network that we can bring to hardware with the help of Vivado. For that we have to apply several transformations on the ONNX model, which this flow step receives wrapped in the :ref:`modelwrapper`.
 
 Various transformations are involved in the network preparation. The following is a short overview of these.
 
@@ -42,11 +42,11 @@ Pairs of binary XNORPopcountMatMul layers are converted to StreamingFCLayers and
 Dataflow Partitioning
 =====================
 
-In the next step the graph is split and the part consisting of HLS layers is further processed in the FINN flow. The parent graph containing the non-HLS layers remains. The PE and SIMD are set to 1 by default, so the result is a network of only HLS layers with maximum folding. The model can be verified using the *npysim* simulation. It is a simulation using C++ and is described in more detail in chapter :ref:`verification`.
+In the next step, the graph is split and the part consisting of HLS layers is further processed in the FINN flow. The parent graph containing the non-HLS layers remains. PE and SIMD are set to 1 by default, so the result is a network of only HLS layers with maximum folding. The model can be verified using *cppsim*, a C++ simulation that is described in more detail in chapter :ref:`verification`.
 
 Folding
 =======
 
-To adjust the folding, the values for PE and SIMD can be increased to achieve also an increase in the performance. The result can be verified using the same simulation flow as for the network with maximum folding (*npysim* using C++), for details please have a look at chapter :ref:`verification`.
+To adjust the folding, the values for PE and SIMD can be increased, which also increases performance. The result can be verified using the same simulation flow as for the network with maximum folding (*cppsim* using C++); for details, see chapter :ref:`verification`.
 
 The result is a network of HLS layers with desired folding and it can be passed to :ref:`vivado_synth`.
diff --git a/docs/finn/source_code/finn.transformation.fpgadataflow.rst b/docs/finn/source_code/finn.transformation.fpgadataflow.rst
index 968587535408995cf5f6fcf8905cafe2cf897cd5..4f0fb3e0bc2af41f7237adc8dbde5ee251f4d94b 100644
--- a/docs/finn/source_code/finn.transformation.fpgadataflow.rst
+++ b/docs/finn/source_code/finn.transformation.fpgadataflow.rst
@@ -13,7 +13,7 @@ finn.transformation.fpgadataflow.cleanup
    :undoc-members:
    :show-inheritance:
 
-finn.transformation.fpgadataflow.codegen\_ipgen
+finn.transformation.fpgadataflow.prepare\_ip
 -----------------------------------------------
 
 .. automodule:: finn.transformation.fpgadataflow.prepare_ip
@@ -21,7 +21,7 @@ finn.transformation.fpgadataflow.codegen\_ipgen
    :undoc-members:
    :show-inheritance:
 
-finn.transformation.fpgadataflow.codegen\_ipstitch
+finn.transformation.fpgadataflow.create\_stitched\_ip
---------------------------------------------------
+-----------------------------------------------------
 
 .. automodule:: finn.transformation.fpgadataflow.create_stitched_ip
@@ -29,7 +29,7 @@ finn.transformation.fpgadataflow.codegen\_ipstitch
    :undoc-members:
    :show-inheritance:
 
-finn.transformation.fpgadataflow.codegen\_npysim
+finn.transformation.fpgadataflow.prepare\_cppsim
 ------------------------------------------------
 
 .. automodule:: finn.transformation.fpgadataflow.prepare_cppsim
@@ -37,7 +37,7 @@ finn.transformation.fpgadataflow.codegen\_npysim
    :undoc-members:
    :show-inheritance:
 
-finn.transformation.fpgadataflow.compile_cppsim
+finn.transformation.fpgadataflow.compile\_cppsim
-----------------------------------------
+------------------------------------------------
 
 .. automodule:: finn.transformation.fpgadataflow.compile_cppsim
@@ -61,7 +61,7 @@ finn.transformation.fpgadataflow.create\_dataflow\_partition
    :undoc-members:
    :show-inheritance:
 
-finn.transformation.fpgadataflow.hlssynth\_ipgen
+finn.transformation.fpgadataflow.hlssynth\_ip
 ------------------------------------------------
 
 .. automodule:: finn.transformation.fpgadataflow.hlssynth_ip
diff --git a/notebooks/advanced/1_custom_transformation_pass.ipynb b/notebooks/advanced/1_custom_transformation_pass.ipynb
index 7ff850e7b79bf1a8f2206d5f0fbab4cac5767f10..d072c9a2264c83614ef034050a7973fcd48aeef2 100644
--- a/notebooks/advanced/1_custom_transformation_pass.ipynb
+++ b/notebooks/advanced/1_custom_transformation_pass.ipynb
@@ -378,11 +378,11 @@
      "output_type": "stream",
      "text": [
       "class CompileCppSim(NodeLocalTransformation):\n",
-      "    \"\"\"For every node: compile C++ code in node attribute \"code_gen_dir_npysim\"\n",
+      "    \"\"\"For every node: compile C++ code in node attribute \"code_gen_dir_cppsim\"\n",
       "    and save path to executables in node attribute \"executable_path\".\n",
       "    All nodes in the graph must have the fpgadataflow backend attribute.\n",
       "\n",
-      "    To use these executables, exec_mode must be set to \"npysim\" (using transformation\n",
+      "    To use these executables, exec_mode must be set to \"cppsim\" (using transformation\n",
       "    SetExecMode) and the model has to be executed using execute_onnx() from\n",
       "    finn.core.onnx_exec\n",
       "\n",
@@ -401,9 +401,9 @@
       "                inst = registry.custom_op[op_type](node)\n",
       "                # ensure that code is generated\n",
       "                assert (\n",
-      "                    inst.get_nodeattr(\"code_gen_dir_npysim\") != \"\"\n",
+      "                    inst.get_nodeattr(\"code_gen_dir_cppsim\") != \"\"\n",
       "                ), \"\"\"Node\n",
-      "                attribute \"code_gen_dir_npysim\" is not set. Please run\n",
+      "                attribute \"code_gen_dir_cppsim\" is not set. Please run\n",
       "                Transformation PrepareCppSim first.\"\"\"\n",
       "                # call the compilation function for this node\n",
       "                inst.compile_singlenode_code()\n",
diff --git a/notebooks/end2end_example/tfc_end2end_example.ipynb b/notebooks/end2end_example/tfc_end2end_example.ipynb
index bcb54adf2357f71fe110530baf9ded87637d488f..41ef392b40fe94579966b954e2e0c51b0e32418b 100644
--- a/notebooks/end2end_example/tfc_end2end_example.ipynb
+++ b/notebooks/end2end_example/tfc_end2end_example.ipynb
@@ -747,7 +747,7 @@
        " 'mem_mode': ('s', False, 'const'),\n",
        " 'ram_style': ('s', False, 'auto'),\n",
        " 'backend': ('s', True, 'fpgadataflow'),\n",
-       " 'code_gen_dir_npysim': ('s', False, ''),\n",
+       " 'code_gen_dir_cppsim': ('s', False, ''),\n",
        " 'code_gen_dir_ipgen': ('s', False, ''),\n",
        " 'executable_path': ('s', False, ''),\n",
        " 'ipgen_path': ('s', False, ''),\n",
diff --git a/notebooks/end2end_example/tfc_end2end_verification.ipynb b/notebooks/end2end_example/tfc_end2end_verification.ipynb
index 76950adea663d3c301ba7ae568534ebdd4eda465..ac6f95cfa9488be60791f6e4fce7cd680d3d8736 100644
--- a/notebooks/end2end_example/tfc_end2end_verification.ipynb
+++ b/notebooks/end2end_example/tfc_end2end_verification.ipynb
@@ -200,7 +200,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Simulation (npysim) using C++\n",
+    "## Simulation (cppsim) using C++\n",
     "\n",
     "When dealing with HLS custom op nodes in FINN the simulation using Python is no longer sufficient. After the nodes have been converted to HLS layers, the simulation using C++ can be used. To do this, the input tensor is stored in an .npy file and C++ code is generated that reads the values from the .npy array, streams them to the corresponding finn-hlslib function and writes the result to a new .npy file. This in turn can be read in Python and processed in the FINN flow. For this example the model after the conversion to HLS layers is used."
    ]
@@ -211,7 +211,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "model_for_npysim = ModelWrapper(build_dir+\"/tfc_w1_a1_hls_layers.onnx\")"
+    "model_for_cppsim = ModelWrapper(build_dir+\"/tfc_w1_a1_hls_layers.onnx\")"
    ]
   },
   {
@@ -232,8 +232,8 @@
     "from finn.transformation.fpgadataflow.prepare_cppsim import PrepareCppSim\n",
     "from finn.transformation.fpgadataflow.compile_cppsim import CompileCppSim\n",
     "\n",
-    "model_for_npysim = model_for_npysim.transform(PrepareCppSim())\n",
-    "model_for_npysim = model_for_npysim.transform(CompileCppSim())"
+    "model_for_cppsim = model_for_cppsim.transform(PrepareCppSim())\n",
+    "model_for_cppsim = model_for_cppsim.transform(CompileCppSim())"
    ]
   },
   {
@@ -252,7 +252,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Serving '/workspace/finn/tfc_w1_a1_for_npysim.onnx' at http://0.0.0.0:8081\n"
+      "Serving '/workspace/finn/tfc_w1_a1_for_cppsim.onnx' at http://0.0.0.0:8081\n"
      ]
     },
     {
@@ -278,8 +278,8 @@
     }
    ],
    "source": [
-    "model_for_npysim.save(build_dir+\"/tfc_w1_a1_for_npysim.onnx\")\n",
-    "showInNetron(build_dir+\"/tfc_w1_a1_for_npysim.onnx\")"
+    "model_for_cppsim.save(build_dir+\"/tfc_w1_a1_for_cppsim.onnx\")\n",
+    "showInNetron(build_dir+\"/tfc_w1_a1_for_cppsim.onnx\")"
    ]
   },
   {
@@ -287,7 +287,7 @@
    "metadata": {},
    "source": [
     "The following node attributes have been added:\n",
-    "* `code_gen_dir_npysim` indicates the directory where the files for the simulation using C++ are stored\n",
+    "* `code_gen_dir_cppsim` indicates the directory where the files for the simulation using C++ are stored\n",
     "* `executable_path` specifies the path to the executable\n",
     "\n",
     "We take now a closer look into the files that were generated:"
@@ -309,9 +309,9 @@
    "source": [
     "from finn.custom_op.registry import getCustomOp\n",
     "\n",
-    "fc0 = model_for_npysim.graph.node[2]\n",
+    "fc0 = model_for_cppsim.graph.node[2]\n",
     "fc0w = getCustomOp(fc0)\n",
-    "code_gen_dir = fc0w.get_nodeattr(\"code_gen_dir_npysim\")\n",
+    "code_gen_dir = fc0w.get_nodeattr(\"code_gen_dir_cppsim\")\n",
     "!ls {code_gen_dir}"
    ]
   },
@@ -326,7 +326,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To simulate the model the execution mode(exec_mode) must be set to \"npysim\". This is done using the transformation SetExecMode."
+    "To simulate the model, the execution mode (exec_mode) must be set to \"cppsim\". This is done using the transformation SetExecMode."
    ]
   },
   {
@@ -337,7 +337,7 @@
    "source": [
     "from finn.transformation.fpgadataflow.set_exec_mode import SetExecMode\n",
     "\n",
-    "model_for_npysim = model_for_npysim.transform(SetExecMode(\"npysim\"))"
+    "model_for_cppsim = model_for_cppsim.transform(SetExecMode(\"cppsim\"))"
    ]
   },
   {
@@ -363,10 +363,10 @@
     }
    ],
    "source": [
-    "output_dict = oxe.execute_onnx(model_for_npysim, input_dict)\n",
-    "output_npysim = output_dict[list(output_dict.keys())[0]]\n",
+    "output_dict = oxe.execute_onnx(model_for_cppsim, input_dict)\n",
+    "output_cppsim = output_dict[list(output_dict.keys())[0]]\n",
     "\n",
-    "if np.isclose(output_npysim, output_golden, atol=1e-3).all():\n",
+    "if np.isclose(output_cppsim, output_golden, atol=1e-3).all():\n",
     "    print(\"Results are the same!\")\n",
     "else:\n",
     "    print(\"The results are not the same!\")"
diff --git a/src/finn/custom_op/fpgadataflow/__init__.py b/src/finn/custom_op/fpgadataflow/__init__.py
index 65ad469ca13fd3bead01110c540b27015ab538a9..b3e30a07a96a5590fdb755766c235d2ba99f4caf 100644
--- a/src/finn/custom_op/fpgadataflow/__init__.py
+++ b/src/finn/custom_op/fpgadataflow/__init__.py
@@ -74,7 +74,7 @@ class HLSCustomOp(CustomOp):
     def get_nodeattr_types(self):
         return {
             "backend": ("s", True, "fpgadataflow"),
-            "code_gen_dir_npysim": ("s", False, ""),
+            "code_gen_dir_cppsim": ("s", False, ""),
             "code_gen_dir_ipgen": ("s", False, ""),
             "executable_path": ("s", False, ""),
             "ipgen_path": ("s", False, ""),
@@ -232,14 +232,14 @@ class HLSCustomOp(CustomOp):
         vlnv = "xilinx.com:hls:%s:1.0" % node.name
         self.set_nodeattr("ip_vlnv", vlnv)
 
-    def code_generation_npysim(self, model):
-        """Generates c++ code for simulation (npysim)."""
+    def code_generation_cppsim(self, model):
+        """Generates C++ code for simulation (cppsim)."""
         node = self.onnx_node
-        path = self.get_nodeattr("code_gen_dir_npysim")
+        path = self.get_nodeattr("code_gen_dir_cppsim")
         self.code_gen_dict["$AP_INT_MAX_W$"] = [str(self.get_ap_int_max_w())]
         self.generate_params(model, path)
         self.global_includes()
-        self.defines("npysim")
+        self.defines("cppsim")
         self.read_npy_data()
         self.strm_decl()
         self.pragmas()
@@ -253,7 +253,7 @@ class HLSCustomOp(CustomOp):
             # transform list into long string separated by '\n'
             code_gen_line = "\n".join(self.code_gen_dict[key])
             template = template.replace(key, code_gen_line)
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         f = open(os.path.join(code_gen_dir, "execute_{}.cpp".format(node.op_type)), "w")
         f.write(template)
         f.close()
@@ -262,7 +262,7 @@ class HLSCustomOp(CustomOp):
     def compile_singlenode_code(self):
         """Builds the bash script for compilation using the CppBuilder from
         finn.util.basic and executes the script to produce the executable."""
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         builder = CppBuilder()
         # to enable additional debug features please uncommand the next line
         # builder.append_includes("-DDEBUG")
@@ -284,7 +284,7 @@ class HLSCustomOp(CustomOp):
 
         Count indicates the number of inputs that have to be saved."""
         node = self.onnx_node
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         if code_gen_dir == "":
             raise Exception(
                 """
@@ -306,7 +306,7 @@ Found no codegen dir for this node, did you run the prepare_cppsim transformatio
         the context dictionary."""
         # TODO support multi-output nodes as needed
         node = self.onnx_node
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         output = np.load("{}/output.npy".format(code_gen_dir))
         context[node.output[0]] = output
 
@@ -399,9 +399,9 @@ compilation transformations?
         return outputs
 
     def execute_node(self, context, graph):
-        """Executes single node using npysim or rtlsim."""
+        """Executes single node using cppsim or rtlsim."""
         mode = self.get_nodeattr("exec_mode")
-        if mode == "npysim":
+        if mode == "cppsim":
             # save input(s)
             self.dynamic_input_to_npy(context, 1)
             # execute the precompiled model
@@ -414,7 +414,7 @@ compilation transformations?
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            must be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
@@ -435,14 +435,14 @@ compilation transformations?
     @abstractmethod
     def global_includes(self):
         """Function to set the global includes for c++ code that has to be generated
-        for npysim or rtlsim, is member function of HLSCustomOp class but has to
+        for cppsim or rtlsim. It is a member function of HLSCustomOp but has to
         be filled by every node."""
         pass
 
     @abstractmethod
     def defines(self, var):
         """Function to set the define commands for c++ code that has to be generated
-        for npysim or rtlsim, is member function of HLSCustomOp class but has to
+        for cppsim or rtlsim. It is a member function of HLSCustomOp but has to
         be filled by every node.
 
         var: makes it possible to reuse the function for different c++ code generation.
diff --git a/src/finn/custom_op/fpgadataflow/convolutioninputgenerator.py b/src/finn/custom_op/fpgadataflow/convolutioninputgenerator.py
index 2b469f7b0d6e5ddc3068fa3fd2d6cb487a560d92..e4d106068d4d128c66b2ce5f3d6c925dfe414b90 100644
--- a/src/finn/custom_op/fpgadataflow/convolutioninputgenerator.py
+++ b/src/finn/custom_op/fpgadataflow/convolutioninputgenerator.py
@@ -177,14 +177,14 @@ class ConvolutionInputGenerator(HLSCustomOp):
         folded_oshape = self.get_folded_output_shape()
 
         # TODO ensure codegen dir exists
-        if mode == "npysim":
-            code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        if mode == "cppsim":
+            code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         elif mode == "rtlsim":
             code_gen_dir = self.get_nodeattr("code_gen_dir_ipgen")
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            must be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
@@ -207,14 +207,14 @@ class ConvolutionInputGenerator(HLSCustomOp):
         reshaped_input = inp.copy()
         np.save(os.path.join(code_gen_dir, "input_0.npy"), reshaped_input)
 
-        if mode == "npysim":
+        if mode == "cppsim":
             # execute the precompiled model
             super().exec_precompiled_singlenode_model()
             # load output npy file
             super().npy_to_dynamic_output(context)
             assert (
                 context[node.output[0]].shape == folded_oshape
-            ), "npysim \
+            ), "cppsim \
             did not produce expected ofolded utput shape"
             context[node.output[0]] = context[node.output[0]].reshape(*exp_oshape)
         elif mode == "rtlsim":
@@ -241,7 +241,7 @@ class ConvolutionInputGenerator(HLSCustomOp):
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            must be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
@@ -277,7 +277,7 @@ class ConvolutionInputGenerator(HLSCustomOp):
         ]
 
     def read_npy_data(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_input_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
@@ -313,7 +313,7 @@ class ConvolutionInputGenerator(HLSCustomOp):
         ]
 
     def dataoutstrm(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_output_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
diff --git a/src/finn/custom_op/fpgadataflow/streamingdatawidthconverter_batch.py b/src/finn/custom_op/fpgadataflow/streamingdatawidthconverter_batch.py
index f30871909b1c70f3b5df148f1b6eae22fdbadc25..1ca2c6d29313eb9d978a6ac0454b9226802f55a5 100644
--- a/src/finn/custom_op/fpgadataflow/streamingdatawidthconverter_batch.py
+++ b/src/finn/custom_op/fpgadataflow/streamingdatawidthconverter_batch.py
@@ -226,7 +226,7 @@ class StreamingDataWidthConverter_Batch(HLSCustomOp):
         ]
 
     def read_npy_data(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_input_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
@@ -260,7 +260,7 @@ class StreamingDataWidthConverter_Batch(HLSCustomOp):
         ]
 
     def dataoutstrm(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_output_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
@@ -313,14 +313,14 @@ class StreamingDataWidthConverter_Batch(HLSCustomOp):
         folded_ishape = self.get_folded_input_shape()
 
         # TODO ensure codegen dir exists
-        if mode == "npysim":
-            code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        if mode == "cppsim":
+            code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         elif mode == "rtlsim":
             code_gen_dir = self.get_nodeattr("code_gen_dir_ipgen")
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            must be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
@@ -343,7 +343,7 @@ class StreamingDataWidthConverter_Batch(HLSCustomOp):
         reshaped_input = reshaped_input.copy()
         np.save(os.path.join(code_gen_dir, "input_0.npy"), reshaped_input)
 
-        if mode == "npysim":
+        if mode == "cppsim":
             output = inp
             output = np.asarray([output], dtype=np.float32).reshape(*exp_shape)
             context[node.output[0]] = output
diff --git a/src/finn/custom_op/fpgadataflow/streamingfclayer_batch.py b/src/finn/custom_op/fpgadataflow/streamingfclayer_batch.py
index 46920711e13057178be9fca5fe3a18ce3e14feda..3757e3a5f1f29a1d6c88ccc73ce3f3715611cbc0 100644
--- a/src/finn/custom_op/fpgadataflow/streamingfclayer_batch.py
+++ b/src/finn/custom_op/fpgadataflow/streamingfclayer_batch.py
@@ -181,7 +181,7 @@ class StreamingFCLayer_Batch(HLSCustomOp):
         # verify that all necessary attributes exist
         # TODO collect automatically from get_nodeattr_types
         try:
-            self.get_nodeattr("code_gen_dir_npysim")
+            self.get_nodeattr("code_gen_dir_cppsim")
             self.get_nodeattr("executable_path")
             self.get_nodeattr("resType")
             self.get_nodeattr("MW")
@@ -508,10 +508,10 @@ class StreamingFCLayer_Batch(HLSCustomOp):
             f_weights.close()
 
         elif mem_mode == "decoupled":
-            """Saves weights in corresponding file format for npysim or rtlsim"""
+            """Saves weights in corresponding file format for cppsim or rtlsim"""
             # transpose weight tensor from (1, PE, WMEM, SIMD) to (1, WMEM, PE, SIMD)
             # and save as unflipped weight tensor to be able to differentiate between
-            # flipped an unflipped weight tensor (has to be flipped for npysim)
+            # flipped and unflipped weight tensor (has to be flipped for cppsim)
 
             weight_tensor_unflipped = np.transpose(weight_tensor, (0, 2, 1, 3))
 
@@ -613,14 +613,14 @@ class StreamingFCLayer_Batch(HLSCustomOp):
         node = self.onnx_node
 
         # TODO ensure codegen dir exists
-        if mode == "npysim":
-            code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        if mode == "cppsim":
+            code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         elif mode == "rtlsim":
             code_gen_dir = self.get_nodeattr("code_gen_dir_ipgen")
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            must be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
@@ -654,7 +654,7 @@ class StreamingFCLayer_Batch(HLSCustomOp):
                 raise Exception("Unexpected input found for StreamingFCLayer")
             in_ind += 1
 
-        if mode == "npysim":
+        if mode == "cppsim":
             # execute the precompiled model
             super().exec_precompiled_singlenode_model()
             # load output npy file
@@ -696,7 +696,7 @@ class StreamingFCLayer_Batch(HLSCustomOp):
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            must be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
@@ -744,7 +744,7 @@ class StreamingFCLayer_Batch(HLSCustomOp):
             )
 
     def read_npy_data(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_input_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
@@ -841,7 +841,7 @@ class StreamingFCLayer_Batch(HLSCustomOp):
             )
 
     def dataoutstrm(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_output_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
diff --git a/src/finn/custom_op/fpgadataflow/streamingfifo.py b/src/finn/custom_op/fpgadataflow/streamingfifo.py
index 0a7f143d26fd98c91a34ffcbe5f8fecabc677182..586d38a03f3717d1ea2cffcf7474ca434c9ea505 100644
--- a/src/finn/custom_op/fpgadataflow/streamingfifo.py
+++ b/src/finn/custom_op/fpgadataflow/streamingfifo.py
@@ -121,7 +121,7 @@ class StreamingFIFO(HLSCustomOp):
             # transform list into long string separated by '\n'
             code_gen_line = "\n".join(self.code_gen_dict[key])
             template = template.replace(key, code_gen_line)
-        f = open(os.path.join(verilog_dir, "{}.v".format(self.onnx_node.name,)), "w",)
+        f = open(os.path.join(verilog_dir, "{}.v".format(self.onnx_node.name)), "w")
         f.write(template)
         f.close()
         self.code_gen_dict.clear()
@@ -222,7 +222,7 @@ class StreamingFIFO(HLSCustomOp):
         inp = context[node.input[0]]
         exp_shape = self.get_normal_input_shape()
 
-        if mode == "npysim":
+        if mode == "cppsim":
             output = inp
             output = np.asarray([output], dtype=np.float32).reshape(*exp_shape)
             context[node.output[0]] = output
@@ -243,9 +243,7 @@ class StreamingFIFO(HLSCustomOp):
                 export_idt = DataType[self.get_nodeattr("dataType")]
             # make copy before saving the array
             reshaped_input = reshaped_input.copy()
-            np.save(
-                os.path.join(code_gen_dir, "input_0.npy"), reshaped_input,
-            )
+            np.save(os.path.join(code_gen_dir, "input_0.npy"), reshaped_input)
             sim = self.get_rtlsim()
             nbits = self.get_instream_width()
             inp = npy_to_rtlsim_input(
@@ -271,7 +269,7 @@ class StreamingFIFO(HLSCustomOp):
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            must be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
diff --git a/src/finn/custom_op/fpgadataflow/streamingmaxpool_batch.py b/src/finn/custom_op/fpgadataflow/streamingmaxpool_batch.py
index 7334c913b6f85cad4835b6e65eb14c488432af6b..2344e12f7e87634c189563f9cde7b1c861a3606e 100644
--- a/src/finn/custom_op/fpgadataflow/streamingmaxpool_batch.py
+++ b/src/finn/custom_op/fpgadataflow/streamingmaxpool_batch.py
@@ -171,7 +171,7 @@ class StreamingMaxPool_Batch(HLSCustomOp):
         ]
 
     def read_npy_data(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_input_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
@@ -215,7 +215,7 @@ class StreamingMaxPool_Batch(HLSCustomOp):
             ]
 
     def dataoutstrm(self):
-        code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         dtype = self.get_output_datatype()
         if dtype == DataType.BIPOLAR:
             # use binary for bipolar storage
@@ -267,14 +267,14 @@ class StreamingMaxPool_Batch(HLSCustomOp):
         folded_oshape = self.get_folded_output_shape()
 
         # TODO ensure codegen dir exists
-        if mode == "npysim":
-            code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
+        if mode == "cppsim":
+            code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
         elif mode == "rtlsim":
             code_gen_dir = self.get_nodeattr("code_gen_dir_ipgen")
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            has to be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
@@ -296,14 +296,14 @@ class StreamingMaxPool_Batch(HLSCustomOp):
         reshaped_input = inp.copy()
         np.save(os.path.join(code_gen_dir, "input_0.npy"), reshaped_input)
 
-        if mode == "npysim":
+        if mode == "cppsim":
             # execute the precompiled model
             super().exec_precompiled_singlenode_model()
             # load output npy file
             super().npy_to_dynamic_output(context)
             assert (
                 context[node.output[0]].shape == folded_oshape
-            ), "npysim \
+            ), "cppsim \
             did not produce expected ofolded utput shape"
             context[node.output[0]] = context[node.output[0]].reshape(*exp_oshape)
         elif mode == "rtlsim":
@@ -330,7 +330,7 @@ class StreamingMaxPool_Batch(HLSCustomOp):
         else:
             raise Exception(
                 """Invalid value for attribute exec_mode! Is currently set to: {}
-            has to be set to one of the following value ("npysim", "rtlsim")""".format(
+            has to be set to one of the following values ("cppsim", "rtlsim")""".format(
                     mode
                 )
             )
diff --git a/src/finn/transformation/fpgadataflow/cleanup.py b/src/finn/transformation/fpgadataflow/cleanup.py
index a31cbfa7dd30eff37ceb2d7bf3c162093a5a3a1c..248a99b57aed7f38f63cc25ad7ecf93bd1930e63 100644
--- a/src/finn/transformation/fpgadataflow/cleanup.py
+++ b/src/finn/transformation/fpgadataflow/cleanup.py
@@ -57,11 +57,11 @@ class CleanUp(Transformation):
                 try:
                     # lookup op_type in registry of CustomOps
                     inst = registry.custom_op[op_type](node)
-                    # delete code_gen_dir from npysim
-                    code_gen_dir = inst.get_nodeattr("code_gen_dir_npysim")
+                    # delete code_gen_dir from cppsim
+                    code_gen_dir = inst.get_nodeattr("code_gen_dir_cppsim")
                     if os.path.isdir(code_gen_dir):
                         shutil.rmtree(code_gen_dir)
-                    inst.set_nodeattr("code_gen_dir_npysim", "")
+                    inst.set_nodeattr("code_gen_dir_cppsim", "")
                     inst.set_nodeattr("executable_path", "")
                     # delete code_gen_dir from ipgen and project folder
                     code_gen_dir = inst.get_nodeattr("code_gen_dir_ipgen")
diff --git a/src/finn/transformation/fpgadataflow/compile_cppsim.py b/src/finn/transformation/fpgadataflow/compile_cppsim.py
index dc663b662396b0ca533b8d684b203c6abb4b6b4a..ddf00c799b8a53c428d0854551d0078a6e264111 100644
--- a/src/finn/transformation/fpgadataflow/compile_cppsim.py
+++ b/src/finn/transformation/fpgadataflow/compile_cppsim.py
@@ -32,11 +32,11 @@ from finn.transformation import NodeLocalTransformation
 
 
 class CompileCppSim(NodeLocalTransformation):
-    """For every node: compile C++ code in node attribute "code_gen_dir_npysim"
+    """For every node: compile C++ code in node attribute "code_gen_dir_cppsim"
     and save path to executables in node attribute "executable_path".
     All nodes in the graph must have the fpgadataflow backend attribute.
 
-    To use these executables, exec_mode must be set to "npysim" (using transformation
+    To use these executables, exec_mode must be set to "cppsim" (using transformation
     SetExecMode) and the model has to be executed using execute_onnx() from
     finn.core.onnx_exec
 
@@ -55,9 +55,9 @@ class CompileCppSim(NodeLocalTransformation):
                 inst = registry.custom_op[op_type](node)
                 # ensure that code is generated
                 assert (
-                    inst.get_nodeattr("code_gen_dir_npysim") != ""
+                    inst.get_nodeattr("code_gen_dir_cppsim") != ""
                 ), """Node
-                attribute "code_gen_dir_npysim" is not set. Please run
+                attribute "code_gen_dir_cppsim" is not set. Please run
                 Transformation PrepareCppSim first."""
                 # call the compilation function for this node
                 inst.compile_singlenode_code()
diff --git a/src/finn/transformation/fpgadataflow/prepare_cppsim.py b/src/finn/transformation/fpgadataflow/prepare_cppsim.py
index 5477d2fb7569b359e5b604e243d55a7d0ae24608..a1524322ec03a4e96ef41f999144e3eed349c5af 100644
--- a/src/finn/transformation/fpgadataflow/prepare_cppsim.py
+++ b/src/finn/transformation/fpgadataflow/prepare_cppsim.py
@@ -36,22 +36,22 @@ from finn.util.fpgadataflow import is_fpgadataflow_node
 
 def _codegen_single_node(node, model):
     """Calls C++ code generation for one node. Resulting code can be used
-    to simulate node using npysim."""
+    to simulate node using cppsim."""
 
     op_type = node.op_type
     try:
         # lookup op_type in registry of CustomOps
         inst = registry.custom_op[op_type](node)
         # get the path of the code generation directory
-        code_gen_dir = inst.get_nodeattr("code_gen_dir_npysim")
+        code_gen_dir = inst.get_nodeattr("code_gen_dir_cppsim")
         # ensure that there is a directory
         if code_gen_dir == "" or not os.path.isdir(code_gen_dir):
             code_gen_dir = make_build_dir(
-                prefix="code_gen_npysim_" + str(node.name) + "_"
+                prefix="code_gen_cppsim_" + str(node.name) + "_"
             )
-            inst.set_nodeattr("code_gen_dir_npysim", code_gen_dir)
+            inst.set_nodeattr("code_gen_dir_cppsim", code_gen_dir)
         # ensure that there is generated code inside the dir
-        inst.code_generation_npysim(model)
+        inst.code_generation_cppsim(model)
     except KeyError:
         # exception if op_type is not supported
         raise Exception("Custom op_type %s is currently not supported." % op_type)
@@ -62,8 +62,8 @@ class PrepareCppSim(Transformation):
     and create folder that contains all the generated files.
     All nodes in the graph must have the fpgadataflow backend attribute.
 
-    Outcome if succesful: Node attribute "code_gen_dir_npysim" contains path to folder
-    that contains generated C++ code that can be used to simulate node using npysim.
+    Outcome if successful: Node attribute "code_gen_dir_cppsim" contains path to folder
+    that contains generated C++ code that can be used to simulate node using cppsim.
     The subsequent transformation is CompileCppSim"""
 
     def apply(self, model):
diff --git a/src/finn/transformation/fpgadataflow/set_exec_mode.py b/src/finn/transformation/fpgadataflow/set_exec_mode.py
index 83dda7ceccfd26fa1c43ab517ade2e19ccae4a61..40996e5f64fb812ea3766b71a9a8275514dec4a0 100644
--- a/src/finn/transformation/fpgadataflow/set_exec_mode.py
+++ b/src/finn/transformation/fpgadataflow/set_exec_mode.py
@@ -33,7 +33,7 @@ from finn.transformation import Transformation
 
 class SetExecMode(Transformation):
     """Set attribute exec_mode in all fpgadataflow nodes to specify which
-    kind of execution should be used ("npysim" or "rtlsim")"""
+    kind of execution should be used ("cppsim" or "rtlsim")"""
 
     def __init__(self, mode):
         super().__init__()
diff --git a/tests/end2end/test_end2end_cnv_w1a1.py b/tests/end2end/test_end2end_cnv_w1a1.py
index 94358ae48503e6a6facf50213634caff10f50a29..d7f59ef35aaf61891937dcaa105cf1392133e732 100644
--- a/tests/end2end/test_end2end_cnv_w1a1.py
+++ b/tests/end2end/test_end2end_cnv_w1a1.py
@@ -188,13 +188,13 @@ def test_end2end_cnv_w1a1_verify_dataflow_part():
     inp_name = model.graph.input[0].name
     out_name = model.graph.output[0].name
     inp_dict = {inp_name: x}
-    # npysim
+    # cppsim
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
-    model = model.transform(SetExecMode("npysim"))
-    model.save(build_dir + "/end2end_cnv_w1a1_ipgen_npysim.onnx")
-    ret_npysim = execute_onnx(model, inp_dict, True)
-    res_npysim = ret_npysim[out_name]
+    model = model.transform(SetExecMode("cppsim"))
+    model.save(build_dir + "/end2end_cnv_w1a1_ipgen_cppsim.onnx")
+    ret_cppsim = execute_onnx(model, inp_dict, True)
+    res_cppsim = ret_cppsim[out_name]
     # node-by-node rtlsim
     model = model.transform(SetExecMode("rtlsim"))
     model = model.transform(PrepareRTLSim())
@@ -208,8 +208,8 @@ def test_end2end_cnv_w1a1_verify_dataflow_part():
     os.environ["LIVENESS_THRESHOLD"] = "-1"
     ret_rtlsim_whole = execute_onnx(model, inp_dict, True)
     res_rtlsim_whole = ret_rtlsim_whole[out_name]
-    assert np.isclose(res_npysim, res_rtlsim_nodebynode).all()
-    assert np.isclose(res_npysim, res_rtlsim_whole).all()
+    assert np.isclose(res_cppsim, res_rtlsim_nodebynode).all()
+    assert np.isclose(res_cppsim, res_rtlsim_whole).all()
 
 
 def test_end2end_cnv_w1a1_verify_all():
@@ -231,12 +231,12 @@ def test_end2end_cnv_w1a1_verify_all():
     parent_model = ModelWrapper(build_dir + "/end2end_cnv_w1a1_dataflow_parent.onnx")
     iname = parent_model.graph.input[0].name
     oname = parent_model.graph.output[0].name
-    # produce results with npysim
+    # produce results with cppsim
     sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
     sdp_node = getCustomOp(sdp_node)
-    sdp_node.set_nodeattr("model", build_dir + "/end2end_cnv_w1a1_ipgen_npysim.onnx")
-    ret_npysim = execute_onnx(parent_model, {iname: x}, True)
-    y_npysim = ret_npysim[oname]
+    sdp_node.set_nodeattr("model", build_dir + "/end2end_cnv_w1a1_ipgen_cppsim.onnx")
+    ret_cppsim = execute_onnx(parent_model, {iname: x}, True)
+    y_cppsim = ret_cppsim[oname]
     # produce results with node-by-node rtlsim
     sdp_node.set_nodeattr(
         "model", build_dir + "/end2end_cnv_w1a1_ipgen_nodebynode_rtlsim.onnx"
@@ -251,7 +251,7 @@ def test_end2end_cnv_w1a1_verify_all():
     os.environ["LIVENESS_THRESHOLD"] = "-1"
     ret_whole_rtlsim = execute_onnx(parent_model, {iname: x}, True)
     y_whole_rtlsim = ret_whole_rtlsim[oname]
-    assert np.isclose(y_golden, y_npysim).all()
+    assert np.isclose(y_golden, y_cppsim).all()
     assert np.isclose(y_golden, y_nodebynode_rtlsim).all()
     assert np.isclose(y_golden, y_whole_rtlsim).all()
     assert np.argmax(y_golden) == 3
@@ -316,7 +316,7 @@ def test_end2end_cnv_w1a1_run_on_pynq():
         ip = os.environ["PYNQ_IP"]  # NOQA
         if ip == "":
             pytest.skip("PYNQ board IP address not specified")
-        # produce results with npysim
+        # produce results with cppsim
         sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
         sdp_node = getCustomOp(sdp_node)
         sdp_node.set_nodeattr("model", build_dir + "/end2end_cnv_w1a1_pynq_deploy.onnx")
diff --git a/tests/end2end/test_end2end_tfc_w1a1_throughput_test.py b/tests/end2end/test_end2end_tfc_w1a1_throughput_test.py
index 80bfb0ee07aa91e64b31961292d32c0006dc3627..b5f3f4e27ff24723db69f887cb7f1cce9c4df617 100644
--- a/tests/end2end/test_end2end_tfc_w1a1_throughput_test.py
+++ b/tests/end2end/test_end2end_tfc_w1a1_throughput_test.py
@@ -174,13 +174,13 @@ def test_end2end_tfc_w1a1_verify_dataflow_part():
     inp_name = model.graph.input[0].name
     out_name = model.graph.output[0].name
     inp_dict = {inp_name: x}
-    # npysim
+    # cppsim
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
-    model = model.transform(SetExecMode("npysim"))
-    model.save(build_dir + "/end2end_tfc_w1a1_ipstitch_npysim.onnx")
-    ret_npysim = execute_onnx(model, inp_dict, True)
-    res_npysim = ret_npysim[out_name]
+    model = model.transform(SetExecMode("cppsim"))
+    model.save(build_dir + "/end2end_tfc_w1a1_ipstitch_cppsim.onnx")
+    ret_cppsim = execute_onnx(model, inp_dict, True)
+    res_cppsim = ret_cppsim[out_name]
     # node-by-node rtlsim
     model = model.transform(SetExecMode("rtlsim"))
     model = model.transform(PrepareRTLSim())
@@ -192,8 +192,8 @@ def test_end2end_tfc_w1a1_verify_dataflow_part():
     model.save(build_dir + "/end2end_tfc_w1a1_ipstitch_whole_rtlsim.onnx")
     ret_rtlsim_whole = execute_onnx(model, inp_dict, True)
     res_rtlsim_whole = ret_rtlsim_whole[out_name]
-    assert np.isclose(res_npysim, res_rtlsim_nodebynode).all()
-    assert np.isclose(res_npysim, res_rtlsim_whole).all()
+    assert np.isclose(res_cppsim, res_rtlsim_nodebynode).all()
+    assert np.isclose(res_cppsim, res_rtlsim_whole).all()
 
 
 def test_end2end_tfc_w1a1_verify_all():
@@ -212,12 +212,12 @@ def test_end2end_tfc_w1a1_verify_all():
     parent_model = ModelWrapper(build_dir + "/end2end_tfc_w1a1_dataflow_parent.onnx")
     iname = parent_model.graph.input[0].name
     oname = parent_model.graph.output[0].name
-    # produce results with npysim
+    # produce results with cppsim
     sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
     sdp_node = getCustomOp(sdp_node)
-    sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w1a1_ipstitch_npysim.onnx")
-    ret_npysim = execute_onnx(parent_model, {iname: x}, True)
-    y_npysim = ret_npysim[oname]
+    sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w1a1_ipstitch_cppsim.onnx")
+    ret_cppsim = execute_onnx(parent_model, {iname: x}, True)
+    y_cppsim = ret_cppsim[oname]
     # produce results with node-by-node rtlsim
     sdp_node.set_nodeattr(
         "model", build_dir + "/end2end_tfc_w1a1_ipstitch_nodebynode_rtlsim.onnx"
@@ -230,7 +230,7 @@ def test_end2end_tfc_w1a1_verify_all():
     )
     ret_whole_rtlsim = execute_onnx(parent_model, {iname: x}, True)
     y_whole_rtlsim = ret_whole_rtlsim[oname]
-    assert np.isclose(y_golden, y_npysim).all()
+    assert np.isclose(y_golden, y_cppsim).all()
     assert np.isclose(y_golden, y_nodebynode_rtlsim).all()
     assert np.isclose(y_golden, y_whole_rtlsim).all()
 
@@ -292,7 +292,7 @@ def test_end2end_tfc_w1a1_run_on_pynq():
         ip = os.environ["PYNQ_IP"]  # NOQA
         if ip == "":
             pytest.skip("PYNQ board IP address not specified")
-        # produce results with npysim
+        # produce results with cppsim
         sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
         sdp_node = getCustomOp(sdp_node)
         sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w1a1_pynq_deploy.onnx")
diff --git a/tests/end2end/test_end2end_tfc_w1a2.py b/tests/end2end/test_end2end_tfc_w1a2.py
index 996d5dbccc30b5fb2382fda93feb108da7a32ee5..ecc0d48a6af37bc2bdd48f9306976aa8582ca1b0 100644
--- a/tests/end2end/test_end2end_tfc_w1a2.py
+++ b/tests/end2end/test_end2end_tfc_w1a2.py
@@ -166,13 +166,13 @@ def test_end2end_tfc_w1a2_verify_dataflow_part():
     inp_name = model.graph.input[0].name
     out_name = model.graph.output[0].name
     inp_dict = {inp_name: x}
-    # npysim
+    # cppsim
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
-    model = model.transform(SetExecMode("npysim"))
-    model.save(build_dir + "/end2end_tfc_w1a2_ipstitch_npysim.onnx")
-    ret_npysim = execute_onnx(model, inp_dict, True)
-    res_npysim = ret_npysim[out_name]
+    model = model.transform(SetExecMode("cppsim"))
+    model.save(build_dir + "/end2end_tfc_w1a2_ipstitch_cppsim.onnx")
+    ret_cppsim = execute_onnx(model, inp_dict, True)
+    res_cppsim = ret_cppsim[out_name]
     # node-by-node rtlsim
     model = model.transform(SetExecMode("rtlsim"))
     model = model.transform(PrepareRTLSim())
@@ -184,8 +184,8 @@ def test_end2end_tfc_w1a2_verify_dataflow_part():
     model.save(build_dir + "/end2end_tfc_w1a2_ipstitch_whole_rtlsim.onnx")
     ret_rtlsim_whole = execute_onnx(model, inp_dict, True)
     res_rtlsim_whole = ret_rtlsim_whole[out_name]
-    assert np.isclose(res_npysim, res_rtlsim_nodebynode).all()
-    assert np.isclose(res_npysim, res_rtlsim_whole).all()
+    assert np.isclose(res_cppsim, res_rtlsim_nodebynode).all()
+    assert np.isclose(res_cppsim, res_rtlsim_whole).all()
 
 
 def test_end2end_tfc_w1a2_verify_all():
@@ -204,12 +204,12 @@ def test_end2end_tfc_w1a2_verify_all():
     parent_model = ModelWrapper(build_dir + "/end2end_tfc_w1a2_dataflow_parent.onnx")
     iname = parent_model.graph.input[0].name
     oname = parent_model.graph.output[0].name
-    # produce results with npysim
+    # produce results with cppsim
     sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
     sdp_node = getCustomOp(sdp_node)
-    sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w1a2_ipstitch_npysim.onnx")
-    ret_npysim = execute_onnx(parent_model, {iname: x}, True)
-    y_npysim = ret_npysim[oname]
+    sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w1a2_ipstitch_cppsim.onnx")
+    ret_cppsim = execute_onnx(parent_model, {iname: x}, True)
+    y_cppsim = ret_cppsim[oname]
     # produce results with node-by-node rtlsim
     sdp_node.set_nodeattr(
         "model", build_dir + "/end2end_tfc_w1a2_ipstitch_nodebynode_rtlsim.onnx"
@@ -222,7 +222,7 @@ def test_end2end_tfc_w1a2_verify_all():
     )
     ret_whole_rtlsim = execute_onnx(parent_model, {iname: x}, True)
     y_whole_rtlsim = ret_whole_rtlsim[oname]
-    assert np.isclose(y_golden, y_npysim).all()
+    assert np.isclose(y_golden, y_cppsim).all()
     assert np.isclose(y_golden, y_nodebynode_rtlsim).all()
     assert np.isclose(y_golden, y_whole_rtlsim).all()
 
@@ -284,7 +284,7 @@ def test_end2end_tfc_w1a2_run_on_pynq():
         ip = os.environ["PYNQ_IP"]  # NOQA
         if ip == "":
             pytest.skip("PYNQ board IP address not specified")
-        # produce results with npysim
+        # produce results with cppsim
         sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
         sdp_node = getCustomOp(sdp_node)
         sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w1a2_pynq_deploy.onnx")
diff --git a/tests/end2end/test_end2end_tfc_w2a2.py b/tests/end2end/test_end2end_tfc_w2a2.py
index fa0a3db2dd563534dffaacb9466a0e48e813e7ac..8c13352d9e9d146d58d76b1cf1e17878f27513f5 100644
--- a/tests/end2end/test_end2end_tfc_w2a2.py
+++ b/tests/end2end/test_end2end_tfc_w2a2.py
@@ -166,13 +166,13 @@ def test_end2end_tfc_w2a2_verify_dataflow_part():
     inp_name = model.graph.input[0].name
     out_name = model.graph.output[0].name
     inp_dict = {inp_name: x}
-    # npysim
+    # cppsim
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
-    model = model.transform(SetExecMode("npysim"))
-    model.save(build_dir + "/end2end_tfc_w2a2_ipstitch_npysim.onnx")
-    ret_npysim = execute_onnx(model, inp_dict, True)
-    res_npysim = ret_npysim[out_name]
+    model = model.transform(SetExecMode("cppsim"))
+    model.save(build_dir + "/end2end_tfc_w2a2_ipstitch_cppsim.onnx")
+    ret_cppsim = execute_onnx(model, inp_dict, True)
+    res_cppsim = ret_cppsim[out_name]
     # node-by-node rtlsim
     model = model.transform(SetExecMode("rtlsim"))
     model = model.transform(PrepareRTLSim())
@@ -184,8 +184,8 @@ def test_end2end_tfc_w2a2_verify_dataflow_part():
     model.save(build_dir + "/end2end_tfc_w2a2_ipstitch_whole_rtlsim.onnx")
     ret_rtlsim_whole = execute_onnx(model, inp_dict, True)
     res_rtlsim_whole = ret_rtlsim_whole[out_name]
-    assert np.isclose(res_npysim, res_rtlsim_nodebynode).all()
-    assert np.isclose(res_npysim, res_rtlsim_whole).all()
+    assert np.isclose(res_cppsim, res_rtlsim_nodebynode).all()
+    assert np.isclose(res_cppsim, res_rtlsim_whole).all()
 
 
 def test_end2end_tfc_w2a2_verify_all():
@@ -204,12 +204,12 @@ def test_end2end_tfc_w2a2_verify_all():
     parent_model = ModelWrapper(build_dir + "/end2end_tfc_w2a2_dataflow_parent.onnx")
     iname = parent_model.graph.input[0].name
     oname = parent_model.graph.output[0].name
-    # produce results with npysim
+    # produce results with cppsim
     sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
     sdp_node = getCustomOp(sdp_node)
-    sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w2a2_ipstitch_npysim.onnx")
-    ret_npysim = execute_onnx(parent_model, {iname: x}, True)
-    y_npysim = ret_npysim[oname]
+    sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w2a2_ipstitch_cppsim.onnx")
+    ret_cppsim = execute_onnx(parent_model, {iname: x}, True)
+    y_cppsim = ret_cppsim[oname]
     # produce results with node-by-node rtlsim
     sdp_node.set_nodeattr(
         "model", build_dir + "/end2end_tfc_w2a2_ipstitch_nodebynode_rtlsim.onnx"
@@ -222,7 +222,7 @@ def test_end2end_tfc_w2a2_verify_all():
     )
     ret_whole_rtlsim = execute_onnx(parent_model, {iname: x}, True)
     y_whole_rtlsim = ret_whole_rtlsim[oname]
-    assert np.isclose(y_golden, y_npysim).all()
+    assert np.isclose(y_golden, y_cppsim).all()
     assert np.isclose(y_golden, y_nodebynode_rtlsim).all()
     assert np.isclose(y_golden, y_whole_rtlsim).all()
 
@@ -284,7 +284,7 @@ def test_end2end_tfc_w2a2_run_on_pynq():
         ip = os.environ["PYNQ_IP"]  # NOQA
         if ip == "":
             pytest.skip("PYNQ board IP address not specified")
-        # produce results with npysim
+        # produce results with cppsim
         sdp_node = parent_model.get_nodes_by_op_type("StreamingDataflowPartition")[0]
         sdp_node = getCustomOp(sdp_node)
         sdp_node.set_nodeattr("model", build_dir + "/end2end_tfc_w2a2_pynq_deploy.onnx")
diff --git a/tests/fpgadataflow/test_code_gen_trafo.py b/tests/fpgadataflow/test_code_gen_trafo.py
index 129d9ae162f1a170037ddd0dfa2b06361cc9be94..1228a9c79608a1c7eb44900ddb7df54ed900a3c2 100644
--- a/tests/fpgadataflow/test_code_gen_trafo.py
+++ b/tests/fpgadataflow/test_code_gen_trafo.py
@@ -79,7 +79,7 @@ def test_code_gen_trafo():
 
     model = model.transform(PrepareCppSim())
     for node in model.graph.node:
-        code_gen_attribute = util.get_by_name(node.attribute, "code_gen_dir_npysim")
+        code_gen_attribute = util.get_by_name(node.attribute, "code_gen_dir_cppsim")
         tmp_dir = code_gen_attribute.s.decode("UTF-8")
         assert os.path.isdir(
             tmp_dir
diff --git a/tests/fpgadataflow/test_convert_to_hls_layers_cnv.py b/tests/fpgadataflow/test_convert_to_hls_layers_cnv.py
index de0cf55aff2adf7fce1b3c95f9b774465d923932..220f8a7966a146f954a7fcb3f32058e231b83e23 100644
--- a/tests/fpgadataflow/test_convert_to_hls_layers_cnv.py
+++ b/tests/fpgadataflow/test_convert_to_hls_layers_cnv.py
@@ -115,7 +115,7 @@ def test_convert_to_hls_layers_cnv_w1a1():
     # model.save("cnv-pre-compile.onnx")
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
-    model = model.transform(SetExecMode("npysim"))
+    model = model.transform(SetExecMode("cppsim"))
     # model.save("cnv-post-compile.onnx")
     produced_ctx = oxe.execute_onnx(model, input_dict, True)
     produced = produced_ctx[model.graph.output[0].name]
diff --git a/tests/fpgadataflow/test_convert_to_hls_layers_fc.py b/tests/fpgadataflow/test_convert_to_hls_layers_fc.py
index 04af69077b0b3b2d4aeef02ce434ceb4c4684c72..b7dea03797bc5de5e7517d0d8b816c438027008b 100644
--- a/tests/fpgadataflow/test_convert_to_hls_layers_fc.py
+++ b/tests/fpgadataflow/test_convert_to_hls_layers_fc.py
@@ -109,7 +109,7 @@ def test_convert_to_hls_layers_tfc_w1a1():
 
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
-    model = model.transform(SetExecMode("npysim"))
+    model = model.transform(SetExecMode("cppsim"))
 
     raw_i = get_data("finn", "data/onnx/mnist-conv/test_data_set_0/input_0.pb")
     input_tensor = onnx.load_tensor_from_string(raw_i)
@@ -173,7 +173,7 @@ def test_convert_to_hls_layers_tfc_w1a2():
     fc3w.set_nodeattr("PE", 10)
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
-    model = model.transform(SetExecMode("npysim"))
+    model = model.transform(SetExecMode("cppsim"))
     raw_i = get_data("finn", "data/onnx/mnist-conv/test_data_set_0/input_0.pb")
     input_tensor = onnx.load_tensor_from_string(raw_i)
     # run using FINN-based execution
diff --git a/tests/fpgadataflow/test_fpgadataflow_convinputgenerator.py b/tests/fpgadataflow/test_fpgadataflow_convinputgenerator.py
index 067bc228caa6320f1fc505833077e8b63463d9b1..02a9acae5e0e90d2a8dfa7d4d4afb03aa11f4239 100644
--- a/tests/fpgadataflow/test_fpgadataflow_convinputgenerator.py
+++ b/tests/fpgadataflow/test_fpgadataflow_convinputgenerator.py
@@ -134,7 +134,7 @@ def prepare_inputs(input_tensor):
 # Stride
 @pytest.mark.parametrize("stride", [1, 2])
 # execution mode
-@pytest.mark.parametrize("exec_mode", ["npysim", "rtlsim"])
+@pytest.mark.parametrize("exec_mode", ["cppsim", "rtlsim"])
 # input channel parallelism ("SIMD")
 @pytest.mark.parametrize("simd", [1, 2])
 def test_fpgadataflow_slidingwindow(idt, k, ifm_dim, ifm_ch, stride, exec_mode, simd):
@@ -145,8 +145,8 @@ def test_fpgadataflow_slidingwindow(idt, k, ifm_dim, ifm_ch, stride, exec_mode,
         k, ifm_ch, ifm_dim, ofm_dim, simd, stride, idt
     )
 
-    if exec_mode == "npysim":
-        model = model.transform(SetExecMode("npysim"))
+    if exec_mode == "cppsim":
+        model = model.transform(SetExecMode("cppsim"))
         model = model.transform(PrepareCppSim())
         model = model.transform(CompileCppSim())
     elif exec_mode == "rtlsim":
diff --git a/tests/fpgadataflow/test_fpgadataflow_fclayer.py b/tests/fpgadataflow/test_fpgadataflow_fclayer.py
index f0484895da1ee9fb9690398a6a4d28df832912cb..416d96d5dbfa1125d878eb8339ae38f5d572d1ce 100644
--- a/tests/fpgadataflow/test_fpgadataflow_fclayer.py
+++ b/tests/fpgadataflow/test_fpgadataflow_fclayer.py
@@ -149,7 +149,7 @@ def prepare_inputs(input_tensor, idt, wdt):
 @pytest.mark.parametrize("mw", [16])
 # HLS matrix height (output features)
 @pytest.mark.parametrize("mh", [16])
-def test_fpgadataflow_fclayer_npysim(mem_mode, idt, wdt, act, nf, sf, mw, mh):
+def test_fpgadataflow_fclayer_cppsim(mem_mode, idt, wdt, act, nf, sf, mw, mh):
     if nf == -1:
         nf = mh
     if sf == -1:
@@ -190,7 +190,7 @@ def test_fpgadataflow_fclayer_npysim(mem_mode, idt, wdt, act, nf, sf, mw, mh):
         # lookup op_type in registry of CustomOps
         inst = getCustomOp(node)
         inst.set_nodeattr("mem_mode", mem_mode)
-    model = model.transform(SetExecMode("npysim"))
+    model = model.transform(SetExecMode("cppsim"))
     model = model.transform(PrepareCppSim())
     model = model.transform(CompileCppSim())
     # prepare input data
@@ -215,7 +215,7 @@ def test_fpgadataflow_fclayer_npysim(mem_mode, idt, wdt, act, nf, sf, mw, mh):
 
     y_produced = y_produced.reshape(y_expected.shape)
 
-    assert (y_produced == y_expected).all(), "npysim failed"
+    assert (y_produced == y_expected).all(), "cppsim failed"
 
 
 # mem_mode: const or decoupled
diff --git a/tests/fpgadataflow/test_layer_streaming_maxpool_batch.py b/tests/fpgadataflow/test_layer_streaming_maxpool_batch.py
index 6d4b80671f178ab330668e7f9bf52df7a2e4c255..ac4ab33469c7720c3d7b9f30f5d13be888e1439d 100644
--- a/tests/fpgadataflow/test_layer_streaming_maxpool_batch.py
+++ b/tests/fpgadataflow/test_layer_streaming_maxpool_batch.py
@@ -120,7 +120,7 @@ def prepare_inputs(input_tensor):
 # input channels
 @pytest.mark.parametrize("ifm_ch", [1, 2])  # , 2, 3, 4])
 # execution mode
-@pytest.mark.parametrize("exec_mode", ["rtlsim", "npysim"])
+@pytest.mark.parametrize("exec_mode", ["rtlsim", "cppsim"])
 def test_fpgadataflow_streamingmaxpool(idt, k, ifm_dim, ifm_ch, exec_mode):
     stride = k
     ofm_dim = int(((ifm_dim - k) / stride) + 1)
@@ -136,8 +136,8 @@ def test_fpgadataflow_streamingmaxpool(idt, k, ifm_dim, ifm_ch, exec_mode):
 
     model = make_single_streamingmaxpool_modelwrapper(k, ifm_ch, ifm_dim, ofm_dim, idt)
 
-    if exec_mode == "npysim":
-        model = model.transform(SetExecMode("npysim"))
+    if exec_mode == "cppsim":
+        model = model.transform(SetExecMode("cppsim"))
         model = model.transform(PrepareCppSim())
         model = model.transform(CompileCppSim())
     elif exec_mode == "rtlsim":