Commit 694407dd authored by auphelia
Added fixed draft of multi-thresholding function

parent 0571ad84
......@@ -11,21 +11,21 @@ to be more modular, more usable and more open-source!
Over the past few years, the team at Xilinx Research Labs Ireland has done quite a bit of research on Quantized Neural Networks
(QNNs).
Starting with <a href="https://arxiv.org/abs/1612.07119">Binarized Neural Networks (BNNs) on FPGAs</a> back in 2016, we've since
looked at many aspects of quantized deep learning, ranging from
<a href="https://arxiv.org/abs/1807.00301">better quantization methods</a> and
<a href="https://arxiv.org/abs/1709.06262">mixing quantization and pruning</a>,
to <a href="https://arxiv.org/pdf/1807.10577.pdf">accuracy-throughput tradeoffs</a> and
<a href="https://arxiv.org/pdf/1807.04093.pdf">recurrent topologies</a>.
Although some <a href="https://github.com/Xilinx/BNN-PYNQ">demonstrators</a> of our work have been open source for some time,
we want to take things a step further.
We love QNNs and the high-performance, high-efficiency dataflow accelerators we can build for them on Xilinx FPGAs, and we want you and
the FPGA/ML community to be able to do the same.
The (co-)design process for making this happen is actually quite involved, starting from customizing a neural network in a machine
learning framework, going through multiple design steps that involve many optimizations, HLS code generation and Vivado synthesis, and
ending up with an FPGA bitstream that you can deploy as part of some application.
Many of those steps require some manual effort, but having a modular, flexible solution stack to support you through this process is greatly
helpful.
This is why we are rebuilding our FINN solution stack from the ground up to make it more modular, and we hope to build a community
around it that shares our excitement around QNNs for FPGAs.
......@@ -40,31 +40,31 @@ frameworks like <a href="http://llvm.org">LLVM</a>.
This stack breaks down the complex co-design problem into parts, and each layer focuses on a different sub-problem, consuming
the artifacts produced by the previous one.
The diagram on the left illustrates this briefly, and over the next few months we hope to make a first few QNNs go through all
the layers of this stack to produce cool FPGA dataflow accelerators.
In fact, some of these components are already available today for you to explore!
Let's have a look at the main parts:
* <b>Brevitas</b> is a PyTorch library that lets you do quantization-aware training. It gives you a set of `torch.nn` building
blocks to explore different forms of weight, activation and accumulator quantization schemes. You can also learn the bitwidths for
different layers with backpropagation! See the <a href="https://xilinx.github.io/brevitas/">Brevitas page</a> for more information.
* <b>Frontend</b>. Once you are happy with the accuracy of your quantized neural network in Brevitas, you'll be able to export it into a custom
<a href="https://onnx.ai">ONNX</a> representation that FINN uses internally to represent QNNs. More details about this custom ONNX
representation will be available in an upcoming blog post.
* The <b>FINN Compiler</b> will then import this ONNX representation, and go through several steps of optimizations such as the
<a href="https://arxiv.org/pdf/1709.04060.pdf">streamlining transform</a> to make the QNN simpler.
* The <b>FPGA dataflow backend</b> will then convert the optimized QNN into a series of streaming HLS library calls. An important
part of the stack is the <a href="https://github.com/Xilinx/finn-hlslib">FINN HLS library</a>, which provides optimized Vivado HLS
descriptions of several common layer types (convolutions, thresholding, pooling...) found in QNNs.
* <b>Synthesis</b>. Once the HLS calls are generated, the next steps are to call Vivado HLS and Vivado to generate a bitstream for the target
Xilinx FPGA. We have plans to support Vivado IPI block design code generation as well for increased agility and modularity.
* <b>PYNQ deployment</b>. Finally, you will be able to use any of the supported <a href="http://www.pynq.io/">PYNQ</a> platforms to directly call the
generated accelerator from Python and integrate it with other functionality. Since FINN-generated dataflow accelerators expose
streaming interfaces, we think it will be exciting to use streaming-oriented Python frameworks such as
<a href="https://github.com/ray-project/ray">Ray</a> to create heterogeneous, high-performance task graphs incorporating QNNs.
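
To give a feel for why thresholding shows up as a layer type in the stack above, here is a minimal NumPy sketch of the underlying idea: a uniform quantized activation can be computed by counting how many thresholds an input reaches. This is an illustration only, not code from FINN or the FINN HLS library; the function names, the unsigned `n_bits` range and the `scale` parameter are assumptions made for the example.

```python
import numpy as np


def quantize_uniform(x, n_bits, scale):
    """Round x / scale to the nearest level and clip to [0, 2**n_bits - 1]."""
    levels = 2 ** n_bits - 1
    return np.clip(np.floor(x / scale + 0.5), 0, levels)


def quantize_by_thresholds(x, n_bits, scale):
    """Same quantizer, expressed as a count of thresholds that x reaches."""
    levels = 2 ** n_bits - 1
    # threshold k (k = 1..levels) sits halfway between level k-1 and level k
    thresholds = (np.arange(1, levels + 1) - 0.5) * scale
    # compare every element against every threshold, then count the hits
    return (x[..., None] >= thresholds).sum(axis=-1).astype(x.dtype)


x = np.array([-1.0, 0.0, 0.2, 0.25, 0.6, 0.75, 1.4, 5.0])
print(quantize_uniform(x, n_bits=2, scale=0.5))
print(quantize_by_thresholds(x, n_bits=2, scale=0.5))
```

Both calls should print the same levels for this input. The threshold form only needs comparisons and an accumulation, which is one reason it is attractive for hardware.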
### Getting started
More will be available in the coming weeks and months, but if you want to get your hands dirty there's already plenty to start with!
If you haven't done so already, we recommend starting with <a href="https://github.com/Xilinx/BNN-PYNQ">BNN-PYNQ</a> to see what
dataflow QNN accelerators look and feel like.
......@@ -72,4 +72,3 @@ representation will be available in an upcoming blog post.
put together a streaming pipeline with the <a href="https://github.com/Xilinx/finn-hlslib">FINN HLS library</a>.
We have also created a <a href="https://gitter.im/xilinx-finn/community">Gitter channel</a> to make it easier to get in touch with
the community, and hope to see many of you there! :)
......@@ -21,7 +21,7 @@ Depending on what you would like to do, we have different suggestions on where t
* **I want to try out prebuilt QNN accelerators on real hardware.** Head over to the <a href="https://github.com/Xilinx/BNN-PYNQ" target="_blank">BNN-PYNQ</a> repository to try out some image
classification accelerators, or to <a href="https://github.com/Xilinx/LSTM-PYNQ" target="_blank">LSTM-PYNQ</a>
to try optical character recognition with LSTMs.
* **I want to train new quantized networks for FINN.** Check out <a href="https://github.com/Xilinx/brevitas">Brevitas</a>,
our PyTorch library for training quantized networks. The Brevitas-to-FINN part of the flow is coming soon!
* **I want to understand the computations involved in quantized inference.** Check out these Jupyter notebooks on <a href="https://github.com/maltanar/qnn-inference-examples">QNN inference</a>. This repo contains simple Numpy/Python layer implementations and a few pretrained QNNs for instructive purposes.
* **I want to understand how it all fits together.** Check out our [publications](#publications),
......
import numpy as np


def compare(value, threshold):
    # return 1.0 if the value reaches the threshold, otherwise 0.0
    if value >= threshold:
        res = 1.0
    else:
        res = 0.0
    return res


def execute(v, thresholds):
    # reshape inputs to enable channel-wise reading
    vr = v.reshape((thresholds.shape[1], -1))
    # calculate the channel interval for the for loops
    num_channels = thresholds.shape[0]
    channel_interval = int(vr.shape[1] / num_channels)
    # initialize output tensor
    ret = np.zeros_like(vr)
    # initialize helper variable i for channel-wise thresholding
    i = -1
    # iterate over thresholds channel-wise
    for t in thresholds:
        i += 1
        # calculate the lower and upper limit in which elements belong to one channel
        ce1_low_lim = i * channel_interval
        ce1_up_lim = (i + 1) * channel_interval
        # iterate in ascending order over the thresholds belonging to one channel
        for c in range(thresholds.shape[1]):
            for ce0 in range(vr.shape[0]):
                for ce1 in range(ce1_low_lim, ce1_up_lim):
                    ret[ce0][ce1] += compare(vr[ce0][ce1], t[c])
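
For comparison, channel-wise multi-thresholding can also be sketched without explicit loops by letting NumPy broadcast the comparison. The sketch below is not part of this commit and makes its own assumptions: the input is laid out as `(N, C, ...)` with channels on axis 1, `thresholds` has shape `(C, T)`, and the hypothetical function `multithreshold_sketch` returns, per element, the number of that channel's thresholds the element reaches. The draft above reshapes its input differently, so the two are not drop-in replacements for each other.

```python
import numpy as np


def multithreshold_sketch(v, thresholds):
    """Count, per element of v, how many of its channel's thresholds it reaches.

    Assumed layout (not taken from the commit): v is (N, C, ...) and
    thresholds is (C, T).
    """
    n, c = v.shape[0], v.shape[1]
    assert thresholds.shape[0] == c, "one row of thresholds per channel expected"
    # flatten everything after the channel axis: (N, C, E)
    vr = v.reshape(n, c, -1)
    # broadcast (N, C, E, 1) against (1, C, 1, T) to get (N, C, E, T)
    hits = vr[:, :, :, None] >= thresholds[None, :, None, :]
    # sum over the threshold axis and restore the input shape
    return hits.sum(axis=-1).astype(v.dtype).reshape(v.shape)


v = np.array([[[0.5, 1.5], [2.5, -1.0]]])            # N=1, C=2, two elements each
thresholds = np.array([[0.0, 1.0, 2.0], [-2.0, 0.0, 3.0]])
print(multithreshold_sketch(v, thresholds))
```

The broadcast version trades memory for speed, since it materializes an intermediate array with one entry per element and threshold.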