Yaman Umuroglu authored
* [Build] add a deployment_package step

* [Build] typo fix in build examples

* [Build] fix deployment_package step

* [Analysis] ensure python integers in fpgadataflow analysis passes

* [Build] report generation and other minor improvements

* [Docs] update build_dataflow docs

* [Build] latency est. fix

* [Build] check if fold config is None

* [Build] add ooc synthesis step

* [Deps] update finn-base

* [Util] out_of_context_synth: remove remote, use launch_process_helper

* [Build] include all outputs in examples configs

* [Docs] update build flow docs

* [Deps] update finn-base

* [Util] bugfix in launch_process_helper call

* [Docker] use interactive mode for builds

* [Build] enable pdb debugging for builds

* [Refactor] move build functions to own submodule

* [Test] build_dataflow: fix expected files

* [Build] report estimated resource total

* [Infra] remove old eggs

* [HLSCustomOp] introduce get_op_counts

only implemented for MVAU and VVAU for now

* [HLSCustomOp] extend get_op_counts to include params too

* [Analysis] introduce op_and_param_counts pass

* [Build] generate op/param counts as part of estimates + add doc

* [HLSCustomOp] assert if ap_int_max_w is too large

* [StreamingFC] fix ap_int_max_w calculation

* [Build] minor fix in step_generate_estimate_reports

* [StreamingFC] enable decoupled URAM weights

* [StreamingFC] export 0-valued .dat for decoupled URAM

* [Zynq] bugfix: AXI MM and lite IF counts were switched around

* [Zynq] support wiring up multiple AXI lites in shell

* [Deps] update finn-base

* [HLSCustomOp] introduce uram_efficiency_estimation

* [StreamingFC] implement uram eff est

* [FIFO] fix ip packaging problems

* [Thres] better integer check for thresholds

* [HLSCustomOp] rework infer_node_datatype to be more flexible

allow re-setting of inputDataType if it changed during datatype
inference

* [Thres] bugfix in integer thres check

* [Thres] bugfix in integer thres check

* [Docker] relax instance name, only fwd ports in notebook mode

* [Driver] generate runtime weight files for appropriate layers

* [Driver] draft a first version of load_runtime_weights

* [Driver] fixes & enhancements to load_runtime_weights

* [Driver] typo fix in split

* [Test] use runtime weights for tfc end2end

* [Build] bugfix in ooc step

* [Driver] also handle runtime-writable thresholds

* [Thresholding] implement get_op_and_param_counts

* [Test] use tfc-w1a1 as standalone thresholds end2end testcase

* [Build] add option for standalone thresholds

* [Driver] update comments

* [Driver] overhaul driver, split up template

* [Test] fix test_res_estimate expectation

* [Driver] fix varname in template
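
Many of the commits above build out the new dataflow build flow: a deployment_package step, estimate report generation, out-of-context synthesis, folding configs, and a standalone-thresholds option. As a rough illustration, driving a build through this flow might look like the Python sketch below; the module paths, config fields, and enum values are assumptions based on later FINN releases, not necessarily the exact API at this commit.

# Hedged sketch of the dataflow build flow described in the commits above.
# The finn.builder module paths and DataflowBuildConfig field names are
# assumptions taken from later FINN releases; they may differ here.
import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg

cfg = build_cfg.DataflowBuildConfig(
    output_dir="output_tfc_w1a1",      # reports and packages land here
    synth_clk_period_ns=10.0,          # 100 MHz target clock
    target_fps=100000,                 # drives automatic folding
    folding_config_file=None,          # or a JSON fold config (None is checked)
    standalone_thresholds=True,        # option added in the commits above
    board="Pynq-Z1",
    shell_flow_type=build_cfg.ShellFlowType.VIVADO_ZYNQ,
    generate_outputs=[
        build_cfg.DataflowOutputType.ESTIMATE_REPORTS,    # incl. op/param counts
        build_cfg.DataflowOutputType.OOC_SYNTH,           # out-of-context synthesis
        build_cfg.DataflowOutputType.DEPLOYMENT_PACKAGE,  # new deployment_package step
    ],
)
build.build_dataflow_cfg("model.onnx", cfg)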
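
The get_op_counts and op_and_param_counts commits add an analysis pass that tallies operations and parameters per layer. A minimal sketch of invoking such a pass through ModelWrapper.analysis follows; the import path and pass name are inferred from the commit titles and should be treated as assumptions.

# Hedged sketch: running the op/param count analysis pass.
# The import path below is an assumption inferred from the commit titles.
from finn.core.modelwrapper import ModelWrapper
from finn.analysis.fpgadataflow.op_and_param_counts import op_and_param_counts

model = ModelWrapper("model_hls.onnx")        # model with fpgadataflow nodes
counts = model.analysis(op_and_param_counts)  # returns a dict of counts
print(counts)  # e.g. MAC and parameter totals per op type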

Fast, Scalable Quantized Neural Network Inference on FPGAs


FINN is an experimental framework from Xilinx Research Labs to explore deep neural network inference on FPGAs. It specifically targets quantized neural networks, with emphasis on generating dataflow-style architectures customized for each network. The resulting FPGA accelerators can yield very high classification rates, or conversely be run with a slow clock for very low power consumption. The framework is fully open-source in order to give a higher degree of flexibility, and is intended to enable neural network research spanning several layers of the software/hardware abstraction stack.

For more general information about FINN, please visit the project page, check out the publications or some of the demos.

Getting Started

Please see the Getting Started page for more information on requirements, installation, and how to run FINN in different modes. Due to the complexity of the project's dependencies, we only support Docker-based deployment at this time.

What's New in FINN?

  • 2020-09-21: v0.4b (beta) is released. Read more on the release blog post.
  • 2020-05-08: v0.3b (beta) is released, with initial support for convolutions, parallel transformations, more flexible memory allocation for MVAUs, throughput testing and many other smaller improvements and bugfixes. Read more on the release blog post.
  • 2020-04-15: FINN v0.2.1b (beta) is released; it uses fixed commit versions for dependency repos but is otherwise identical to v0.2b.
  • 2020-02-28: FINN v0.2b (beta) is released, which is a clean-slate reimplementation of the framework. Currently only fully-connected networks are supported for the end-to-end flow. Please see the release blog post for a summary of the key features.

Documentation

You can view the documentation on readthedocs or build it locally by running python setup.py doc from inside the Docker container. Additionally, there is a series of Jupyter notebook tutorials, which we recommend running from inside Docker for a better experience.

Community

We have a Gitter channel where you can ask questions. You can use the GitHub issue tracker to report bugs, but please don't file issues to ask questions, as these are better handled in the Gitter channel.

We also heartily welcome contributions to the project; please check out the contribution guidelines and the list of open issues. Don't hesitate to get in touch over Gitter to discuss your ideas.

Citation

The current implementation of the framework is based on the following publications. Please consider citing them if you find FINN useful.

@article{blott2018finn,
  title={FINN-R: An end-to-end deep-learning framework for fast exploration of quantized neural networks},
  author={Blott, Michaela and Preu{\ss}er, Thomas B and Fraser, Nicholas J and Gambardella, Giulio and O'Brien, Kenneth and Umuroglu, Yaman and Leeser, Miriam and Vissers, Kees},
  journal={ACM Transactions on Reconfigurable Technology and Systems (TRETS)},
  volume={11},
  number={3},
  pages={1--23},
  year={2018},
  publisher={ACM New York, NY, USA}
}

@inproceedings{finn,
  author = {Umuroglu, Yaman and Fraser, Nicholas J. and Gambardella, Giulio and Blott, Michaela and Leong, Philip and Jahre, Magnus and Vissers, Kees},
  title = {FINN: A Framework for Fast, Scalable Binarized Neural Network Inference},
  booktitle = {Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays},
  series = {FPGA '17},
  year = {2017},
  pages = {65--74},
  publisher = {ACM}
}

Old version

We previously released an early-stage prototype of a toolflow that took in Caffe-HWGQ binarized network descriptions and produced dataflow architectures. You can find it in the v0.1 branch of this repository. Please be aware that this version is deprecated and unsupported, and that the master branch does not share history with the v0.1 branch, so it should be treated as a separate repository for all purposes.