diff --git a/docs/_posts/2020-03-27-brevitas-quartznet-release.md b/docs/_posts/2020-03-27-brevitas-quartznet-release.md
new file mode 100644
index 0000000000000000000000000000000000000000..6f1c70ad0f538036e53e7ba81260d563b77df330
--- /dev/null
+++ b/docs/_posts/2020-03-27-brevitas-quartznet-release.md

---
layout: post
title: "Quantized QuartzNet with Brevitas for efficient speech recognition"
author: "Giuseppe Franco"
---

*Although not yet supported in FINN, we are excited to show you how Brevitas and quantized neural network training techniques can be applied to models beyond image classification.*

We are pleased to announce the release of quantized pre-trained models of [QuartzNet](https://arxiv.org/abs/1910.10261) for efficient speech recognition.
They can be found at the [following link](https://github.com/Xilinx/brevitas/tree/master/examples/speech_to_text), together with a brief
explanation of how to test them.
The quantized version of QuartzNet has been trained using [Brevitas](https://github.com/Xilinx/brevitas), an experimental library for quantization-aware training.

QuartzNet, whose structure can be seen in Fig. 1, is a convolution-based speech-to-text network with a structure similar to that of [Jasper](https://arxiv.org/abs/1904.03288).

| <img src="https://xilinx.github.io/finn/img/QuartzNet.png" alt="QuartzNet Structure" title="QuartzNet Structure" width="450" height="500" align="center"/>|
| :---:|
| *Fig. 1 QuartzNet Model, [source](https://arxiv.org/abs/1910.10261)* |

The starting point is the mel-spectrogram representation of the input audio file.
Through repeated base building blocks of 1D convolutions (1D-Conv), batch normalization (BN), and ReLU with residual connections,
QuartzNet is able to reconstruct the underlying text.
The main difference with respect to Jasper is the use of depthwise and pointwise 1D-Conv (Fig. 2a) instead of plain 1D-Conv (Fig. 2b).
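To see where the parameter saving comes from, here is a rough back-of-the-envelope sketch (our own illustration, not from the paper; the channel sizes below are made up):

```python
def plain_conv1d_params(c_in, c_out, k):
    # A plain 1D convolution learns one k-wide filter
    # per (input channel, output channel) pair.
    return c_in * c_out * k

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise: one k-wide filter per input channel.
    # Pointwise: a 1x1 convolution that mixes channels.
    return c_in * k + c_in * c_out

# Illustrative sizes only, not the actual QuartzNet values.
c_in, c_out, k = 256, 256, 33
print(plain_conv1d_params(c_in, c_out, k))         # 2162688
print(depthwise_separable_params(c_in, c_out, k))  # 73984
```

With a wide kernel like this, the depthwise-plus-pointwise factorization needs roughly 30x fewer parameters than the dense convolution, which is the effect QuartzNet exploits at network scale.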
Thanks to this structure, QuartzNet achieves a better Word Error Rate (WER) than Jasper,
using *only* 19.9M parameters, compared to Jasper's 333M.

Moreover, the authors propose a grouped pointwise convolution strategy that greatly reduces the number of parameters,
down to 8.7M, with a small degradation in accuracy.

| <img src="https://xilinx.github.io/finn/img/quartzPic1.png" alt="QuartzNet block" title="QuartzNet block" width="130" height="220" align="center"/> | <img src="https://xilinx.github.io/finn/img/JasperVertical4.png" alt="Jasper block" title="Jasper block" width="130" height="220" align="center"/>|
| :---:|:---:|
| *Fig. 2a QuartzNet Block, [source](https://arxiv.org/abs/1910.10261)* | *Fig. 2b Jasper Block, [source](https://arxiv.org/abs/1904.03288)* |

The authors of QuartzNet propose different BxR configurations, where each of the B blocks consists of the base building block described above,
repeated R times.
Different BxR configurations have been trained on several datasets (Wall Street Journal,
LibriSpeech + Mozilla Common Voice, LibriSpeech only).

For our quantization experiments, we focus on the 15x5 variant trained on LibriSpeech with spec-augmentation and without grouped convolutions.
More details about this configuration can be found in the paper and in a [related discussion with the authors](https://github.com/NVIDIA/NeMo/issues/230).

Starting from the [official implementation](https://github.com/NVIDIA/NeMo/blob/master/examples/asr/quartznet.py),
the first step was to implement a quantized version of the topology in Brevitas, using quantized convolutions and activations.

After implementing the quantized version, the second step was to re-train the model, starting
from the [pre-trained models](https://ngc.nvidia.com/catalog/models/nvidia:quartznet_15x5_ls_sp)
kindly released by the authors.
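As a minimal, library-free sketch of what quantizing a weight or activation means during quantization-aware training (Brevitas implements this far more generally, with learned scales and multiple quantization schemes; the function below is purely illustrative):

```python
def fake_quantize(x, scale, bit_width):
    """Uniform symmetric quantization: snap x onto a signed integer
    grid of the given bit width, then map it back to floating point."""
    qmin = -(2 ** (bit_width - 1))      # e.g. -128 at 8 bit
    qmax = 2 ** (bit_width - 1) - 1     # e.g.  127 at 8 bit
    q = max(qmin, min(qmax, round(x / scale)))
    return q * scale

# An 8 bit grid with scale 0.01 keeps the weight almost unchanged...
print(fake_quantize(0.1234, 0.01, 8))  # ~0.12
# ...while a 4 bit grid saturates at 7 * scale.
print(fake_quantize(0.1234, 0.01, 4))  # ~0.07
```

During training, the forward pass sees these quantized values, so the network learns weights that remain accurate under the coarser 8 bit or 4 bit grids.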
We focused on three quantization configurations: two at 8 bit, with per-tensor and per-channel scaling respectively,
and one at 4 bit, with per-channel scaling.

We compare our results with those achieved by the authors, not only in terms of pure WER, but also in terms of the parameters' memory footprint
and the number of operations performed. Note that the WER is always based on greedy decoding. The results can be seen in Fig. 3a and Fig. 3b,
and are summarized in Table 1.

| Configuration | Word Error Rate (WER) | Memory Footprint (MegaByte) | Mega MACs |
| :-----------: | :-------------------: | :-------------------------: | :-------: |
| FP 300E, 1G | 11.58% | 37.69 | 1658.54 |
| FP 400E, 1G | 11.08% | 37.69 | 1658.54 |
| FP 1500E, 1G | 10.78% | 37.69 | 1658.54 |
| FP 300E, 2G | 12.52% | 24.06 | 1058.75 |
| FP 300E, 4G | 13.48% | 17.25 | 758.86 |
| 8 bit, 1G Per-Channel scaling| 10.98% | 18.58 | 414.63 |
| 8 bit, 1G Per-Tensor scaling | 11.03% | 18.58 | 414.63 |
| 4 bit, 1G Per-Channel scaling| 12.00% | 9.44 | 104.18 |

| <img src="https://xilinx.github.io/finn/img/WERMB.png" alt="WERvsMB" title="WERvsMB" width="500" height="300" align="center"/> | <img src="https://xilinx.github.io/finn/img/WERNops.png" alt="WERvsMACs" title="WERvsMACs" width="500" height="300" align="center"/>|
| :---:|:---:|
| *Fig. 3a Memory footprint over WER on LibriSpeech dev-other* | *Fig. 3b Number of MACs Operations over WER on LibriSpeech dev-other* |

In evaluating the memory footprint, we assume half-precision (16 bit) floating-point (FP) numbers for the original QuartzNet.
As we can see in Fig. 3a, the quantized implementations achieve accuracy comparable to the corresponding floating-point version,
while greatly reducing memory occupation. In the graph, the term <em>E</em> stands for epochs, while <em>G</em> stands for groups, referring
to the number of groups used in the grouped convolutions.
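The footprint numbers can be sanity-checked with a simple parameters-times-bit-width estimate (our own approximation; the exact table values also depend on the precise parameter count and, for 4 bit, on the first and last layers being kept at higher precision):

```python
def memory_mib(num_params, bit_width):
    # bits -> bytes -> mebibytes
    return num_params * bit_width / 8 / 2**20

params = 19.9e6  # rounded parameter count of QuartzNet 15x5

print(memory_mib(params, 16))  # ~37.96, close to the table's 37.69
print(memory_mib(params, 8))   # ~18.98, close to the table's 18.58
print(memory_mib(params, 4))   # ~9.49,  close to the table's 9.44
```

The estimate tracks the table well: halving the bit width halves the footprint, which is exactly the pattern visible across the 16, 8, and 4 bit rows.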
In our 4 bit implementation, the first and last layers are kept at 8 bit, but this is taken into account both in the computation
of the memory occupation and of the number of operations.
Notice how the 4 bit version greatly reduces the memory footprint of the network compared to the grouped-convolution variants, while still achieving better accuracy.


For comparing accuracy against the number of multiply-accumulate (MAC) operations, we treat 16 bit floating-point multiplications as 16 bit integer multiplications.
This means that we are greatly underestimating the complexity of the operations performed in the original floating-point QuartzNet model.
Assuming an n<sup>2</sup> growth in the cost of integer multiplication, we consider a 4 bit MAC 16x less expensive than a 16 bit one.
The number of MACs in Fig. 3b is normalized with respect to 16 bit.
Here too, the quantized versions greatly reduce the number of operations required,
with little to no degradation in accuracy. In particular, the 8 bit versions already achieve a better WER and fewer MACs
than the grouped-convolution variants, and the 4 bit version confirms this trend, with only a small degradation in WER.
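The normalization above can be sketched as a quadratic cost model (our own reading of the numbers; the 4 bit table entry is slightly higher than the naive estimate because the first and last layers stay at 8 bit):

```python
def normalized_macs(macs_16bit, bit_width, ref_bits=16):
    # Quadratic cost model: a b-bit multiply is assumed to cost
    # (b / ref_bits)^2 of a ref_bits-wide multiply.
    return macs_16bit * (bit_width / ref_bits) ** 2

full_precision = 1658.54  # Mega MACs of the ungrouped model, from the table

print(normalized_macs(full_precision, 8))  # 414.635, matching the table's 414.63
print(normalized_macs(full_precision, 4))  # ~103.66 vs. 104.18 in the table
```

At 8 bit the estimate reproduces the table exactly (a 4x reduction), and at 4 bit the small gap to 104.18 Mega MACs is consistent with the mixed-precision first and last layers.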