
Deep Learning Toolbox Model Quantization Library

Quantize and compress deep learning models


Updated 22 Sep 2021

The Deep Learning Toolbox Model Quantization Library enables quantization and compression of your deep learning models. It provides instrumentation services that collect layer-level data on the weights, activations, and intermediate computations during the calibration step. Using this instrumentation data, the add-on quantizes your model and provides metrics to validate the accuracy of the quantized network.
The add-on supports an iterative workflow for tuning the quantization approach until it meets the required accuracy, and it provides heuristics for choosing an appropriate quantization strategy.
You can validate the quantized network and compare the accuracy against the single precision baseline.
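The calibrate-then-validate workflow described above can be sketched as follows. This is an illustrative sketch, not code from this page: it assumes a trained network `net` and two image datastores, `calData` and `valData`, which are placeholder names for your own calibration and validation data.

```matlab
% Create a quantizer object for the trained network, targeting a GPU
% ('FPGA' is the alternative execution environment for FPGA targets).
quantObj = dlquantizer(net, 'ExecutionEnvironment', 'GPU');

% Calibration step: instrument the network and collect layer-level
% dynamic ranges of weights, biases, and activations over calData.
calResults = calibrate(quantObj, calData);

% Validation step: quantize supported layers to INT8 and compare the
% accuracy of the quantized network against the single-precision baseline.
valResults = validate(quantObj, valData);
```

If the validated accuracy is not acceptable, you can inspect the calibration statistics, adjust which layers are quantized, and repeat the calibrate/validate loop.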
The add-on provides a quantization app that lets you analyze and visualize the instrumentation data, so you can understand the accuracy tradeoff of quantizing the weights and biases of selected layers.
The add-on supports INT8 quantization of supported layers for FPGAs and NVIDIA GPUs.
Please refer to the documentation for details.
This hardware support package is supported in R2020a and later. Quantizing a neural network for GPU targets requires the GPU Coder™ Interface for Deep Learning Libraries support package. R2020b adds support for quantizing a neural network for FPGA targets, which requires Deep Learning HDL Toolbox™.
See the Quantization Workflow Prerequisites documentation page for setup requirements.
If you have download or installation problems, please contact MathWorks Technical Support.
MATLAB Release Compatibility
Created with R2020a
Compatible with R2020a to R2021b
Platform Compatibility
Windows macOS Linux
