...


Model Optimization and Quantization

Learn to optimize TF Lite models by applying post-training quantization.

The TF Lite converter generates lightweight TF Lite models suitable for resource-constrained mobile and edge devices. We can make TF Lite models even more compact and faster by applying optimization and quantization techniques, at the cost of a small reduction in model performance. Let's discuss the process of quantization and the model optimization techniques offered by the TF Lite framework.
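As a minimal sketch of the workflow described above, the snippet below applies post-training dynamic-range quantization with the TF Lite converter. The tiny Keras model here is only a stand-in for a real trained model; in practice, you would load your own model instead.

```python
import tensorflow as tf

# Stand-in model: in practice, this would be your trained model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(2),
])

# Convert to TF Lite, enabling the default post-training optimizations
# (which include dynamic-range quantization of the weights).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

print(f"Quantized TF Lite model size: {len(tflite_model)} bytes")
```

Setting `converter.optimizations = [tf.lite.Optimize.DEFAULT]` is what triggers quantization; without it, the converter produces a plain float model.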

Quantization

Quantization is a procedure that maps input values from a larger set to output values in a relatively smaller set. The range of input values can be infinite (continuous) or finite (using a large number of bits to store numbers). The following figure shows the quantization of a continuous information source.
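The mapping described above can be sketched numerically. The following illustrative example (the function and its parameters are assumptions, not part of TF Lite) uniformly quantizes samples in the range [-1, 1] to 2^3 = 8 representable levels:

```python
import numpy as np

def quantize(x, num_bits=3, x_min=-1.0, x_max=1.0):
    """Uniformly quantize samples to 2**num_bits levels (illustrative sketch)."""
    levels = 2 ** num_bits                  # 8 representable values for 3 bits
    step = (x_max - x_min) / (levels - 1)   # spacing between adjacent levels
    # Round each sample to the nearest level, then map back to the signal range.
    q = np.round((np.clip(x, x_min, x_max) - x_min) / step)
    return q * step + x_min

signal = np.array([-0.83, -0.31, 0.02, 0.47, 0.91])
print(quantize(signal))
```

Every output value lands on one of only eight levels, which is exactly the information loss that quantization trades for a smaller representation.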

Figure: Quantizing the information (red) by representing it in eight levels (blue)

Here, we use three bits to represent each quantized value (blue) and a total of 2^3 = 8 ...