...


Model Optimization and Quantization

Learn to optimize TF Lite models by applying post-training quantization.

The TF Lite converter generates lightweight TF Lite models suitable for resource-constrained mobile and edge devices. We can make TF Lite models even more compact and faster by applying optimization and quantization techniques, at the cost of a small reduction in model performance. Let's discuss the process of quantization and the model optimization techniques offered by the TF Lite framework.
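As a minimal sketch of the workflow described above, the snippet below applies post-training dynamic-range quantization with the TF Lite converter. The tiny Keras model here is only a stand-in for a real trained model; in practice, you would load your own model instead.

```python
import tensorflow as tf

# Stand-in model: in practice, this would be your trained model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(2),
])

# Convert to TF Lite, enabling the default post-training optimizations
# (which include dynamic-range quantization of the weights).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

print(f"Quantized TF Lite model size: {len(tflite_model)} bytes")
```

Setting `converter.optimizations = [tf.lite.Optimize.DEFAULT]` is what triggers quantization; without it, the converter produces a plain float model.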

Quantization

Quantization is a procedure that maps input values from a larger set to output values in a relatively smaller set. The range of input values can be infinite (continuous) or finite (using a large number of bits to store numbers). The following figure shows the quantization of a continuous information source.
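The mapping described above can be sketched numerically. The following illustrative example (the function and its parameters are assumptions, not part of TF Lite) uniformly quantizes samples in the range [-1, 1] to 2^3 = 8 representable levels:

```python
import numpy as np

def quantize(x, num_bits=3, x_min=-1.0, x_max=1.0):
    """Uniformly quantize samples to 2**num_bits levels (illustrative sketch)."""
    levels = 2 ** num_bits                  # 8 representable values for 3 bits
    step = (x_max - x_min) / (levels - 1)   # spacing between adjacent levels
    # Round each sample to the nearest level, then map back to the signal range.
    q = np.round((np.clip(x, x_min, x_max) - x_min) / step)
    return q * step + x_min

signal = np.array([-0.83, -0.31, 0.02, 0.47, 0.91])
print(quantize(signal))
```

Every output value lands on one of only eight levels, which is exactly the information loss that quantization trades for a smaller representation.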

Figure: Quantizing the information (red) by representing it in eight levels (blue)

Here, we use three bits to represent each quantized value (blue) and a total of 2^3 = 8 ...