Distillation of Embedding and Prediction Layer
Learn about the distillation of the embedding and prediction layers of TinyBERT.
Embedding layer distillation
In embedding layer distillation, we transfer knowledge from the embedding layer of the teacher to the embedding layer of the student.
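As a rough illustration of embedding layer distillation, the TinyBERT paper measures the mean squared error between the teacher's embeddings and a linear projection of the student's embeddings. The projection is needed because the student's embedding dimension is usually smaller than the teacher's. The sketch below is a minimal NumPy version under stated assumptions: the function name `embedding_distillation_loss`, the variable names, the random inputs, and the sizes (312 for the student, 768 for the teacher) are illustrative choices, not taken from this lesson. In real training, the projection matrix would be learned along with the student.

```python
import numpy as np


def embedding_distillation_loss(student_emb, teacher_emb, projection):
    """Mean squared error between projected student and teacher embeddings.

    student_emb: (seq_len, d_student) student embedding output
    teacher_emb: (seq_len, d_teacher) teacher embedding output
    projection:  (d_student, d_teacher) matrix mapping the student's
                 embedding space into the teacher's (assumed learnable)
    """
    # Project the student embeddings so both matrices share one shape.
    projected = student_emb @ projection
    # Average the squared differences over every element.
    return float(np.mean((projected - teacher_emb) ** 2))


# Illustrative sizes and random data only (assumptions, not real embeddings).
rng = np.random.default_rng(0)
seq_len, d_student, d_teacher = 4, 312, 768

student_emb = rng.normal(size=(seq_len, d_student))
teacher_emb = rng.normal(size=(seq_len, d_teacher))
projection = rng.normal(size=(d_student, d_teacher)) * 0.01

loss = embedding_distillation_loss(student_emb, teacher_emb, projection)
print(f"embedding distillation loss: {loss:.4f}")
```

Minimizing this loss with respect to the student's parameters and the projection matrix pulls the student's embeddings toward the teacher's. The loss is zero only when the projected student embeddings match the teacher's exactly.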