Distillation Techniques for Pre-training and Fine-tuning
Learn about performing distillation in the pre-training and fine-tuning stages.
In TinyBERT, we will use a two-stage learning framework as follows:
1. General distillation
2. Task-specific distillation
This two-stage learning framework enables distillation in both the pre-training and fine-tuning stages. Let's take a look at how each stage works in detail.
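Before going into the details, here is a minimal sketch of how the two stages fit together. The helper names (load_pretrained_bert, make_student, distill, fine_tune) are hypothetical placeholders used only to show the ordering of the stages; they are not part of TinyBERT's actual implementation or any library.

```python
# A minimal sketch of the two-stage learning framework (assumed helper names).

def load_pretrained_bert():
    """Placeholder: returns a pre-trained BERT teacher model."""
    return "bert-teacher"

def make_student():
    """Placeholder: returns a small, randomly initialized student model."""
    return "tinybert-student"

def distill(teacher, student, data):
    """Placeholder: trains the student to mimic the teacher on the given data."""
    print(f"Distilling {teacher} -> {student} on {data}")
    return student

def fine_tune(model, task_data):
    """Placeholder: fine-tunes a model on a downstream task dataset."""
    print(f"Fine-tuning {model} on {task_data}")
    return f"{model}-finetuned"

# Stage 1: general distillation (pre-training stage).
# The pre-trained BERT acts as the teacher, and the student learns from it
# on a general-domain corpus, giving us a general (not yet task-specific) student.
teacher = load_pretrained_bert()
general_student = distill(teacher, make_student(), data="general corpus")

# Stage 2: task-specific distillation (fine-tuning stage).
# The teacher is fine-tuned on the downstream task, and the general student
# from stage 1 is distilled again using task-specific data.
task_teacher = fine_tune(teacher, task_data="task dataset")
task_student = distill(task_teacher, general_student, data="task dataset")
```

In other words, stage 1 transfers general language knowledge during pre-training, and stage 2 transfers task-specific knowledge during fine-tuning; the sections that follow cover each stage in detail.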