Distillation: The BYOL Algorithm
Learn about self-supervised learning via distillation and get an overview of the BYOL algorithm.
Distillation as similarity maximization
As shown in the figure below, distillation generally refers to transferring knowledge from a fixed (usually large) model known as the teacher
Distillation methods can also be seen as similarity maximization–based methods. Just like contrastive learning and clustering, distillation aims to prevent trivial solutions to
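
To make the similarity-maximization view concrete, here is a minimal sketch in plain Python. It assumes the student and teacher each produce an output vector for the same input, and that the student is trained to reduce a loss that shrinks as the two outputs align. The function names (`l2_normalize`, `similarity_loss`) and the example vectors are hypothetical, and the `2 - 2 * cosine` form is one common choice for such a loss, not necessarily the exact objective used later in this lesson.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity_loss(student_out, teacher_out):
    """Return 2 - 2 * cosine similarity between the two outputs.

    This equals the squared Euclidean distance between the L2-normalized
    vectors, so minimizing it maximizes their cosine similarity.
    """
    s = l2_normalize(student_out)
    t = l2_normalize(teacher_out)
    cosine = sum(a * b for a, b in zip(s, t))
    return 2.0 - 2.0 * cosine


# Hypothetical outputs. The teacher output is treated as a fixed target:
# in a real training loop, no gradient would flow back into the teacher.
teacher_out = [1.0, 2.0, 3.0]
aligned_student = [2.0, 4.0, 6.0]      # same direction as the teacher
orthogonal_student = [3.0, 0.0, -1.0]  # dot product with the teacher is 0

loss_aligned = similarity_loss(aligned_student, teacher_out)
loss_orthogonal = similarity_loss(orthogonal_student, teacher_out)
print(loss_aligned, loss_orthogonal)
```

The loss is near 0 when the student output points in the same direction as the teacher output and grows as the two diverge. Note that a student and teacher that both emit the same constant vector for every input would also reach a loss of 0, which is the kind of trivial solution these methods have to guard against.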