Pre-Training the BERT Model

Learn how to apply different embeddings to the input sentence before feeding it to BERT.

In this lesson, we will learn how to pre-train the BERT model. But what does pre-training mean? Say we have a model, m. First, we train m on a huge dataset for a particular task and save the trained model. Now, for a new task, instead of initializing a new model with random weights, we initialize it with the weights of the already trained model, m (the pre-trained model). That is, because m has already been trained on a huge dataset, we do not train a new model from scratch for the new task; we start from the pre-trained model and adjust (fine-tune) its weights for that task. This is a type of transfer learning.
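To make the idea concrete, here is a minimal sketch of the pre-train, save, and fine-tune workflow. It is not BERT: it assumes a toy one-variable linear model trained by gradient descent, and every name in it (`train`, `mse`, `pretrained`, `fine_tuned`, `scratch`, and the two toy tasks) is hypothetical and chosen only for illustration.

```python
import copy
import random


def train(weights, data, lr=0.1, epochs=20):
    """Fit y = w*x + b by gradient descent, starting from the given weights."""
    w, b = weights["w"], weights["b"]
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / n
            grad_b += 2 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}


def mse(weights, data):
    """Mean squared error of the linear model on a dataset."""
    return sum((weights["w"] * x + weights["b"] - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)

# Step 1: pre-train model m on a large dataset (toy task: y = 3x + 1),
# starting from random weights, and keep ("save") the trained weights.
big_data = [(x, 3 * x + 1) for x in (rng.uniform(-1, 1) for _ in range(1000))]
random_init = {"w": rng.uniform(-1, 1), "b": rng.uniform(-1, 1)}
pretrained = train(random_init, big_data, epochs=500)

# Step 2: a new, related task with very little data (toy task: y = 3x + 2).
small_data = [(x, 3 * x + 2) for x in (-0.5, 0.0, 0.5)]

# Fine-tuning: initialize from the pre-trained weights, then take a few steps.
fine_tuned = train(copy.deepcopy(pretrained), small_data, epochs=20)

# Baseline: initialize randomly and train with the same small budget.
scratch_init = {"w": rng.uniform(-1, 1), "b": rng.uniform(-1, 1)}
scratch = train(scratch_init, small_data, epochs=20)

print("fine-tuned error:  ", mse(fine_tuned, small_data))
print("from-scratch error:", mse(scratch, small_data))
```

In this toy setting, the fine-tuned model reaches a lower error than the randomly initialized one within the same number of steps, because the pre-trained weights already encode what the two tasks share (here, the slope). The same reasoning motivates starting from pre-trained weights for a new task rather than training from scratch.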
