Pre-Training Strategies for the BERT Model
Learn about the different pre-training strategies used to train the BERT model.
Now that we've learned how to convert the input into embeddings and how to tokenize it with the WordPiece tokenizer, let's learn how to pre-train the BERT model.
Pre-training strategies
The BERT model is pre-trained on the following two tasks:
Masked language modeling
Next sentence prediction
Let's look at each of these two pre-training strategies in turn. Before diving into the masked language modeling task, let's first understand how a language modeling task works.
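To preview the first task, here is a minimal sketch of the input corruption used in masked language modeling. It uses a toy whitespace tokenizer and vocabulary instead of a real WordPiece vocabulary, but follows BERT's published scheme: 15% of tokens are selected; of those, 80% are replaced with `[MASK]`, 10% with a random token, and 10% are left unchanged. The function name `mask_tokens` and the tiny vocabulary are illustrative assumptions, not part of any library.

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Corrupt a token sequence for masked language modeling (toy sketch)."""
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)      # prediction targets for selected positions
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:   # select ~15% of tokens
            labels[i] = tok            # the model must predict the original token
            r = rng.random()
            if r < 0.8:
                masked[i] = "[MASK]"   # 80%: replace with the [MASK] token
            elif r < 0.9:
                masked[i] = rng.choice(vocab)  # 10%: replace with a random token
            # remaining 10%: keep the original token unchanged
    return masked, labels

# Toy usage: a whitespace-split sentence and a tiny illustrative vocabulary
vocab = ["paris", "is", "a", "beautiful", "city", "love"]
tokens = "paris is a beautiful city i love paris".split()
masked, labels = mask_tokens(tokens, vocab)
```

During pre-training, the model sees `masked` as input and is trained to recover the original tokens recorded in `labels`; positions with `None` contribute nothing to the loss.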