Pre-Training the BERT Model
Learn what it means to pre-train the BERT model and how a pre-trained model's weights are reused for a new task.
In this lesson, we will learn how to pre-train the BERT model. But what does pre-training mean? Say we have a model, m. First, we train m on a huge dataset for a particular task and save the trained model. Now, for a new task, instead of initializing a new model with random weights, we initialize it with the weights of the already trained model, m (the pre-trained model). In other words, because m has already been trained on a huge dataset, we do not train a new model from scratch for the new task; we take the pre-trained model, m, and adjust (fine-tune) its weights for the new task. This is a form of transfer learning.
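To make the idea concrete, here is a minimal sketch of loading a pre-trained model and preparing it for fine-tuning on a new task. It assumes the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint; the two-label classification task is purely illustrative.

```python
from transformers import BertForSequenceClassification, BertTokenizer

# Initialize from the pre-trained model m (here, "bert-base-uncased")
# instead of starting from randomly initialized weights.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # weights already trained on a huge corpus
    num_labels=2,         # new task: binary classification (illustrative assumption)
)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Fine-tuning then adjusts these pre-trained weights with a standard
# training loop on the new task's dataset, for example:
# inputs = tokenizer("a sample sentence", return_tensors="pt")
# outputs = model(**inputs, labels=labels)
# outputs.loss.backward()  # gradients update the pre-trained weights
```

Because the heavy lifting was done during pre-training, fine-tuning typically needs far less data and compute than training a new model from scratch.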