Fine-Tuning BERT for Downstream Tasks
Learn how to fine-tune the pre-trained BERT model for downstream tasks and how the fine-tuned model differs from the pre-trained one.
Let's learn how to fine-tune the pre-trained BERT model for downstream tasks. Note that fine-tuning implies that we are not training BERT from scratch; instead, we are using the pre-trained BERT and updating its weights according to our task.
Downstream tasks
We will learn how to fine-tune the pre-trained BERT model for the following downstream tasks:
Text classification
Natural language inference (NLI)
Named entity recognition (NER)
Question-answering
Text classification
Let's learn how to fine-tune the pre-trained BERT model for a text classification task. Say we are performing sentiment analysis, where our goal is to classify whether a sentence is positive or negative. Suppose we have a dataset containing sentences along with their labels.
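For concreteness, such a labeled dataset might look like the following toy example (the sentences and labels here are made up purely for illustration and are not part of the original text):

```python
# A toy sentiment dataset: each example is a sentence paired with a label
# (1 = positive, 0 = negative). Illustrative only.
dataset = [
    ("I love Paris", 1),
    ("The food was terrible", 0),
    ("What a wonderful day", 1),
]
```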
Preprocessing the input
Consider a sentence: 'I love Paris'. First, we tokenize the sentence, add the [CLS] token at the beginning, and add the [SEP] token at the end of the sentence. Then, we feed the tokens as input to the pre-trained BERT model and get the embeddings of all the tokens.
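As a quick sketch of this preprocessing step, here is how it might look with the Hugging Face transformers library; the library and the checkpoint name (bert-base-uncased) are assumptions for illustration, not something the text prescribes.

```python
from transformers import BertTokenizer

# Load a pre-trained WordPiece tokenizer (the checkpoint name is an assumption).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Encode the sentence; the tokenizer adds [CLS] and [SEP] automatically.
encoded = tokenizer("I love Paris")
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
# ['[CLS]', 'i', 'love', 'paris', '[SEP]']
```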
Getting the embedding of the sentence
We ignore the embeddings of all the other tokens and take only the embedding of the [CLS] token, which holds the aggregate representation of the sentence. We then feed this embedding to a classifier that predicts whether the sentence is positive or negative.
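To make this concrete, here is a minimal sketch of extracting the [CLS] embedding and passing it to a classification layer, again assuming the Hugging Face transformers library and PyTorch; the simple linear classifier head shown here is an illustrative choice, not a prescribed architecture.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# A simple classification head: hidden size (768) -> 2 sentiment classes.
classifier = torch.nn.Linear(model.config.hidden_size, 2)

inputs = tokenizer("I love Paris", return_tensors="pt")
outputs = model(**inputs)

# last_hidden_state has shape [batch, seq_len, hidden]; position 0 is [CLS].
cls_embedding = outputs.last_hidden_state[:, 0, :]

logits = classifier(cls_embedding)        # shape [1, 2]
probs = torch.softmax(logits, dim=-1)     # positive/negative probabilities
```

During fine-tuning, the classifier weights and the pre-trained BERT weights are updated together by minimizing a classification loss (typically cross-entropy) on the labeled sentences, which is exactly the "updating its weights according to our task" described above.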