Question-Answering
Learn how to fine-tune the pre-trained BERT model for question answering tasks.
In a question-answering task, we are given a question along with a paragraph containing an answer to the question. Our goal is to extract the answer from the paragraph for the given question. Now, let’s learn how to fine-tune the pre-trained BERT model to perform a question-answering task.
Input to the BERT model
The input to the BERT model is a question-paragraph pair. That is, we feed BERT a question together with a paragraph that contains the answer, and BERT has to extract the answer from the paragraph. So, essentially, BERT has to return the text span from the paragraph that contains the answer.
Example
Let's understand this with an example—consider the following question-paragraph pair:
Question: What is the immune system?
Paragraph: The immune system is a system of many biological structures and processes within an organism that protects against disease. To function properly, an immune system must detect a wide variety of agents, known as pathogens, from viruses to parasitic worms, and distinguish them from the organism's own healthy tissue.
Now, our model has to extract an answer from the paragraph; it essentially has to return the text span containing the answer. So, it should return the following:
Answer: a system of many biological structures and processes within an organism that protects against disease
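To make the question-paragraph input more concrete, here is a minimal sketch of how such a pair is commonly packed into a single BERT-style sequence: a [CLS] token, the question, a [SEP] token, the paragraph, and a final [SEP], with segment IDs marking which part each token belongs to. The helper name pack_pair is hypothetical, and splitting on whitespace is a simplifying assumption; an actual BERT tokenizer splits text into WordPiece tokens instead.

```python
# Simplified sketch: str.split() stands in for BERT's WordPiece tokenizer.
question = "What is the immune system?"
paragraph = (
    "The immune system is a system of many biological structures and "
    "processes within an organism that protects against disease."
)


def pack_pair(question: str, paragraph: str):
    """Build a BERT-style token list and segment IDs for a question-paragraph pair.

    Hypothetical helper for illustration only.
    """
    q_tokens = question.split()
    p_tokens = paragraph.split()
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + p_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the question, and the first [SEP];
    # segment 1 covers the paragraph and the final [SEP].
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(p_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = pack_pair(question, paragraph)
print(tokens[:8])
print(segment_ids[:8])
```

The segment IDs are what let the model tell the question apart from the paragraph, which matters here because the answer span must come from the paragraph only.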
Fine-tuning BERT for question-answering
To fine-tune the BERT model for our task, the model has to learn the starting and ending index of the text span containing the answer in the given paragraph. For example, take the question, 'What is the immune system?' If our model learns that the answer to this question starts at index 4 (a) and ends at index 18 (disease), counting the paragraph's words from 0, then we can get the answer as shown here:
Paragraph (answer span in brackets): The immune system is [a system of many biological structures and processes within an organism that protects against disease]. To function properly, an immune system must detect a wide variety of agents, known as pathogens, from viruses to parasitic worms, and distinguish them from the organism's own healthy tissue.
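The idea that a start index and an end index are enough to recover the answer can be sketched in a few lines. This is only an illustration under simplifying assumptions: it counts whitespace-separated words from 0 with surrounding punctuation stripped, and the helper names to_words and find_span are hypothetical. A real BERT tokenizer works on WordPiece tokens and also counts special tokens and the question, so its indices will generally differ.

```python
paragraph = (
    "The immune system is a system of many biological structures and "
    "processes within an organism that protects against disease. To function "
    "properly, an immune system must detect a wide variety of agents, known as "
    "pathogens, from viruses to parasitic worms, and distinguish them from the "
    "organism's own healthy tissue."
)
answer = (
    "a system of many biological structures and processes within an "
    "organism that protects against disease"
)


def to_words(text: str) -> list[str]:
    """Split on whitespace and strip commas and periods from each word."""
    return [word.strip(".,") for word in text.split()]


def find_span(paragraph_words: list[str], answer_words: list[str]):
    """Return (start, end) word indices of the first match, end inclusive."""
    n = len(answer_words)
    for start in range(len(paragraph_words) - n + 1):
        if paragraph_words[start:start + n] == answer_words:
            return start, start + n - 1
    return None


words = to_words(paragraph)
span = find_span(words, to_words(answer))
start, end = span
print("start index:", start, "end index:", end)

# Slicing with the two indices recovers the answer text.
extracted = " ".join(words[start:end + 1])
print(extracted)
```

In fine-tuning, the model is not given these indices; it has to predict them, which is why the task reduces to learning a start position and an end position.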
Finding the starting and ending index of the text span
Now, how do we find the starting and ending index of the text span containing the answer? If we know, for each token (word) in the paragraph, the probability that it is the starting token of the answer and the probability that it is the ending token, then we can easily extract the answer, right? Yes, but how can we achieve this? To do this, we use two vectors called the start vector