Input to the BERT model

The input to the BERT model is a question-paragraph pair. That is, we feed BERT a question along with a paragraph that contains the answer, and BERT has to extract the answer from the paragraph. So, essentially, BERT has to return the text span of the paragraph that contains the answer.
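As a rough sketch of what such a pair looks like once it is packed into a single sequence, the snippet below joins a question and a paragraph using BERT's [CLS] and [SEP] special tokens and marks which segment each token belongs to. The function name build_bert_input is hypothetical, and the whitespace split is an assumption made for readability; a real BERT pipeline would use a WordPiece tokenizer instead.

```python
def build_bert_input(question_tokens, paragraph_tokens):
    """Pack a question and a paragraph into one BERT-style token sequence.

    Hypothetical helper for illustration only; tokens are plain words here,
    not WordPiece subwords.
    """
    # Layout: [CLS] question [SEP] paragraph [SEP]
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + paragraph_tokens + ["[SEP]"]

    # Segment 0 covers [CLS], the question, and the first [SEP];
    # segment 1 covers the paragraph and the final [SEP].
    segment_ids = [0] * (len(question_tokens) + 2) + [1] * (len(paragraph_tokens) + 1)
    return tokens, segment_ids


# Simplified whitespace tokenization (an assumption, not BERT's real tokenizer).
question = "What is the immune system ?".split()
paragraph = "The immune system is a system of many biological structures".split()

tokens, segment_ids = build_bert_input(question, paragraph)
print(tokens)
print(segment_ids)
```

The segment ids are what let the model tell the question tokens apart from the paragraph tokens inside the single packed sequence.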

Example

Let's understand this with an example—consider the following question-paragraph pair:

Question: What is the immune system?

Paragraph: The immune system is a system of many biological structures and processes within an organism that protects against disease. To function properly, an immune system must detect a wide variety of agents, known as pathogens, from viruses to parasitic worms, and distinguish them from the organism's own healthy tissue.

Now, our model has to extract an answer from the paragraph; it essentially has to return the text span containing the answer. So, it should return the following:

Answer: a system of many biological structures and processes within an organism that protects against disease

Fine-tuning BERT for question-answering

To fine-tune the BERT model for this task, the model has to learn the starting and ending index of the text span containing the answer in the given paragraph. For example, take the question, 'What is the immune system?' If we count the words of the paragraph from zero and the model learns that the answer starts at index 4 (a) and ends at index 18 (disease), then we can read the answer off the paragraph as the span between those two positions:

Paragraph: The immune system is a system of many biological structures and processes within an organism that protects against disease. To function properly, an immune system must detect a wide variety of agents, known as pathogens, from viruses to parasitic worms, and distinguish them from the organism's own healthy tissue.
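To make the idea of a start and end index concrete, here is a minimal sketch that slices the answer out of the paragraph. It assumes whitespace-separated words counted from zero, under which 'a' sits at position 4 and 'disease' at position 18; indices produced by a real subword tokenizer, or measured over a sequence that also includes the question and special tokens, would differ. The helper extract_span is a hypothetical name.

```python
paragraph = (
    "The immune system is a system of many biological structures and "
    "processes within an organism that protects against disease. "
    "To function properly, an immune system must detect a wide variety of "
    "agents, known as pathogens, from viruses to parasitic worms, and "
    "distinguish them from the organism's own healthy tissue."
)

# Simplification: split on whitespace instead of using a subword tokenizer.
words = paragraph.split()


def extract_span(words, start, end):
    """Return the words from start to end (both inclusive) as one string."""
    return " ".join(words[start:end + 1])


start_index, end_index = 4, 18  # zero-based word positions of "a" and "disease."
answer = extract_span(words, start_index, end_index).rstrip(".")
print(answer)
```

The trailing period is stripped only because whitespace splitting leaves punctuation attached to the last word of the sentence.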

Finding the starting and ending index of the text span

Now, how do we find the starting and ending index of the text span containing the answer? If we get the probability of each token (word) in the paragraph being the starting token and the ending token of the answer, then we can easily extract the answer, right? Yes, but how can we achieve this? To do this, we use two vectors called the start vector $S$ ...
