SpanBERT: Exploring Architecture

Learn about the architecture of SpanBERT and how to perform question answering with it.

We'll cover the following...

The span boundary objective
Predicting the masked token
The masked language modeling objective
The loss function of SpanBERT model
Example: Question-answering with SpanBERT
Coding playground

In SpanBERT, we mask a contiguous span of tokens in the sentence. Let $x_s$ and $x_e$ be the first and last masked tokens, so that $s$ and $e$ are the start and end positions of the span. We feed the tokens to SpanBERT, and it returns a representation of every token. The representation of token $i$ is denoted $R_i$ . The tokens at the span boundary, just outside the masked span, have the representations $R_{s-1}$ and $R_{e+1}$ .
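To make the indexing concrete, here is a minimal sketch that masks a contiguous span in a toy sentence and picks out the two boundary positions, $s-1$ and $e+1$. The sentence, the helper name `mask_span`, and the 0-based word-level indexing are assumptions made for illustration; they are not taken from the SpanBERT code.

```python
def mask_span(tokens, start, end, mask_token="[MASK]"):
    """Mask tokens[start..end] (inclusive).

    Returns the masked token list and the indices of the two
    boundary tokens: start - 1 and end + 1.
    """
    # The span must leave at least one token on each side,
    # otherwise there is no boundary token to use.
    if not (0 < start <= end < len(tokens) - 1):
        raise ValueError("span must leave one token on each side")
    masked = list(tokens)
    for i in range(start, end + 1):
        masked[i] = mask_token
    return masked, start - 1, end + 1


# Toy sentence (assumption), split into words rather than word pieces.
tokens = ["you", "are", "expected", "to", "know",
          "the", "laws", "of", "your", "country"]

# Mask the span "the laws of your": s = 5, e = 8.
masked, left, right = mask_span(tokens, start=5, end=8)

print(masked)
print(tokens[left], tokens[right])  # boundary tokens x_{s-1} and x_{e+1}
```

Here the boundary tokens are `know` and `country`; their representations play the roles of $R_{s-1}$ and $R_{e+1}$.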

The span boundary objective

Let's first look at the span boundary objective (SBO). To predict a masked token, $x_i$ , we use three values: the representations of the two span boundary tokens ( $R_{s-1}$ and $R_{e+1}$ ) and the position embedding of the masked token ( $p_{i-s+1}$ ), which marks its position relative to the left boundary token. How exactly do we predict the masked token with these three values? First, we create a new representation, $z_i$ , by applying a function $f(⋅)$ to them:

$$z_i = f(R_{s-1}, R_{e+1}, p_{i-s+1})$$
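The function $f(⋅)$ can be sketched in code. The sketch below assumes that $f(⋅)$ is a two-layer feed-forward network with GELU activations and layer normalization, applied to the concatenation of the three inputs, which is how the SpanBERT paper describes it. The toy hidden size, the random weights, and all helper names (`sbo_representation`, `rand_vec`, and so on) are hypothetical; a trained model would use learned weights and embeddings.

```python
import math
import random


def gelu(x):
    # Exact GELU, written with the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def layer_norm(v, eps=1e-5):
    # Rescale a vector to zero mean and roughly unit variance.
    # Learned scale and shift parameters are omitted for brevity.
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def linear(weights, v):
    # Matrix-vector product; `weights` is a list of rows.
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def sbo_representation(r_left, r_right, pos_emb, w1, w2):
    # h0 = [R_{s-1}; R_{e+1}; p_{i-s+1}] (concatenation of the three inputs)
    h0 = r_left + r_right + pos_emb
    # First layer: linear map, GELU, layer normalization.
    h1 = layer_norm([gelu(x) for x in linear(w1, h0)])
    # Second layer produces z_i.
    return layer_norm([gelu(x) for x in linear(w2, h1)])


random.seed(0)
d = 4  # toy hidden size (assumption; real models use hundreds of dimensions)


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


def rand_mat(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]


r_left = rand_vec(d)      # stands in for R_{s-1}
r_right = rand_vec(d)     # stands in for R_{e+1}
p_rel = rand_vec(d)       # stands in for p_{i-s+1}
w1 = rand_mat(d, 3 * d)   # first-layer weights (random, not learned)
w2 = rand_mat(d, d)       # second-layer weights (random, not learned)

z_i = sbo_representation(r_left, r_right, p_rel, w1, w2)
print(len(z_i))
```

The result `z_i` has the same size as a single token representation, and it is built only from the boundary tokens and the relative position, not from the tokens inside the masked span.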
