BERTje for Dutch
Learn about BERTje and how to use it for the next sentence prediction task.
BERTje is a pre-trained monolingual BERT model for the Dutch language from the University of Groningen. The BERTje model is pre-trained using the MLM task with whole word masking (WWM) — masking all subword tokens of a word together rather than individually — along with the sentence order prediction (SOP) task.
The BERTje model is trained on several Dutch corpora, including TwNC (a Dutch news corpus), SoNaR-500 (a multi-genre reference corpus), Dutch Wikipedia text, web news, and books. The model was pre-trained for about 1 million iterations. The pre-trained BERTje model can be downloaded from GitHub, and it is also compatible with Hugging Face's transformers library, so we can use it directly with transformers.
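As a quick illustration, here is a minimal sketch of loading BERTje with the transformers library. It assumes the GroNLP/bert-base-dutch-cased checkpoint, which is the BERTje model published on the Hugging Face Hub:

```python
# A minimal sketch of loading BERTje with the transformers library.
# Assumes the Hugging Face Hub model ID 'GroNLP/bert-base-dutch-cased'.
from transformers import BertModel, BertTokenizer

model = BertModel.from_pretrained('GroNLP/bert-base-dutch-cased')
tokenizer = BertTokenizer.from_pretrained('GroNLP/bert-base-dutch-cased')

# Encode a Dutch sentence and obtain its contextual representations.
inputs = tokenizer('Ik hou van taalmodellen', return_tensors='pt')
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence length, hidden size)
```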
Next sentence prediction with BERTje
Let's now see how to use the pre-trained BERTje model for the NSP task. That is, we feed two sentences, A and B, to the model and predict whether sentence B follows sentence A.
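The following is a minimal sketch of this workflow with transformers, again assuming the GroNLP/bert-base-dutch-cased checkpoint. One caveat: because BERTje was pre-trained with SOP rather than NSP, the sentence-pair head may not behave exactly like the original BERT NSP head, so treat the output as an illustration of the API. First, let's import the necessary modules and load the model:

```python
# A minimal sketch of next sentence prediction with BERTje, assuming
# the Hugging Face Hub model ID 'GroNLP/bert-base-dutch-cased'.
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

model = BertForNextSentencePrediction.from_pretrained('GroNLP/bert-base-dutch-cased')
tokenizer = BertTokenizer.from_pretrained('GroNLP/bert-base-dutch-cased')

# Sentence B is a plausible continuation of sentence A.
sentence_A = 'Ik woon in Amsterdam.'   # "I live in Amsterdam."
sentence_B = 'Het is een mooie stad.'  # "It is a beautiful city."

# Tokenize the pair; the tokenizer adds [CLS] and [SEP] and sets token type IDs.
inputs = tokenizer(sentence_A, sentence_B, return_tensors='pt')

# Index 0 scores "B follows A" (isNext); index 1 scores "B is random" (notNext).
with torch.no_grad():
    logits = model(**inputs).logits

prediction = torch.argmax(logits, dim=1).item()
print('isNext' if prediction == 0 else 'notNext')
```

In transformers' BertForNextSentencePrediction, class 0 means sentence B is the actual continuation (isNext) and class 1 means it is a random sentence (notNext), which is why the argmax is compared against 0.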