FinBERT for Finnish
Learn about the FinBERT model, its configurations, and its performance.
FinBERT is a pre-trained BERT model for the Finnish language that outperforms M-BERT on many downstream Finnish NLP tasks. We learned that M-BERT is trained on Wikipedia text from 104 languages, of which only about 3% is Finnish. FinBERT, in contrast, is trained on Finnish text from news articles, online discussions, and an internet crawl. Its vocabulary consists of about 50K WordPiece tokens and therefore covers far more Finnish words than M-BERT's multilingual vocabulary, which is a key reason FinBERT performs better than M-BERT on Finnish text.
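To make the vocabulary difference concrete, here is a minimal sketch that tokenizes the same Finnish word with both tokenizers. The checkpoint names are assumed to be the public Hugging Face Hub identifiers for FinBERT (published by the TurkuNLP group) and M-BERT:

```python
from transformers import AutoTokenizer

# Assumed Hub identifiers for the two checkpoints.
finbert_tok = AutoTokenizer.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")
mbert_tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

word = "käsittelemättömällä"  # an inflected Finnish word, chosen for illustration

# M-BERT's multilingual vocabulary typically splits a Finnish word into many
# subword pieces, while FinBERT's Finnish-specific 50K vocabulary usually
# needs far fewer pieces to represent the same word.
print("M-BERT :", mbert_tok.tokenize(word))
print("FinBERT:", finbert_tok.tokenize(word))
```

Fewer subword pieces per word means FinBERT's input sequences preserve more word-level structure, which generally helps on downstream Finnish tasks.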
Configurations of FinBERT
FinBERT's architecture is the same as that of the BERT-base model, and it is available in two configurations:

- FinBERT-cased, for cased text
- FinBERT-uncased, for uncased text
FinBERT is pre-trained on the masked language modeling (MLM) and next sentence prediction (NSP) tasks with a whole word masking (WWM) scheme. We can download the pre-trained FinBERT model from GitHub, or load it directly with the transformers library, as shown here:
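The following is a minimal sketch; it assumes the checkpoint names TurkuNLP/bert-base-finnish-cased-v1 and TurkuNLP/bert-base-finnish-uncased-v1, the identifiers published by the TurkuNLP group on the Hugging Face Hub:

```python
from transformers import AutoModel, AutoTokenizer

# Load the cased FinBERT checkpoint (assumed Hub identifier); swap in
# "TurkuNLP/bert-base-finnish-uncased-v1" for the uncased configuration.
tokenizer = AutoTokenizer.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")
model = AutoModel.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")

# Encode a Finnish sentence and compute its contextual representations.
inputs = tokenizer("Helsinki on Suomen pääkaupunki.", return_tensors="pt")
outputs = model(**inputs)

# One 768-dimensional vector per token: [batch_size, sequence_length, 768]
print(outputs.last_hidden_state.shape)
```

From here, the model can be fine-tuned for downstream Finnish NLP tasks in the same way as any other BERT checkpoint.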