How Multilingual is Multilingual BERT?
Learn whether the multilingual knowledge transfer of M-BERT depends on the vocabulary overlap.
M-BERT is trained on Wikipedia text from 104 different languages and is evaluated by fine-tuning it on the XNLI dataset. But how multilingual is M-BERT, really? How is a single model able to transfer knowledge across multiple languages? To answer this, let's investigate the multilingual ability of M-BERT in more detail.
Effect of vocabulary overlap
M-BERT is trained on Wikipedia text from 104 languages using a shared WordPiece vocabulary of about 110k tokens, so many subword tokens are reused across languages. In this lesson, let's investigate whether the multilingual knowledge transfer of M-BERT depends on this vocabulary overlap.
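To get a feel for what vocabulary overlap means in practice, here is a minimal sketch (assuming the Hugging Face transformers library is installed) that tokenizes two parallel sentences with the shared M-BERT vocabulary and measures how many WordPiece tokens they have in common. The example sentences and the Jaccard-style overlap measure are illustrative, not part of the original evaluation.

# A rough estimate of WordPiece vocabulary overlap between two sentences,
# using the shared vocabulary of bert-base-multilingual-cased.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

english = "The movie was surprisingly good."
spanish = "La película fue sorprendentemente buena."

# Tokenize each sentence into WordPiece tokens from the shared vocabulary
en_tokens = set(tokenizer.tokenize(english))
es_tokens = set(tokenizer.tokenize(spanish))

# Fraction of WordPiece tokens shared between the two sentences
overlap = len(en_tokens & es_tokens) / len(en_tokens | es_tokens)
print("English tokens:", en_tokens)
print("Spanish tokens:", es_tokens)
print(f"WordPiece overlap: {overlap:.2f}")

A higher overlap score means the two languages share more subword tokens in the joint vocabulary; the question examined in this lesson is whether M-BERT's cross-lingual transfer actually depends on such overlap.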