RuBERT for Russian
Explore RuBERT, a monolingual BERT model for Russian that transfers knowledge from multilingual BERT. Understand how it is trained on Russian text with a Russian subword vocabulary, and how its embeddings are initialized from M-BERT so that it can represent Russian sentences effectively in NLP tasks.
RuBERT is a pre-trained BERT model for the Russian language. It is trained differently from the other BERT variants we have seen.
Pre-training the RuBERT model
RuBERT is trained by transferring knowledge from M-BERT. We know that M-BERT is trained on Wikipedia text from 104 languages and has a good understanding of each of them. So, instead of training the monolingual RuBERT from scratch, we transfer that knowledge from M-BERT. Before training, we initialize all the parameters of RuBERT with the parameters of the M-BERT model, except the word embeddings. ...
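The snippet below is a minimal sketch of this initialization step using the Hugging Face transformers library. It is not DeepPavlov's exact procedure: russian_vocab is a hypothetical Russian subword vocabulary, and embeddings are simply copied from M-BERT for subwords that also appear in the multilingual vocabulary, while the remaining Russian subwords keep their random initialization.

```python
from transformers import BertModel, BertConfig, BertTokenizer

# Load multilingual BERT and its tokenizer as the knowledge source.
mbert = BertModel.from_pretrained("bert-base-multilingual-cased")
mbert_tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

# Hypothetical Russian subword vocabulary (token -> id), e.g. built with a
# subword tokenizer trained on Russian text. Truncated here for illustration.
russian_vocab = {
    "[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "[MASK]": 4,
    "привет": 5, "##ствие": 6,
}

# Create a RuBERT model with the same architecture but a Russian vocab size.
config = BertConfig.from_pretrained(
    "bert-base-multilingual-cased", vocab_size=len(russian_vocab)
)
rubert = BertModel(config)

# 1) Initialize every parameter except the word embeddings from M-BERT.
mbert_state = mbert.state_dict()
rubert_state = rubert.state_dict()
for name, tensor in mbert_state.items():
    if "word_embeddings" not in name:
        rubert_state[name] = tensor

# 2) Initialize the Russian word embeddings: reuse the M-BERT embedding when
#    the subword also exists in the multilingual vocabulary; otherwise keep
#    the random initialization.
mbert_emb = mbert_state["embeddings.word_embeddings.weight"]
ru_emb = rubert_state["embeddings.word_embeddings.weight"]
for token, ru_id in russian_vocab.items():
    mbert_id = mbert_tokenizer.vocab.get(token)
    if mbert_id is not None:
        ru_emb[ru_id] = mbert_emb[mbert_id]

rubert.load_state_dict(rubert_state)
```

Starting from M-BERT's parameters in this way means the subsequent pre-training on Russian text only has to adapt the model to Russian, rather than learn everything from scratch.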