M-BERT Generalization

Learn how the M-BERT model generalizes across different scripts and typological features.

Generalization across scripts

Let's investigate whether M-BERT can generalize across languages that are written in different scripts, using a small experiment. Say we are performing a part-of-speech (POS) tagging task. First, we fine-tune M-BERT for POS tagging in Urdu. Then, we evaluate the fine-tuned model directly on a different language, say Hindi. Note that Urdu is written in the Arabic script, while Hindi is written in the Devanagari script. A simple example of Urdu and Hindi text is given here:
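As a small illustration of the script difference, the following Python sketch takes the word for "book" in each language (کتاب in Urdu, किताब in Hindi) and checks which script its characters belong to. The word choice and the helper name scripts_used are assumptions made for this illustration. The script is read from the first word of each character's Unicode name, a simple heuristic that works for these two scripts but is not a general script detector.

```python
import unicodedata

def scripts_used(text):
    """Return the set of script names found in the letters and marks of `text`.

    Heuristic: a Unicode character name usually starts with its script,
    e.g. "DEVANAGARI LETTER KA" or "ARABIC LETTER BEH".
    """
    scripts = set()
    for ch in text:
        # Keep letters (L*) and combining marks (M*); skip spaces, digits, punctuation.
        if unicodedata.category(ch)[0] not in ("L", "M"):
            continue
        scripts.add(unicodedata.name(ch).split()[0])
    return scripts

# Illustrative words (assumed for this sketch): "book" in Urdu and in Hindi.
urdu_word = "کتاب"
hindi_word = "किताब"

print(scripts_used(urdu_word))                             # {'ARABIC'}
print(scripts_used(hindi_word))                            # {'DEVANAGARI'}
print(scripts_used(urdu_word) & scripts_used(hindi_word))  # set()
```

The two words are pronounced almost identically, yet they share no characters. Any transfer from Urdu to Hindi in the experiment therefore cannot rely on matching surface tokens between the two languages.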
