Multilingual Sentence-BERT Model

Learn about the pre-trained multilingual models that are available and how to use them to compute the similarity between two sentences in different languages.

We learned how to make a monolingual model multilingual through knowledge distillation. Now, let's learn how to use the pre-trained multilingual models. The researchers have made their pre-trained models publicly available through the sentence-transformers library, so we can directly download a pre-trained model and use it for our task.
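As a minimal sketch of what that looks like (the model identifier used here is one of those listed in the next section), passing a model name to the SentenceTransformer constructor downloads and caches the weights on first use:

```python
# Install the library first, if needed: pip install sentence-transformers
from sentence_transformers import SentenceTransformer

# Passing a model name downloads the weights on first use
# and caches them locally for subsequent runs
model = SentenceTransformer('distiluse-base-multilingual-cased')
```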

Pre-trained multilingual models

The available pre-trained multilingual models are as follows:

  • distiluse-base-multilingual-cased: This supports Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, and Turkish.

  • xlm-r-base-en-ko-nli-ststb: This supports Korean and English.

  • xlm-r-large-en-ko-nli-ststb: This supports Korean and English.

Now, let's learn how to use these pre-trained models.

Using the multilingual model

Let's see how to compute the similarity between two sentences in different languages.
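A minimal sketch follows; the English-French sentence pair is illustrative, and the cosine similarity is computed with the library's util.cos_sim helper (exposed as util.pytorch_cos_sim in older releases). First, we import the SentenceTransformer module along with util, load the pre-trained model, encode the two sentences, and compare their embeddings:

```python
from sentence_transformers import SentenceTransformer, util

# Load the pre-trained multilingual model (downloaded on first use)
model = SentenceTransformer('distiluse-base-multilingual-cased')

# Two sentences with the same meaning in different languages
# (an illustrative English-French pair)
sentence_en = 'It is a beautiful day'
sentence_fr = "C'est une belle journée"

# Encode each sentence into a fixed-size embedding
embedding_en = model.encode(sentence_en, convert_to_tensor=True)
embedding_fr = model.encode(sentence_fr, convert_to_tensor=True)

# Cosine similarity between the two embeddings; values close to 1
# indicate that the sentences have similar meanings
similarity = util.cos_sim(embedding_en, embedding_fr)
print(similarity.item())
```

Because the model maps all supported languages into a shared vector space, semantically equivalent sentences in different languages produce embeddings that lie close together, so their cosine similarity is high.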
