Introduction: Machine Translation with the Transformer

Learn what to expect when performing machine translation using transformers and the steps required to do so.


Humans easily master sequence transduction, transferring one representation into another form. We can easily imagine a mental representation of a sequence. If somebody says, ‘The flowers in my garden are beautiful,’ we can visualize a garden with flowers in it, even if we have never seen that garden. We might even imagine chirping birds and the scent of flowers.

A machine must learn transduction from scratch using numerical representations. Recurrent and convolutional approaches have produced interesting results but have not reached significant BLEU translation evaluation scores. Translation requires transposing the representation of language A into language B.
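BLEU compares the n-grams of a candidate translation with one or more human reference translations. As a rough illustration of what such a score measures, here is a minimal sketch using NLTK's sentence_bleu; the NLTK dependency and the example sentences are assumptions made for this sketch, not part of the course material.

```python
# A minimal sketch of a sentence-level BLEU computation, assuming NLTK is installed.
# The reference and candidate sentences are illustrative, not from the course dataset.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "flowers", "in", "my", "garden", "are", "beautiful"]]
candidate = ["the", "flowers", "in", "the", "garden", "are", "beautiful"]

# Smoothing avoids a zero score when a higher-order n-gram has no match.
smoothie = SmoothingFunction().method1
score = sentence_bleu(reference, candidate, smoothing_function=smoothie)
print(f"BLEU: {score:.4f}")  # ranges from 0 (no overlap) to 1 (exact match)
```

A score of 1 means the candidate matches a reference exactly; in practice, translation systems are compared with corpus-level BLEU on held-out test sets.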

The transformer model’s self-attention innovation increases the analytic ability of machine intelligence: a sequence in language A is adequately represented before the model attempts to translate it into language B. Self-attention provides the level of analysis a machine needs to obtain better BLEU scores.

The seminal Attention Is All You Need transformer obtained the best results for English-German and English-French translations in 2017. Since then, the scores have been improved by other transformers.

Chapter overview


At this point in the course, we have covered the essential aspects of transformers: the architecture of the transformer, training a RoBERTa model from scratch, fine-tuning a BERT model, evaluating a fine-tuned BERT model, and exploring downstream tasks with several transformer examples.

In this chapter, we will go over machine translation. We will first define what machine translation is, then preprocess a Workshop on Machine Translation (WMT) dataset, and finally see how to implement machine translation.
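To give a concrete preview of the preprocessing step, the sketch below loads a WMT corpus with the Hugging Face datasets library and prints one sentence pair. The library, the wmt14 French-English configuration, and the validation split are assumptions for illustration; the chapter walks through its own dataset and preprocessing steps.

```python
# A rough preview, assuming the Hugging Face `datasets` library is installed.
# The wmt14 French-English configuration is one publicly hosted WMT corpus;
# the chapter's own preprocessing may use a different file and pipeline.
from datasets import load_dataset

# Load the (smaller) validation split rather than the full training corpus.
dataset = load_dataset("wmt14", "fr-en", split="validation")

pair = dataset[0]["translation"]
print("EN:", pair["en"])  # English side of the sentence pair
print("FR:", pair["fr"])  # French side of the sentence pair
```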
