Defining the NMT Model

Learn about the encoder and the decoder for the NMT model.

In this lesson, we’ll define the model from end to end. We are going to implement an encoder-decoder-based NMT model equipped with additional techniques to boost performance. Let’s start off by converting our string tokens to IDs.

Converting tokens to IDs

Before we jump to the model, one text processing operation remains: converting the processed text tokens into numerical IDs. We’ll do this with a tf.keras.layers.Layer; specifically, the StringLookup layer, which creates a layer in our model that converts each token into a numerical ID. As the first step, let’s load the vocabulary files provided with the data. Before doing so, we’ll define the variable n_vocab to denote the size of the vocabulary for each language:

n_vocab = 25000 + 1  # 25,000 most frequent words + 1 slot for the <unk> OOV token

Each vocabulary originally contains 50,000 tokens, but we’ll keep only half of that to reduce the memory requirement: with a 50,000-token vocabulary, it’s quite easy to run out of memory due to the size of the final prediction layer we’ll build. We allow one extra token because each file also contains the special <unk> token, which denotes out-of-vocabulary (OOV) words. While cutting back on the size of the vocabulary, we have to make sure that we preserve the most common 25,000 words. Fortunately, each vocabulary file is organized so that words are ordered by their frequency of occurrence (high to low). Therefore, we just need to read the first 25,001 lines of text from the file:

import os

en_vocabulary = []
# Read the first n_vocab lines; the file lists tokens from most to least frequent
with open(os.path.join('data', 'vocab.50K.en'), 'r', encoding='utf-8') as en_file:
    for ri, row in enumerate(en_file):
        if ri >= n_vocab:
            break
        en_vocabulary.append(row.strip())

Then we do the same for the German vocabulary:

de_vocabulary = []
with open(os.path.join('data', 'vocab.50K.de'), 'r', encoding='utf-8') as de_file:
    for ri, row in enumerate(de_file):
        if ri >= n_vocab:
            break
        de_vocabulary.append(row.strip())
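
As a quick sanity check (an illustrative snippet, not part of the lesson’s pipeline), we can confirm that each list holds exactly n_vocab entries and that the special <unk> token sits at the top:

print(len(en_vocabulary), en_vocabulary[0])  # expect: 25001 <unk>
print(len(de_vocabulary), de_vocabulary[0])  # expect: 25001 <unk>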

Each of the vocabularies contains the special OOV token <unk> as its first entry. We’ll pop it out of the en_vocabulary and de_vocabulary lists because we need it separately for the next step:

en_unk_token = en_vocabulary.pop(0)
de_unk_token = de_vocabulary.pop(0)

Here’s how we can define our English token-to-ID lookup layer.
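
A minimal sketch of this layer might look as follows, assuming TensorFlow 2.6 or later (where StringLookup is available as tf.keras.layers.StringLookup); the exact arguments in the lesson’s full code may differ:

import tensorflow as tf

# A minimal sketch: a layer that maps English tokens to integer IDs.
# The popped <unk> token is supplied as oov_token; StringLookup requires
# that the OOV token not appear in the vocabulary list itself, which is
# why we removed it above.
en_lookup_layer = tf.keras.layers.StringLookup(
    vocabulary=en_vocabulary,
    oov_token=en_unk_token,
    mask_token=None,  # no ID reserved for masking in this sketch
)

# Example usage: known tokens get IDs >= 1; unknown tokens map to the
# OOV ID (0, since num_oov_indices defaults to 1 and there is no mask)
print(en_lookup_layer(tf.constant(['the', 'sometokennotinvocab'])))

Note that the layer’s total vocabulary size, including the single OOV slot, is 25,000 + 1, which is exactly why we defined n_vocab with the extra token.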