...

Implementing the Language Model

Learn how to define the layers and model in the implementation.

First, we’ll discuss the hyperparameters that are used for the LSTM and their effects.

Thereafter, we’ll cover the parameters (weights and biases) required to implement the LSTM and see how they are used to define the operations taking place within it. Next, we’ll look at how data is fed to the LSTM sequentially and how the model is trained. Finally, we’ll investigate how the learned model can be used to output predictions, which are essentially bigrams that will eventually add up to a meaningful story.

Defining the TextVectorization layer

We have already discussed the TextVectorization layer, and we’ll be using the same text vectorization mechanism to tokenize text here. In summary, the TextVectorization layer provides us with a convenient way to integrate text tokenization (i.e., converting strings into a list of tokens that are represented by integer IDs) into the model as a layer.

Here, we’ll define a TextVectorization layer to convert the sequences of n-grams to sequences of integer IDs:

import tensorflow as tf
import tensorflow.keras.layers as layers
import tensorflow.keras.models as models

# The vectorization layer that will convert string bigrams to IDs
text_vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=n_vocab, standardize=None,
    split=None, input_shape=(window_size,)
)

Note that we are defining several important arguments: max_tokens (the size of the vocabulary), standardize=None so that no text preprocessing is performed, split=None so that no splitting is performed (each n-gram is already a token), and finally the input_shape argument to inform the layer that the input will be a batch of sequences of n-grams. With that, we have to train the text vectorization layer to recognize the available n-grams and map them to unique IDs. We can do this by simply passing our training tf.data pipeline to the layer’s adapt() method so that it can learn the n-grams.

# Learn the n-gram vocabulary from the training pipeline. If adapt() complains
# about the (inputs, targets) tuples in the pipeline, pass only the inputs
# instead, e.g., text_vectorizer.adapt(train_ds.map(lambda x, y: x))
text_vectorizer.adapt(train_ds)

Next, let’s print the words in the vocabulary to see what this layer has learned:

text_vectorizer.get_vocabulary()[:10]

This will output:

['', '[UNK]', 'e ', 'he', ' t', 'th', 'd ', ' a', ', ', ' h']
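To see what the adapted layer does to its inputs, here is a small illustrative example (not part of the original code) that passes a short batch of bigrams, taken from the vocabulary printed above, through the layer. The sequence length here is just for illustration and need not match window_size, and the exact IDs depend on the adapted vocabulary:

# Illustrative only: map a batch containing one short sequence of bigrams to IDs
sample_bigrams = tf.constant([['th', 'e ', ' t', 'he', 'd ']])
print(text_vectorizer(sample_bigrams))
# Based on the vocabulary printed above, this would produce IDs such as [[5 2 4 3 6]]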

Once the TextVectorization layer is trained, we have to modify our training, validation, and testing data pipelines slightly. Remember that our data pipelines output sequences of n-gram strings as inputs and targets. We need to convert the target sequences to sequences of n-gram IDs so that a loss can be computed. For that, we’ll simply pass the targets in the datasets through the text_vectorizer layer using the tf.data.Dataset.map() functionality:

# Convert the target n-gram strings to integer IDs; the inputs remain strings
# (the test pipeline can be mapped in the same way when it is used)
train_ds = train_ds.map(lambda x, y: (x, text_vectorizer(y)))
valid_ds = valid_ds.map(lambda x, y: (x, text_vectorizer(y)))
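As an optional sanity check (a quick sketch, not part of the original pipeline), we can peek at a single batch to confirm that the inputs are still strings while the targets are now integer IDs; the exact shapes depend on how the pipeline was batched:

# Inspect one batch from the mapped training pipeline
for inputs, targets in train_ds.take(1):
    print(inputs.shape, inputs.dtype)    # e.g., (batch_size, window_size), tf.string
    print(targets.shape, targets.dtype)  # e.g., (batch_size, window_size), tf.int64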

Next, we’ll look at the LSTM-based ...