Text Generation and the Magic of LSTMs

Learn how to transform textual data into suitable vector forms for training and inference.

In this section, we'll leverage our understanding of text representation to work toward building text generation models.

So far, we have built models using feedforward networks composed of different kinds and combinations of layers. These networks process one training example at a time, and each example is independent of the others. We say that the samples are independent and identically distributed, or IID. Language, or text, is a bit different.

Words change their meaning based on the context in which they are used. This means that if we were to develop and train a language generation model, we would have to ensure that the model understands the context of its input.

Recurrent neural networks (RNNs) are a class of neural networks that allow previous outputs to be used as inputs while maintaining memory in the form of hidden units. This awareness of previous inputs helps capture context and lets us handle variable-length input sequences (sentences are rarely the same length). A typical RNN is depicted in the following diagram, in both its actual and unrolled forms:

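To make the recurrence concrete, below is a minimal NumPy sketch of a single RNN layer unrolled over a toy sequence. The weight names (`W_xh`, `W_hh`), the layer sizes, and the random inputs are illustrative assumptions rather than any particular library's API; the point is simply that the same weights are reused at every time step and the hidden state carries information from earlier steps forward.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration.
hidden_size, input_size = 4, 3

# Randomly initialized weights (in practice these are learned during training).
W_xh = np.random.randn(hidden_size, input_size) * 0.1   # input-to-hidden weights
W_hh = np.random.randn(hidden_size, hidden_size) * 0.1  # hidden-to-hidden (recurrent) weights
b_h = np.zeros(hidden_size)                             # hidden bias

def rnn_step(x_t, h_prev):
    """One time step: h_t = tanh(W_xh @ x_t + W_hh @ h_prev + b_h)."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Unroll the RNN over a toy sequence of 5 input vectors.
sequence = [np.random.randn(input_size) for _ in range(5)]
h = np.zeros(hidden_size)   # initial hidden state
for x_t in sequence:
    h = rnn_step(x_t, h)    # the same weights are applied at every time step

print(h)  # the final hidden state summarizes the whole sequence
```

In a framework such as Keras, the same idea is exposed through recurrent layers like `SimpleRNN` or `LSTM`, which handle this unrolling for us.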