...

Classifying Text with a Character-level LSTM

Let's see how to train a character-level LSTM.

We'll now train a character-level LSTM architecture for recognizing intents. We already practiced text classification with TensorFlow and Keras. Recall that LSTMs are sequential models that process one input at each time step. Previously, we fed one word at each time step, as follows:

[Figure: Feeding one word to an LSTM at each time step]
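
For reference, the word-level setup pictured above can be sketched in Keras roughly as follows. The vocabulary size, embedding and LSTM dimensions, and number of intent classes are illustrative placeholders, not values from this lesson:

```python
# A minimal sketch of a word-level LSTM intent classifier.
# All hyperparameters below are hypothetical placeholders.
import tensorflow as tf

VOCAB_SIZE = 1000   # hypothetical vocabulary size
EMBED_DIM = 64      # hypothetical embedding dimension
NUM_INTENTS = 5     # hypothetical number of intent classes

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),
    tf.keras.layers.LSTM(64),   # consumes one word embedding per time step
    tf.keras.layers.Dense(NUM_INTENTS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```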

As we remarked in the "Text Classification with spaCy" lesson, LSTMs have an internal state (we can think of it as a memory), so they can model sequential dependencies in the input sequence by holding past information in this state.
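
To make that internal state concrete: in Keras, passing return_state=True to an LSTM layer makes it return its final hidden and cell states alongside its output. A minimal sketch with made-up input shapes:

```python
# Sketch: inspecting an LSTM's internal state in Keras.
import tensorflow as tf

inputs = tf.random.normal((1, 4, 8))  # (batch, time steps, features), made up
lstm = tf.keras.layers.LSTM(16, return_state=True)
output, hidden_state, cell_state = lstm(inputs)

# The hidden and cell states are what carry past information forward.
print(hidden_state.shape, cell_state.shape)  # (1, 16) (1, 16)
```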

Here, we'll train a character-level LSTM. As the name suggests, we'll feed the utterances character by character, not word by word: each utterance is represented as a sequence of characters, and we feed one character at each time step. This is what feeding the utterance from the previous illustration looks like:

[Figure: Feeding the first two words of the utterance "I want Italian food"]

Notice that the space character is fed as an input as well because it is part of the utterance; in character-level tasks, there is no distinction between digits, spaces, and letters.
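
A quick sketch of this representation in plain Python; note that the space appears in the character sequence just like any letter:

```python
# Splitting an utterance into the character sequence fed to the LSTM.
utterance = "I want Italian food"
chars = list(utterance)
print(chars[:7])  # ['I', ' ', 'w', 'a', 'n', 't', ' '] -- the space is a token too
```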

Let's start building the Keras model. We'll skip the data preparation stage here and start directly with Keras' Tokenizer to create a vocabulary. Recall from "Text Classification with ...
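
As a rough sketch of that vocabulary-building step, Keras' Tokenizer can be switched to character mode with char_level=True; the sample utterances here are illustrative placeholders:

```python
# Sketch: building a character vocabulary with Keras' Tokenizer.
from tensorflow.keras.preprocessing.text import Tokenizer

utterances = ["I want Italian food", "book a table for two"]  # placeholder data

tokenizer = Tokenizer(char_level=True)    # index characters instead of words
tokenizer.fit_on_texts(utterances)        # build the character vocabulary
sequences = tokenizer.texts_to_sequences(utterances)

print(tokenizer.word_index)  # character-to-index mapping (the space is included)
print(sequences[0])          # first utterance as a sequence of character indices
```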