Classifying Text with a Character-level LSTM
Let's see how to train a character-level LSTM.
We'll now train a character-level LSTM architecture for recognizing intents. We already practiced text classification with TensorFlow and Keras. Recall that LSTMs are sequential models that process one input per time step. Previously, we fed one word at each time step, as follows:
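To make that concrete, here is a minimal sketch of word-level input preparation. The utterance and the toy vocabulary below are hypothetical, not taken from the lesson's dataset:

```python
# A hypothetical utterance, split into words: one word per time step.
utterance = "show me flights to boston"
words = utterance.split()

# A toy vocabulary mapping each word to an integer id (0 is reserved for padding).
vocab = {word: idx + 1 for idx, word in enumerate(sorted(set(words)))}

# The sequence of ids that the LSTM consumes, one id per time step.
sequence = [vocab[word] for word in words]
print(sequence)  # [4, 3, 2, 5, 1]
```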
As we remarked in the "Text Classification with spaCy" lesson, LSTMs have an internal state (we can think of it as memory), so they can model the sequential dependencies in the input sequence by carrying past information in that state.
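As a quick illustration of that internal state, the sketch below runs Keras' LSTM layer with return_state=True on a random sequence; the layer sizes here are arbitrary:

```python
import tensorflow as tf

# A batch of 1 sequence with 5 time steps, each step an 8-dimensional vector.
inputs = tf.random.normal((1, 5, 8))

# return_state=True exposes the final hidden and cell states --
# the "memory" the LSTM carries from one time step to the next.
lstm = tf.keras.layers.LSTM(4, return_state=True)
output, hidden_state, cell_state = lstm(inputs)

print(hidden_state.shape)  # (1, 4)
print(cell_state.shape)    # (1, 4)
```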
Here, we'll train a character-level LSTM. As the name suggests, we'll feed utterances character by character, not word by word. Each utterance will be represented as a sequence of characters. At each time step, we'll feed one character. This is what feeding the utterance from the previous illustration looks like:
Notice that the space character is fed as an input as well, because it's part of the utterance; in character-level tasks, there is no distinction between digits, spaces, and letters.
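In code, representing an utterance as a character sequence is as simple as the sketch below (the utterance is hypothetical); note that the space shows up in the output like any other character:

```python
# Each character, including the space, becomes one time step.
utterance = "play jazz"
chars = list(utterance)
print(chars)  # ['p', 'l', 'a', 'y', ' ', 'j', 'a', 'z', 'z']
```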
Let's start building the Keras model. We'll skip the data preparation stage here and go straight to Keras' Tokenizer to create a vocabulary. Recall from "Text Classification with ...
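As a minimal sketch of this step, Tokenizer's char_level=True flag builds a character vocabulary instead of a word vocabulary. The sample utterances below are hypothetical stand-ins for the prepared dataset:

```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Hypothetical utterances standing in for the prepared dataset.
utterances = [
    "show me flights from boston to denver",
    "what is the cheapest fare to atlanta",
]

# char_level=True makes the Tokenizer index characters instead of words.
tokenizer = Tokenizer(char_level=True, lower=True)
tokenizer.fit_on_texts(utterances)

print(tokenizer.word_index)  # each character (including ' ') mapped to an integer id
print(tokenizer.texts_to_sequences(["show me"]))  # an utterance as character ids
```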