LSTM Variants and Convolutions for Text

Learn about two popular variants of the single-layer LSTM network: stacked and bidirectional LSTMs.

RNNs are extremely useful for handling sequential datasets. Even a simple model can learn to generate text based on the patterns it picked up from the training dataset.

Over the years, there have been a number of enhancements in the way we model and use RNNs. In this section, we’ll cover two widely used variants of the single-layer LSTM network presented earlier: stacked and bidirectional LSTMs.

Stacked LSTMs

We are well aware of how the depth of a neural network helps it learn complex and abstract concepts in computer vision tasks. Along the same lines, a stacked LSTM architecture, which has multiple LSTM layers placed one after the other, has been shown to give considerable improvements. Stacked LSTMs were first presented by Graves et al. in their work “Speech Recognition with Deep Recurrent Neural Networks” (Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. 2013. arXiv. https://arxiv.org/abs/1303.5778). They highlight that depth (multiple layers of RNNs) has a greater impact on performance than the number of units per layer.
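To make the idea concrete, here is a minimal sketch of a stacked LSTM built with the Keras API. The vocabulary size, sequence length, embedding dimension, and unit counts are illustrative assumptions rather than values from this lesson; the key detail is that the lower LSTM layer passes its full sequence of hidden states (return_sequences=True) to the LSTM layer stacked on top of it.

```python
from tensorflow.keras import layers, models

# Illustrative hyperparameters (assumptions, not values from the lesson)
vocab_size = 96       # hypothetical character vocabulary size
seq_length = 100      # hypothetical length of each input sequence
embedding_dim = 64
lstm_units = 128

model = models.Sequential([
    layers.Input(shape=(seq_length,)),
    layers.Embedding(vocab_size, embedding_dim),
    # First LSTM layer: return_sequences=True emits one hidden state per
    # time step, which becomes the input sequence for the layer above it.
    layers.LSTM(lstm_units, return_sequences=True),
    # Second (stacked) LSTM layer: also returns the full sequence so the
    # model can predict the next character at every position.
    layers.LSTM(lstm_units, return_sequences=True),
    layers.Dense(vocab_size, activation="softmax"),
])

model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.summary()
```

Whether the topmost LSTM layer returns the full sequence or only its final state depends on the task: per-step text generation keeps the full sequence, while a sequence classifier would typically keep only the last state.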
