LSTM Networks

You’ll learn about LSTM Networks in this lesson.

LSTM Networks

LSTM stands for Long Short-Term Memory. LSTMs are a special kind of RNN, introduced by Sepp Hochreiter and Jürgen Schmidhuber in 1997. They have proven to work very well on a wide variety of sequence problems. They overcome a key limitation of simple (vanilla) RNNs, which struggle to handle long-term dependencies in sequences. A simple RNN is made of repeating blocks, as shown in the figure below, and each block contains a single tanh layer.
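To make the "single tanh layer" concrete, here is a minimal NumPy sketch of one repeating block of a simple RNN, unrolled over a short sequence. The function names (rnn_step, run_rnn), the weight names (W_xh, W_hh, b_h), and the random initialization are assumptions made for illustration; they do not come from the lesson or from any particular library.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One repeating block: a single tanh layer over the input and previous state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

def run_rnn(xs, hidden_size, seed=0):
    """Apply the same block (same weights) at every time step of the sequence xs."""
    rng = np.random.default_rng(seed)
    input_size = xs.shape[1]
    # Hypothetical small random weights; a real model would learn these.
    W_xh = rng.normal(scale=0.1, size=(input_size, hidden_size))
    W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    b_h = np.zeros(hidden_size)

    h = np.zeros(hidden_size)  # initial hidden state
    states = []
    for x_t in xs:  # the chain: each step feeds its state to the next
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        states.append(h)
    return np.stack(states)

states = run_rnn(np.ones((5, 3)), hidden_size=4)
print(states.shape)
```

The only information passed from one step to the next is the hidden state h, which is overwritten through the tanh layer at every step. This is the part of the design that makes long-term dependencies hard to retain.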

LSTMs have the same chain-like structure, but the repeating module itself is built differently. You can see this in the diagram below.
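For comparison with the single tanh layer of a simple RNN, the following is a rough sketch of one step of an LSTM module in its commonly used textbook form, which is assumed here rather than taken from the lesson. The names lstm_step, W, and b, and the choice to pack all four layers into one weight matrix, are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step (common formulation, assumed for illustration).

    W has shape (hidden_size + input_size, 4 * hidden_size) and
    b has shape (4 * hidden_size,): four layers packed side by side.
    """
    H = h_prev.shape[0]
    z = np.concatenate([h_prev, x_t]) @ W + b

    f = sigmoid(z[0 * H:1 * H])   # forget gate: how much of the old cell state to keep
    i = sigmoid(z[1 * H:2 * H])   # input gate: how much new information to write
    g = np.tanh(z[2 * H:3 * H])   # candidate values to write into the cell state
    o = sigmoid(z[3 * H:4 * H])   # output gate: how much of the cell state to expose

    c_t = f * c_prev + i * g      # updated cell state
    h_t = o * np.tanh(c_t)        # updated hidden state
    return h_t, c_t

# Usage with hypothetical sizes and all-zero weights, so every gate evaluates to 0.5.
input_size, hidden_size = 3, 4
W = np.zeros((hidden_size + input_size, 4 * hidden_size))
b = np.zeros(4 * hidden_size)
h, c = lstm_step(np.ones(input_size), np.zeros(hidden_size), np.ones(hidden_size), W, b)
print(h, c)
```

Unlike the simple RNN block, this module carries two states forward (h and c), and the cell state c is updated additively instead of being fully rewritten at each step.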

The diagram above uses the following notation.

...