Gated RNN and NLP

Learn about gated RNNs and NLP, along with their implementations.

As outlined earlier, the basic recurrent network has the form of a memory that takes earlier states into account. However, the influence of these states fades exponentially when the weight values are smaller than 1, which they have to be because otherwise the recurrent influence would exponentially overwhelm the input. The basic RNN is hence a form of short-term memory. Such a short-term memory is not sufficient for many applications. For example, in natural language processing, it is necessary to take context into account that might be remote relative to the word at the current sequence position. Hence, some memories should only kick in at an appropriate time, which itself might be triggered by another word. It is thus important to gate some of these memories until they are useful at a later stage of processing.
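To see how quickly this influence vanishes, the following sketch (an illustrative assumption, not code from the lesson) tracks the contribution of an input seen at time step 0 as it is carried forward through a recurrent weight w < 1:

```python
# Minimal illustration: how an early input fades in a plain RNN when the
# recurrent weight is smaller than 1.
w = 0.5          # recurrent weight (assumed value for illustration)
influence = 1.0  # contribution of an input presented at t = 0

for t in range(1, 11):
    influence *= w  # only the recurrent path carries the old signal forward
    print(f"t = {t:2d}: remaining influence = {influence:.4f}")

# After 10 steps the signal has decayed to w**10 ≈ 0.001, which is why the
# basic RNN acts only as a short-term memory.
```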

Long short-term memory (LSTM)

The first network to take longer-term memory into account is the LSTM, which stands for long short-term memory. This network is illustrated in the figure below.

This gated network introduces another cell state c(t), which represents an intrinsic memory state. Its value is forwarded to the next time step and can be modified in each time step with two separate operations, a forget gate f_t and an input (write) gate i_t. Hence, we have the following:
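In the standard LSTM formulation, the cell state is updated by combining the forgetting of the old memory with the writing of a new candidate value:

$$
c(t) = f_t \odot c(t-1) + i_t \odot \tilde{c}(t)
$$

where $\tilde{c}(t)$ is a candidate memory value computed from the current input and the previous hidden state, and $\odot$ denotes element-wise multiplication.

As a minimal sketch of one LSTM time step (the function and variable names, such as lstm_step, and the parameter layout are illustrative assumptions rather than the course's reference implementation; the standard output gate o_t is included even though it has not been introduced yet):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (minimal sketch with an assumed parameter layout).

    x      : input vector at time t, shape (n_in,)
    h_prev : previous hidden state,  shape (n_hid,)
    c_prev : previous cell state,    shape (n_hid,)
    W      : stacked gate weights,   shape (4 * n_hid, n_in + n_hid)
    b      : stacked gate biases,    shape (4 * n_hid,)
    """
    n_hid = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b        # all four gate pre-activations
    f = sigmoid(z[0 * n_hid:1 * n_hid])            # forget gate f_t
    i = sigmoid(z[1 * n_hid:2 * n_hid])            # input (write) gate i_t
    o = sigmoid(z[2 * n_hid:3 * n_hid])            # output gate o_t
    c_tilde = np.tanh(z[3 * n_hid:4 * n_hid])      # candidate cell value
    c = f * c_prev + i * c_tilde                   # gated cell-state update
    h = o * np.tanh(c)                             # new hidden state
    return h, c

# Usage example with random parameters (hypothetical sizes)
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):               # a short input sequence
    h, c = lstm_step(x, h, c, W, b)
print("final hidden state:", h)
```

Because the forget gate can stay close to 1 for many steps, the cell state can preserve information far longer than the plain recurrent weight would allow, which is exactly the gating behavior motivated above.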
