In a traditional neural network, the inputs and outputs are treated as independent of one another, which makes the architecture inefficient when dealing with sequential data. To address this, the Recurrent Neural Network (RNN) was introduced: it stores the results of previous steps in an internal memory and feeds them back into the network alongside the current input. This makes RNNs suitable for applications such as pattern detection, speech and voice recognition, natural language processing, and time series prediction.
An RNN has hidden layers that act as memory, storing the output of a layer and feeding it back into the same layer in a loop.
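This loop can be written as h_t = tanh(W_xh * x_t + W_hh * h_(t-1) + b), where h_t is the hidden state (the memory) at step t. Below is a minimal NumPy sketch of that recurrence; the dimensions, weight names, and function names are illustrative, not taken from any particular library.

```python
import numpy as np

# Minimal sketch of a single RNN cell unrolled over a sequence.
# Dimensions are illustrative: 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(4, 3))   # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(4, 4))   # hidden-to-hidden (recurrent) weights
b_h = np.zeros(4)

def rnn_forward(inputs):
    """inputs: sequence of shape (T, 3); returns all hidden states."""
    h = np.zeros(4)                 # internal memory, initially empty
    states = []
    for x_t in inputs:              # the same weights are reused at every step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

sequence = rng.normal(size=(5, 3))  # a toy sequence of 5 time steps
print(rnn_forward(sequence).shape)  # (5, 4): one hidden state per step
```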
The four commonly used types of Recurrent Neural Networks are:
The simplest type of RNN is One-to-One, which allows a single input and a single output. It has fixed input and output sizes and acts as a traditional neural network. A common One-to-One application is Image Classification.
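As a rough illustration, the One-to-One case reduces to a single feed-forward mapping with no recurrence; the sketch below uses made-up dimensions to show one fixed-size input producing one fixed-size output in a single pass.

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.normal(scale=0.1, size=(10, 3))   # e.g. 3 input features, 10 class scores
b = np.zeros(10)

def one_to_one(x):
    return W @ x + b                      # one input vector, one output vector

print(one_to_one(rng.normal(size=3)).shape)  # (10,)
```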
One-to-Many is a type of RNN that gives multiple outputs for a single input. It takes a fixed-size input and produces a sequence of outputs. Its applications can be found in Music Generation and Image Captioning.
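One way to sketch the One-to-Many pattern is to let a single input initialize the hidden state and then feed each output back in as the next input. The example below does this in NumPy; the dimensions, weights, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
W_xh = rng.normal(scale=0.1, size=(4, 4))
W_hh = rng.normal(scale=0.1, size=(4, 4))
W_hy = rng.normal(scale=0.1, size=(4, 4))

def one_to_many(x, steps=6):
    h = np.tanh(W_xh @ x)                 # a single input initializes the memory
    outputs = []
    for _ in range(steps):
        y = W_hy @ h                      # output at this step
        outputs.append(y)
        h = np.tanh(W_xh @ y + W_hh @ h)  # feed the output back as the next input
    return np.stack(outputs)

print(one_to_many(np.ones(4)).shape)      # (6, 4): a sequence from one input
```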
Many-to-One is used when a single output is required from multiple input units or a sequence of them. It takes a sequence of inputs and produces a fixed-size output. Sentiment Analysis is a common example of this type of Recurrent Neural Network.
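A rough Many-to-One sketch: the network reads the whole sequence, and only the final hidden state is turned into one output, much as a sentiment classifier produces a single score per review. All dimensions and names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
W_xh = rng.normal(scale=0.1, size=(8, 3))
W_hh = rng.normal(scale=0.1, size=(8, 8))
w_out = rng.normal(scale=0.1, size=8)

def many_to_one(inputs):
    h = np.zeros(8)
    for x_t in inputs:                     # consume the whole sequence first
        h = np.tanh(W_xh @ x_t + W_hh @ h)
    score = w_out @ h                      # single fixed-size output
    return 1.0 / (1.0 + np.exp(-score))    # squash to a probability

print(many_to_one(rng.normal(size=(10, 3))))  # one number for a 10-step sequence
```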
Many-to-Many is used to generate a sequence of output data from a sequence of input units.
This type of RNN is further divided into the following two subcategories:
1. Equal Unit Size: In this case, the number of input and output units is the same. A common application can be found in Named-Entity Recognition.
2. Unequal Unit Size: In this case, inputs and outputs have different numbers of units. Its application can be found in Machine Translation. A sketch of the equal-size case follows below.
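The sketch below illustrates the equal-unit-size case: one output is emitted for every input step, as in per-token tagging. The unequal case is typically handled with an encoder-decoder pair and is not shown. Dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
W_xh = rng.normal(scale=0.1, size=(6, 3))
W_hh = rng.normal(scale=0.1, size=(6, 6))
W_hy = rng.normal(scale=0.1, size=(5, 6))   # 5 output classes per step

def many_to_many(inputs):
    h = np.zeros(6)
    outputs = []
    for x_t in inputs:
        h = np.tanh(W_xh @ x_t + W_hh @ h)
        outputs.append(W_hy @ h)            # one output per input step
    return np.stack(outputs)

print(many_to_many(rng.normal(size=(7, 3))).shape)  # (7, 5): equal sequence lengths
```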
Some of the benefits provided by Recurrent Neural Networks are:
Processes sequential data
Can memorize and store previous results
Takes into account both the current input and previous results when computing new results
The model size remains fixed regardless of the length of the input sequence
Shares the same weights across all time steps, as illustrated in the sketch below
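As a quick check of the fixed-model-size and weight-sharing points, the sketch below counts the parameters of a simple RNN cell; the total depends only on the input and hidden sizes, never on how long the sequence is. The sizes are arbitrary.

```python
import numpy as np

input_size, hidden_size = 3, 4
W_xh = np.zeros((hidden_size, input_size))  # input-to-hidden weights
W_hh = np.zeros((hidden_size, hidden_size)) # recurrent weights, shared across steps
b_h = np.zeros(hidden_size)

n_params = W_xh.size + W_hh.size + b_h.size
print(n_params)  # 32 parameters, whether the sequence has 5 steps or 5,000
```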
Below are some of the limitations of Recurrent Neural Networks:
Computation is slow because the recurrence must be processed step by step and cannot be parallelized across time.
Unable to process a long sequence of information if using tanh or ReLU activation functions.
Cannot use future inputs in the computation of the current output.
Training is complicated.
Exploding Gradient: An exponential increase in model weights occurs due to an accumulation of large gradient errors.
Vanishing Gradient: The gradients become too small to make significant changes to the model weights. Both effects are illustrated in the sketch below.
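The gradient problems can be shown directly: back-propagating through T time steps multiplies the gradient by the recurrent weight matrix roughly T times, so its norm shrinks or grows exponentially depending on the scale of that matrix. The sketch below ignores the tanh derivative factor for simplicity, uses arbitrary values, and also shows a common mitigation for the exploding case, gradient norm clipping.

```python
import numpy as np

def gradient_norm_after(T, scale):
    """Norm of a gradient pushed back through T time steps of one RNN layer."""
    rng = np.random.default_rng(4)
    W_hh = rng.normal(size=(4, 4)) * scale
    grad = np.ones(4)
    for _ in range(T):
        grad = W_hh.T @ grad        # chain rule through one time step
    return np.linalg.norm(grad)

print(gradient_norm_after(50, 0.1))   # vanishes toward zero
print(gradient_norm_after(50, 1.0))   # explodes to a huge value

def clip(grad, max_norm=5.0):
    """Rescale the gradient so its norm never exceeds max_norm."""
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)

print(np.linalg.norm(clip(np.ones(4) * 100.0)))  # 5.0 after clipping
```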