Neural networks are used in several applications like computer vision, natural language processing, speech processing, and so on. The performance of neural networks has improved considerably over the past decade. Only a well-trained neural network can perform well on the test dataset. Many beginners face problems in the training process of the neural network.
In gradient-based learning algorithms, we use gradients to learn the weights of a neural network. Backpropagation computes them with the chain rule, so the gradient for a layer close to the input is a product of the gradient terms of every layer between it and the output. These gradients are then used to update the weights.
If the individual gradient terms are large (greater than 1 in magnitude), their product grows exponentially with the number of layers. The resulting weight updates are so large that the model becomes unstable and is unable to learn. This is called the exploding gradient problem.
Let's consider an overly simplified example. Suppose we have a 20-layer neural network in which each layer has only one neuron.
Here,
The output of the first neuron is calculated as follows:
This output will be fed as an input to the neuron in the second layer, and so on.
As we can observe, the outputs of the neurons are chained together. If we calculate the derivative of the final output with respect to the very first weight, the expression is tedious and hard to keep track of, because it contains a term from every layer in between.
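The chain can be written out explicitly. The notation below is an assumption made for illustration: x is the input, w_i and b_i are the weight and bias of the single neuron in layer i, sigma is the activation function, and a_i is the output of layer i.

```latex
\begin{aligned}
a_0 &= x, \\
z_i &= w_i \, a_{i-1} + b_i, \qquad a_i = \sigma(z_i), \qquad i = 1, \dots, 20, \\
\frac{\partial a_{20}}{\partial w_1}
  &= \left( \prod_{i=2}^{20} w_i \, \sigma'(z_i) \right) \sigma'(z_1) \, x .
\end{aligned}
```

Under the further assumption that every weight has the same value w and the activation is the identity (so every derivative of sigma equals 1), the expression reduces to w^19 times x, which grows exponentially with depth whenever |w| > 1.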
When the gradient with respect to a weight is large, the update to that weight is also large. For the sake of understanding, let's say that the value of all the weights is the same (
This is a huge value. Imagine how it would scale for a 50-layer network.
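To get a feel for the scaling, here is a minimal sketch of the one-neuron-per-layer example. It assumes identity activations, so each layer after the first contributes one factor equal to its weight to the gradient of the first weight. The function name and the weight values 0.5, 1.0, and 1.5 are hypothetical choices for illustration, not values from the example above.

```python
def first_weight_gradient_factor(weight: float, num_layers: int) -> float:
    """Factor that multiplies the gradient of the first weight.

    Assumes one neuron per layer, identity activations, and the same
    weight in every layer, so each of the (num_layers - 1) layers after
    the first contributes one factor equal to `weight`.
    """
    factor = 1.0
    for _ in range(num_layers - 1):
        factor *= weight
    return factor


# Hypothetical shared weight values, compared at two depths.
for w in (0.5, 1.0, 1.5):
    for depth in (20, 50):
        factor = first_weight_gradient_factor(w, depth)
        print(f"weight={w}, layers={depth}: factor={factor:.3e}")
```

With a weight above 1 the factor grows exponentially with depth, with a weight of exactly 1 it stays constant, and with a weight below 1 it shrinks toward zero.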
A few signs can help identify whether the gradient is exploding:
The model is not performing well on the training data.
There are large changes in the training loss from one update to the next (unstable learning).
The loss becomes NaN.
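The last two signs can be checked automatically from a recorded list of training-loss values. The sketch below is one possible check, not a standard API: the function name and the `jump_factor` threshold of 10 are assumptions chosen for illustration and would need tuning for a real training run.

```python
import math


def exploding_gradient_signs(losses, jump_factor=10.0):
    """Return warning strings based on a sequence of training-loss values.

    `jump_factor` is a hypothetical threshold: a step where the loss grows
    by more than this factor is flagged as unstable.
    """
    warnings = []

    # Sign: the loss becomes NaN (or overflows to infinity).
    if any(math.isnan(v) or math.isinf(v) for v in losses):
        warnings.append("loss became NaN or infinite")

    # Sign: large changes in the loss between consecutive steps.
    finite = [v for v in losses if math.isfinite(v)]
    for prev, curr in zip(finite, finite[1:]):
        if prev > 0 and curr / prev > jump_factor:
            warnings.append("loss jumped sharply between steps")
            break

    return warnings


# Made-up loss history that blows up and then turns into NaN.
print(exploding_gradient_signs([2.3, 1.9, 45.0, 3.1e4, float("nan")]))
# A steadily decreasing loss raises no warnings.
print(exploding_gradient_signs([2.3, 1.9, 1.5]))
```

In a real training loop, the same idea is often applied to the gradient norm as well as the loss, since the gradients usually grow before the loss turns into NaN.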
Some of the suggested solutions to tackle the exploding gradient problem are given below:
Use
Use fewer layers
Carefully initialize weights
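The effect of depth and of weight initialization can be illustrated on the same one-neuron-per-layer chain. This is a rough sketch that assumes identity activations. The function name, the sampling ranges, and the seed are hypothetical choices for illustration, not a recommended initialization scheme.

```python
import random


def gradient_factor(num_layers, low, high, seed=0):
    """Product of the weights of all layers after the first.

    Each weight is drawn uniformly from [low, high]. With identity
    activations, this product multiplies the gradient of the first weight.
    """
    rng = random.Random(seed)
    factor = 1.0
    for _ in range(num_layers - 1):
        factor *= rng.uniform(low, high)
    return factor


wide_20 = gradient_factor(20, 1.5, 2.5)     # large initial weights, 20 layers
wide_5 = gradient_factor(5, 1.5, 2.5)       # same weight range, only 5 layers
careful_20 = gradient_factor(20, 0.9, 1.1)  # weights initialized close to 1

print(f"large weights, 20 layers:    {wide_20:.3e}")
print(f"large weights, 5 layers:     {wide_5:.3e}")
print(f"weights near 1, 20 layers:   {careful_20:.3e}")
```

Both reducing the depth and keeping the initial weights close to 1 in magnitude keep the product, and therefore the gradient, within a manageable range. In networks with many neurons per layer, the same goal is usually pursued by scaling the initial weights according to the layer width.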