Leaky ReLU is an activation function used in artificial neural networks to introduce nonlinearity between the layers of the network. It was created to address the dying ReLU problem, where neurons using the standard ReLU function can get stuck outputting zero and stop learning during training.
Below is the mathematical representation of the Leaky ReLU function:

$$
f(x) =
\begin{cases}
x & \text{if } x > 0 \\
\alpha x & \text{if } x \le 0
\end{cases}
$$

In the equation, we see that for every input $x > 0$ the function returns $x$ unchanged, while for every $x \le 0$ it returns $\alpha x$, where $\alpha$ is a small positive constant.
Using this function, negative inputs are mapped to small values that are close to 0 but not exactly 0, instead of being zeroed out. This solves the dying ReLU issue that arises when the standard ReLU function is used during neural network training.
Note: If we set the value of α to 0, this function behaves exactly like the standard ReLU function.
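As a quick illustration, here is a minimal NumPy sketch of the function (the name `leaky_relu` and the default `alpha=0.01` are illustrative choices; 0.01 is a commonly used default):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Element-wise Leaky ReLU: x if x > 0, alpha * x otherwise."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(leaky_relu(x))             # negative inputs shrink to -0.03 and -0.005
print(leaky_relu(x, alpha=0.0))  # with alpha = 0 this reduces to the standard ReLU
```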
Now, let's look at a pictorial example of how the function maps the outputs of a neural network layer.

In the diagram, the function receives a list of values that can be either positive or negative. All the negative values are multiplied by the constant α, while the positive values are passed through unchanged.
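As a rough sketch of that mapping (assuming a hypothetical dense layer with randomly initialized weights; the shapes here are arbitrary), applying Leaky ReLU to the layer's outputs leaves positive values untouched and scales negative ones by α:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# A hypothetical dense layer: z = W x + b produces a mix of positive and negative values.
W = rng.normal(size=(4, 3))
b = np.zeros(4)
x = rng.normal(size=3)
z = W @ x + b

alpha = 0.01
activated = np.where(z > 0, z, alpha * z)  # positives pass through, negatives are scaled by alpha

print("layer outputs:   ", z)
print("after Leaky ReLU:", activated)
```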
Now that we have seen the mathematical notation for the Leaky ReLU function, let's look at its graph to understand it further.
Let's suppose we use a small constant value for the variable α, for example 0.1. For positive inputs, the graph is the identity line with slope 1; for negative inputs, it is a line with the much smaller slope α, so the output dips slightly below zero instead of staying flat at zero.
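As a rough way to reproduce such a graph, the sketch below plots the function with α = 0.1 (a deliberately exaggerated slope so the negative side is easy to see):

```python
import numpy as np
import matplotlib.pyplot as plt

alpha = 0.1                        # exaggerated so the negative-side slope is visible
x = np.linspace(-5, 5, 200)
y = np.where(x > 0, x, alpha * x)

plt.plot(x, y)
plt.axhline(0, color="gray", linewidth=0.5)
plt.axvline(0, color="gray", linewidth=0.5)
plt.xlabel("x")
plt.ylabel("f(x)")
plt.title("Leaky ReLU with alpha = 0.1")
plt.show()
```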
Note: The value of the constant (α) is determined before training, i.e., it is not learned during training.
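In frameworks such as PyTorch, this shows up as the slope being a constructor argument rather than a learnable parameter. Below is a minimal sketch (the layer sizes are arbitrary):

```python
import torch.nn as nn

# The slope is passed in as a fixed hyperparameter when the layer is created.
model = nn.Sequential(
    nn.Linear(3, 4),
    nn.LeakyReLU(negative_slope=0.01),
    nn.Linear(4, 1),
)

# Only the Linear layers contribute learnable parameters; LeakyReLU adds none.
print(sum(p.numel() for p in model.parameters()))  # 3*4 + 4 + 4*1 + 1 = 21
```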
The Leaky ReLU is a popular activation function that addresses the limitations of the standard ReLU function in deep neural networks. By introducing a small negative slope for negative inputs, it helps neural networks maintain better information flow both during training and afterward.