What is Leaky ReLU?

Leaky ReLU is an activation function used in artificial neural networks to introduce nonlinearity into the outputs passed between layers. It was created to solve the dying ReLU problem, in which neurons using the standard ReLU function output zero for all inputs and stop learning during training.

Mathematical function

Below is the mathematical representation of the Leaky ReLU function:

f(x) = max(αx, x)

In the equation, we see that for every input x, the Leaky ReLU function returns the maximum of x and αx, where α is a small positive constant.

Using this function, negative values are scaled to small values close to 0 but not exactly 0, solving the dying ReLU issue that arises when the standard ReLU function is used during neural network training.

Note: If we set the value of α to 0, this function acts exactly like the standard ReLU function.
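To make the formula concrete, below is a minimal sketch of the function in Python using NumPy. The function name leaky_relu and the default value alpha = 0.01 are illustrative assumptions, not values prescribed by the text above.

import numpy as np

def leaky_relu(x, alpha=0.01):
    # Returns x for positive inputs and alpha * x for negative inputs,
    # which equals max(alpha * x, x) as long as 0 < alpha < 1.
    return np.maximum(alpha * x, x)

print(leaky_relu(np.array([-3.0, -0.5, 0.0, 2.0])))
# Negative inputs are scaled by alpha; zero and positive inputs pass through unchanged.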

Function working

Now let's look at a pictorial example of how the function maps the outputs of a neural network layer.

Leaky ReLU mathematical working

In the diagram, we see that the function receives a list of values that can be either positive or negative. Each negative value is multiplied by the constant α, which brings it close to zero, while the positive values pass through unchanged.
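As a rough illustration of this mapping, the sketch below applies the same leaky_relu function to a hypothetical list of layer outputs; the specific numbers are made up for demonstration.

import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.maximum(alpha * x, x)

# Hypothetical pre-activation outputs of a layer (a mix of positive and negative values)
layer_outputs = np.array([4.0, -7.0, 0.5, -2.5, 9.0])

for raw, out in zip(layer_outputs, leaky_relu(layer_outputs)):
    print(f"{raw:6.2f} -> {out:8.3f}")
# Negative values shrink toward zero (scaled by alpha); positive values are unchanged.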

Graphical representation

Now that we have looked at the mathematical notation for the Leaky ReLU function, let's look at the function's graph to understand it further.

Graph of the Leaky ReLU function

Suppose we use a small constant value for α. In that case, all negative values on the x-axis, which represents the input to the function, get mapped to small values close to zero, while the positive values remain unchanged.

Note: The value of the constant (α) is determined before training, i.e., it is not learned during training.
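To reproduce a graph like the one described above, the following sketch plots Leaky ReLU alongside standard ReLU. It assumes Matplotlib is available, and alpha = 0.01 is again just an illustrative choice.

import numpy as np
import matplotlib.pyplot as plt

def leaky_relu(x, alpha=0.01):
    return np.maximum(alpha * x, x)

x = np.linspace(-10, 10, 400)

plt.plot(x, leaky_relu(x), label="Leaky ReLU (alpha = 0.01)")
plt.plot(x, np.maximum(0, x), linestyle="--", label="Standard ReLU")
plt.axhline(0, color="gray", linewidth=0.5)
plt.axvline(0, color="gray", linewidth=0.5)
plt.xlabel("Input x")
plt.ylabel("Output")
plt.legend()
plt.show()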

Conclusion

The Leaky ReLU is a popular activation function that addresses the limitations of the standard ReLU function in deep neural networks. By introducing a small slope for negative inputs, it helps neural networks maintain better gradient and information flow both during training and after.

