What is ELU?

Exponential Linear Unit (ELU) is an activation function that improves a model's accuracy and reduces the training time. It is mathematically represented as follows:

ELU(x) = x,               if x > 0
ELU(x) = α(exp(x) − 1),   if x ≤ 0

In the formula above, α is usually set to 1.0. It determines the saturation level for negative inputs.
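
As a quick numeric check of the two cases (a minimal sketch, with α assumed to be 1.0 and a hypothetical elu helper):

import numpy as np

alpha = 1.0

def elu(x):
    return x if x > 0 else alpha * (np.exp(x) - 1)

print(elu(2.0))    # 2.0: positive inputs pass through unchanged
print(elu(-2.0))   # about -0.86: negative inputs shrink toward -alpha
print(elu(-10.0))  # about -1.0: the output saturates at -alpha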

Need for ELU

The activation function ReLU became popular because it addresses the vanishing gradient problem, in which the gradients of saturating activation functions, such as the sigmoid, become very small, making it difficult to train deeper models.

At the same time, however, ReLU created a problem of its own, called the dying ReLU problem. It occurs when a unit receives negative inputs for every example: ReLU then outputs 0, and because its gradient is also 0 for negative inputs, the unit's weights stop updating.
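
The short sketch below (an illustration, not part of the original lesson) shows why such a unit stops learning: for negative pre-activations, ReLU's output and its gradient are both 0, so no gradient reaches the unit's weights.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Subgradient of ReLU: 1 for positive inputs, 0 otherwise
    return (x > 0).astype(float)

# A unit whose pre-activations are negative for every example in the batch
pre_activations = np.array([-3.2, -0.7, -1.5, -4.1])

print(relu(pre_activations))       # [0. 0. 0. 0.] -> the unit always outputs 0
print(relu_grad(pre_activations))  # [0. 0. 0. 0.] -> no gradient, so its weights never update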

In contrast, ELU (like batch normalization) produces negative output values that help push the mean activation closer to 0. This improves the training speed.
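
One rough way to see this effect is to compare mean activations on synthetic, zero-mean inputs (an illustrative sketch; the data and numbers below are assumptions, not from the lesson):

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)   # synthetic zero-mean pre-activations

relu_out = np.maximum(0.0, x)
elu_out = np.where(x > 0, x, 1.0 * (np.exp(x) - 1))

print(relu_out.mean())  # roughly 0.4: ReLU shifts the mean activation upward
print(elu_out.mean())   # roughly 0.16: ELU's negative outputs pull the mean back toward 0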

Even though Parametric ReLU and Leaky ReLU also produce negative values, they are not smooth functions. ELU is smooth for negative inputs, which makes it more robust to noise.
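
To make the smoothness point concrete, the derivatives on either side of 0 can be compared directly (a small sketch; the 0.01 slope for Leaky ReLU is just an assumed example value, and the continuity at 0 holds for α = 1.0):

import numpy as np

alpha = 1.0          # ELU constant
leaky_slope = 0.01   # assumed negative-side slope for Leaky ReLU

def elu_grad(x):
    return 1.0 if x > 0 else alpha * np.exp(x)

def leaky_relu_grad(x):
    return 1.0 if x > 0 else leaky_slope

print(elu_grad(-1e-6), elu_grad(1e-6))                 # about 1.0 and 1.0 -> derivative is continuous at 0
print(leaky_relu_grad(-1e-6), leaky_relu_grad(1e-6))   # 0.01 and 1.0 -> the derivative jumps at 0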

Code

Here, we implement ELU in Python:

import numpy as np
import matplotlib.pyplot as plt

# initializing the constant
α = 1.0

def ELU(x):
    if x > 0:
        return x                    # positive inputs pass through unchanged
    return α * (np.exp(x) - 1)      # negative inputs follow the exponential branch

x = np.linspace(-5.0, 5.0)          # 50 evenly spaced inputs between -5.0 and 5.0
result = []
for i in x:
    result.append(ELU(i))

plt.plot(x, result)
plt.title("ELU activation function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.savefig('output/elu_plot.png')

Code explanation

  • Lines 9–10: We implement the equation mentioned above.
  • Line 12: We use np.linspace to generate evenly spaced numbers between -5.0 and 5.0. By default, it generates a total of 50 numbers.
  • Lines 17–22: We use the matplotlib library to plot the output of ELU over the given range.
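
As a follow-up, the same function can be written in vectorized form so that it operates on a whole NumPy array without a Python loop (a sketch, not part of the original lesson; elu_vectorized is a name chosen here for illustration):

import numpy as np

def elu_vectorized(x, alpha=1.0):
    # np.where keeps x element-wise where x > 0 and uses the exponential branch elsewhere
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.linspace(-5.0, 5.0)
result = elu_vectorized(x)   # same values as the loop above, computed in one call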
