How to implement the softmax function in Python

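The softmax function accepts a vector z of length K and is defined as $\sigma (\vec{z})_{i} = \frac{{e}^{z_{i}}}{\sum_{j=1}^{K}{e}^{z_{j}}}$. For each value in z, it applies the standard exponential function to the value, then divides the result by the sum of the exponentials of all values in z. Every output is positive and the outputs sum to 1, so they can be read as probabilities.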

Example

Consider the following vector:

z = [5, 2, 8]

First, let’s calculate the exponential of each value in z.

$e^{z_{1}} = e^{5} \approx 148.4$

$e^{z_{2}} = e^{2} \approx 7.4$

$e^{z_{3}} = e^{8} \approx 2981.0$

Next, we can calculate the sum of the exponentials:

$\sum_{j=1}^{K} e^{z_{j}} = e^{z_{1}} + e^{z_{2}} + e^{z_{3}}$

$\sum_{j=1}^{K} e^{z_{j}} \approx 148.4 + 7.4 + 2981.0 = 3136.8$
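
The two steps so far can be checked with a short Python sketch. It assumes only the standard math module; the names exponentials and exp_sum are illustrative choices, and values are rounded to one decimal place purely to compare them with the hand calculation.

```python
import math

z = [5, 2, 8]

# exponential of each entry of z (kept unrounded)
exponentials = [math.exp(value) for value in z]

# sum of the exponentials, which is the softmax denominator
exp_sum = sum(exponentials)

# round only for display, to compare with the figures in the text
print([round(e, 1) for e in exponentials])
print(round(exp_sum, 1))
```

Rounding happens only at print time, so later steps can reuse the full-precision values.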

Finally, we can calculate the softmax equivalent for each value in z, as shown below:

$\sigma (\vec{z})_{1} = \frac{148.4}{3136.8} \approx 0.0473$

$\sigma (\vec{z})_{2} = \frac{7.4}{3136.8} \approx 0.0024$

$\sigma (\vec{z})_{3} = \frac{2981.0}{3136.8} \approx 0.9503$

So, we end up with a vector of probabilities:

Softmax(z) = [0.0473, 0.0024, 0.9503]
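
As a quick check of the worked example, the sketch below recomputes the three probabilities and confirms that they sum to 1. It uses only the standard math module; the variable name probabilities is an illustrative choice.

```python
import math

z = [5, 2, 8]

# softmax denominator: sum of the exponentials of all entries
exp_sum = sum(math.exp(value) for value in z)

# each probability is one exponential divided by that sum
probabilities = [math.exp(value) / exp_sum for value in z]

# round to four decimal places to compare with the hand calculation
print([round(p, 4) for p in probabilities])

# the probabilities should sum to 1, up to floating-point error
print(math.isclose(sum(probabilities), 1.0))
```

The largest input (8) receives by far the largest probability, which matches the result above.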

Code

The code below shows how to implement the softmax function in Python:

import math
# softmax function
def softmax(z):
	# vector to hold exponential values
	exponents = []
	# vector to hold softmax probabilities
	softmax_prob = []
	# sum of exponentials
	exp_sum = 0
	# for each value in the input vector
	for value in z:
		# calculate the exponent
		exp_value = math.exp(value)
		# append to exponent vector
		exponents.append(exp_value)
		# add to exponential sum
		exp_sum += exp_value
	
	# for each exponential value
	for value in exponents:
		# calculate softmax probability
		probability = value / exp_sum
		# append to probability vector
		softmax_prob.append(probability)
	
	return softmax_prob
# define vector
z = [5, 2, 8]
# find softmax
result = softmax(z)
print(result)
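
One practical caveat: math.exp() raises an OverflowError for large inputs (for example, math.exp(1000)), so the function above fails on vectors with large values. A common workaround, sketched here as an assumption rather than as part of the original code, is to subtract the maximum of z from every value before exponentiating. This scales the numerator and denominator by the same factor, so the probabilities are unchanged. The name stable_softmax is hypothetical.

```python
import math

def stable_softmax(z):
    # hypothetical helper, not part of the original article
    # shift every value by the maximum so the largest exponent is 0
    z_max = max(z)
    exponents = [math.exp(value - z_max) for value in z]
    exp_sum = sum(exponents)
    # the common factor exp(-z_max) cancels in this ratio
    return [value / exp_sum for value in exponents]

# same probabilities as the plain version for the example vector
print(stable_softmax([5, 2, 8]))

# inputs this large would overflow math.exp() without the shift
print(stable_softmax([1000, 1001, 1002]))
```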

Explanation

In the code above:

Line 1: We import the math library.
Line 3: We define the softmax function, which accepts a vector as a parameter.
Lines 4-9: We declare three variables to store the exponential of each value, the corresponding probabilities, and the sum of all exponentials, respectively.
Lines 10-17: We use a for-loop to iterate over each value in the given vector. For each value, we calculate its exponential with the math.exp() function and append the result to exponents. The sum of exponentials is also updated in each iteration of the loop.
Lines 19-24: We use another for-loop to find the probability corresponding to each exponential value by dividing the value by exp_sum. Each probability is appended to softmax_prob.
Lines 27-31: We declare a vector z containing three values and pass it to the softmax function. The vector returned by the function is then printed.
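
The steps described above can also be condensed with list comprehensions. The sketch below is an equivalent rewrite under a hypothetical name, softmax_compact; it performs the same computation and is not a replacement prescribed by the original code.

```python
import math

def softmax_compact(z):
    # hypothetical name; same computation as the two loops described above
    exponents = [math.exp(value) for value in z]
    exp_sum = sum(exponents)
    return [value / exp_sum for value in exponents]

result = softmax_compact([5, 2, 8])
print(result)
```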

License: Creative Commons-Attribution-ShareAlike 4.0 (CC-BY-SA 4.0)

Overview

$\sigma (\vec{z})_{i} = \frac{{e}^{z_{i}}}{\sum_{j=1}^{K}{e}^{z_{j}}}$
