The scipy library allows us to find the minimum value of an objective function (a real-valued function that is to be minimized or maximized) using the scipy.optimize.minimize() function. But why do we need to find the minimum value in the first place? Minimization is a common problem in machine learning and data science. For example, if we want to cut down costs, we can write an equation that models our costs and then find the values of its variables that produce the smallest output.
Let’s look at the imports needed to start using the minimize() function.
To begin using the scipy.optimize.minimize() function, we need Python installed along with the numpy and scipy libraries. Then, we import the scipy.optimize module. There are two ways of accessing the minimize() function: we can import it directly from scipy.optimize, or import scipy.optimize under an alias (here, optimize) and reference the function through that alias.
from scipy.optimize import minimize
import scipy.optimize as optimize
Next, we select a function to minimize. For this answer, we’ll work on minimizing a quadratic function.
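For reference (and as a quick sanity check on the results later in this answer), the quadratic we’ll minimize is 2*x**2 - 3*x + 4. Setting its derivative, 4*x - 3, to zero gives a minimum at x = 0.75, where the function’s value is 2.875; this is the value the solvers below should converge to.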
The minimize() function
Let’s go over the various arguments that can be passed to the minimize() function:
We pass the objective function to be minimized as the first argument. The solver calls this function with a 1-D array of parameter values.
The second argument is the initial guess, an ndarray of shape (n,), where n is the number of independent variables. The initial guess gives the solver a starting point from which it explores the function space until it converges to the minimum.
The method argument specifies the solver used to tackle the minimization problem. For this answer, we’ll try out seven different methods to fetch our minimum value (a compact sketch comparing them follows the list below).
CG: This is the conjugate gradient method for determining the minimum value of the objective function.
BFGS: The BFGS method is a quasi-Newton algorithm. Rather than computing the second-order derivative (the Hessian) of the objective function directly, it builds an approximation of it from gradient evaluations to find the minimum.
Newton-CG: The Newton conjugate gradient method is the best of both worlds. It combines Newton’s method, which uses second-order information to locate a function’s lowest value, with the conjugate gradient method, which computes each Newton step efficiently. It requires the gradient of the objective function to be supplied.
L-BFGS-B: The L-BFGS-B algorithm, an extension of the BFGS method, also works with an approximation of the second-order derivative, but it saves memory by storing only a few vectors, hence the name limited-memory BFGS.
TNC: This method implements the truncated Newton conjugate gradient algorithm. It truncates the inner conjugate gradient iterations rather than working with the full Hessian matrix, which is less computationally expensive.
COBYLA: This method uses a linear approximation of both the objective and constraint functions.
SLSQP: Sequential least squares programming (SLSQP) minimizes a multivariate function subject to bounds and equality or inequality constraints; in our example, we’ll apply it to an unconstrained, single-variable function.
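As a quick, hedged sketch (the names quadratic and quadratic_gradient are ours, not part of this answer’s notebook), here is how these method names map onto minimize() calls for the quadratic used later in this answer. Note that Newton-CG is the only solver in the list that requires an explicit gradient:

import numpy as np
from scipy.optimize import minimize

# The same quadratic that the notebook below uses, written to accept the
# 1-D array that minimize() passes to the objective function.
def quadratic(x):
    return 2 * x[0]**2 - 3 * x[0] + 4

# Gradient of the quadratic; Newton-CG requires it explicitly.
def quadratic_gradient(x):
    return np.array([4 * x[0] - 3])

for name in ['CG', 'BFGS', 'Newton-CG', 'L-BFGS-B', 'TNC', 'COBYLA', 'SLSQP']:
    jac = quadratic_gradient if name == 'Newton-CG' else None
    result = minimize(quadratic, x0=np.array([-1.0]), method=name, jac=jac)
    print(f"{name:9s} -> x = {result.x[0]:.4f}, f(x) = {result.fun:.4f}")

Each solver should report a minimizer close to x = 0.75 and a minimum value close to 2.875.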
There are many optional arguments, and one of them is the options dictionary. Setting disp to True prints all convergence messages, and the maxiter key caps the maximum number of iterations.
scipy.optimize.minimize(objective_function, initial_guess, method=..., options={"disp": True})  # method and options are optional
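To make the shape of the initial guess and the optional arguments concrete, here is a minimal sketch with a hypothetical two-variable objective (the function name and values are ours, not part of this answer’s notebook):

import numpy as np
from scipy.optimize import minimize

# A two-variable objective: f(x) = (x0 - 1)^2 + (x1 + 2)^2, minimized at (1, -2).
def two_variable_objective(x):
    return (x[0] - 1)**2 + (x[1] + 2)**2

initial_guess = np.zeros(2)  # ndarray of shape (n,) with n = 2
result = minimize(two_variable_objective,
                  initial_guess,
                  method="BFGS",                           # optional; BFGS is the default for unconstrained problems
                  options={"disp": True, "maxiter": 200})  # optional
print(result.x)  # approximately [ 1., -2.]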
The minimize() function in action
Here’s a Jupyter Notebook implementing all the methods discussed above. You’re encouraged to tweak the code by providing some optional arguments and rerun the cell.
from scipy.optimize import minimize
import scipy.optimize as optimize

def educative_quadratic_function(x):
    educatives_return_value = (2*x**2) - (3*x) + 4
    return educatives_return_value

starting_ex_value = -1
function_name = ['CG', 'BFGS', 'Newton-CG', 'L-BFGS-B', 'TNC', 'COBYLA', 'SLSQP']
educatives_result = optimize.minimize(educative_quadratic_function, starting_ex_value, options={"disp": True})
if educatives_result.success:
    print(f"value of x = {educatives_result.x} value of y = {educatives_result.fun}")
else:
    print("No minimum value")

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[0], options={"disp": True})
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[1], options={"disp": True})
print(educatives_lowest)

def gradient(x):
    gradient = (4*x) - 3
    return gradient
educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[2], jac=gradient, options={"disp": True})
print(educatives_lowest)

def get_my_scalar(x):
    scalar_value = educative_quadratic_function(x[0])
    return scalar_value

educatives_lowest = minimize(get_my_scalar, starting_ex_value, method=function_name[3])
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[4], options={"disp": True})
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[5], options={"disp": True})
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[6], options={"disp": True})
print(educatives_lowest)
Lines 1–2: We import the minimize
function from the scipy.optimize
module. We use the alias optimize
for scipy.optimize
so that it’s shorter to write throughout the code.
Lines 4–6: We declare a quadratic function, 2*x**2 - 3*x + 4, that returns the corresponding value of y for a given value of x.
Lines 8–14: We set the starting value of starting_ex_value
as -1
. We define a list called function_name
containing all the function names that we’ll pass as arguments to the minimize()
function. These functions have been discussed in the preceding section. The first time we call the optimize.minimize()
function on line 10, we pass it the objective function’s name, educative_quadratic_function
, the initial value of x, starting_ex_value
, and a dictionary of optional arguments options={"disp":True}
. This allows all convergence messages to be displayed. The minimize() function returns an OptimizeResult object whose attributes include a solution array, a boolean flag called success, and a message that contains the reason for termination. The attribute success indicates whether the optimizer exited successfully. If success is equal to True, we display the values of x and y; otherwise, we print No minimum value.
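As a small, self-contained sketch of these return values (the lambda here is just the same quadratic written inline), the result object’s attributes can be inspected directly:

from scipy.optimize import minimize

# Minimal sketch: inspecting the OptimizeResult object returned by minimize().
result = minimize(lambda x: 2 * x[0]**2 - 3 * x[0] + 4, x0=[-1.0])

print(result.x)        # solution array (the minimizing value of x)
print(result.fun)      # objective value at the solution
print(result.success)  # True if the optimizer exited successfully
print(result.message)  # reason the solver terminated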
Lines 16–20: Next, we call the minimize() function twice, passing it an additional keyword argument, method. The function_name list, as mentioned before, contains the names of all the solver methods to be passed as arguments. First, we pass CG (function_name[0]) and then BFGS (function_name[1]).
Lines 22–26: We define a function that calculates the gradient of our quadratic function for any value of x, and pass it to the minimize() function through the jac argument, short for Jacobian, on line 25. The Newton-CG method requires this gradient.
Lines 28–30: We also define a function, get_my_scalar(), which returns a scalar value for the corresponding value of x. The solver passes x as a 1-D array, so we index its first element before evaluating the quadratic function.
Lines 32–42: We pass L-BFGS-B as the method argument to the minimize() function, this time with get_my_scalar() as the objective function. Lastly, we call the minimize() function for the remaining methods, namely, TNC, COBYLA, and SLSQP, and print each result.