The scipy library allows us to find the minimum value of an objective function (a real-valued function that is to be minimized or maximized) using the scipy.optimize.minimize() function. But why do we need to find the minimum value in the first place? Minimization is a common problem in machine learning and data science. For example, if we want to cut down costs, we can write an equation that models our costs and then find the values of its variables that produce the smallest output.
Let’s look at the imports needed to start using the minimize() function.
To begin using the scipy.optimize.minimize() function, we need Python installed along with the numpy and scipy libraries. Then, we import the scipy.optimize module. There are two ways of accessing the minimize() function: we can import it directly from scipy.optimize, or import scipy.optimize under an alias (here, optimize) and reference the function through that alias.
from scipy.optimize import minimize
import scipy.optimize as optimize
Next, we select a function to minimize. For this answer, we’ll work on minimizing a quadratic function.
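For reference (and as a quick sanity check on the results later in this answer), the quadratic we’ll minimize is 2*x**2 - 3*x + 4. Setting its derivative, 4*x - 3, to zero gives a minimum at x = 0.75, where the function’s value is 2.875; this is the value the solvers below should converge to.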
The minimize() function
Let’s go over the various arguments that can be passed to the minimize() function:
We pass the objective function to be minimized as the first argument. The solver calls this function with a 1-D array of parameter values.
The second argument is the initial guess, an ndarray of shape (n,), where n is the number of independent variables. The initial guess gives the solver a starting point from which it explores the function space until it converges to the minimum.
The method argument specifies the solver used to tackle the minimization problem. For this answer, we’ll try out seven different methods to fetch our minimum value (a compact sketch comparing them follows the list below).
CG: This is the conjugate gradient method for determining the minimum value of the objective function.
BFGS: The BFGS method is a quasi-Newton algorithm. Rather than computing the second-order derivative (the Hessian) of the objective function directly, it builds an approximation of it from gradient evaluations to find the minimum.
Newton-CG: The Newton conjugate gradient method is the best of both worlds. It combines Newton’s method, which uses second-order information to locate a function’s lowest value, with the conjugate gradient method, which computes each Newton step efficiently. It requires the gradient of the objective function to be supplied.
L-BFGS-B: The L-BFGS-B algorithm, an extension of the BFGS method, also works with an approximation of the second-order derivative, but it saves memory by storing only a few vectors, hence the name limited-memory BFGS.
TNC: This method implements the truncated Newton conjugate gradient algorithm. It truncates the inner conjugate gradient iterations rather than working with the full Hessian matrix, which is less computationally expensive.
COBYLA: This method uses a linear approximation of both the objective and constraint functions.
SLSQP: Sequential least squares programming (SLSQP) minimizes a multivariate function subject to bounds and equality or inequality constraints; in our example, we’ll apply it to an unconstrained, single-variable function.
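As a quick, hedged sketch (the names quadratic and quadratic_gradient are ours, not part of this answer’s notebook), here is how these method names map onto minimize() calls for the quadratic used later in this answer. Note that Newton-CG is the only solver in the list that requires an explicit gradient:

import numpy as np
from scipy.optimize import minimize

# The same quadratic that the notebook below uses, written to accept the
# 1-D array that minimize() passes to the objective function.
def quadratic(x):
    return 2 * x[0]**2 - 3 * x[0] + 4

# Gradient of the quadratic; Newton-CG requires it explicitly.
def quadratic_gradient(x):
    return np.array([4 * x[0] - 3])

for name in ['CG', 'BFGS', 'Newton-CG', 'L-BFGS-B', 'TNC', 'COBYLA', 'SLSQP']:
    jac = quadratic_gradient if name == 'Newton-CG' else None
    result = minimize(quadratic, x0=np.array([-1.0]), method=name, jac=jac)
    print(f"{name:9s} -> x = {result.x[0]:.4f}, f(x) = {result.fun:.4f}")

Each solver should report a minimizer close to x = 0.75 and a minimum value close to 2.875.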
There are many optional arguments, and one of them is the options dictionary. Setting disp to True prints all convergence messages, and the maxiter key caps the maximum number of iterations.
scipy.optimize.minimize(objective_function, initial_guess, method=..., options={"disp": True})  # method and options are optional
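To make the shape of the initial guess and the optional arguments concrete, here is a minimal sketch with a hypothetical two-variable objective (the function name and values are ours, not part of this answer’s notebook):

import numpy as np
from scipy.optimize import minimize

# A two-variable objective: f(x) = (x0 - 1)^2 + (x1 + 2)^2, minimized at (1, -2).
def two_variable_objective(x):
    return (x[0] - 1)**2 + (x[1] + 2)**2

initial_guess = np.zeros(2)  # ndarray of shape (n,) with n = 2
result = minimize(two_variable_objective,
                  initial_guess,
                  method="BFGS",                           # optional; BFGS is the default for unconstrained problems
                  options={"disp": True, "maxiter": 200})  # optional
print(result.x)  # approximately [ 1., -2.]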
The minimize() function in action
Here’s a Jupyter Notebook implementing all the methods discussed above. You’re encouraged to tweak the code by providing some optional arguments and rerun the cell.
from scipy.optimize import minimize
import scipy.optimize as optimize

def educative_quadratic_function(x):
    educatives_return_value = (2*x**2) - (3*x) + 4
    return educatives_return_value

starting_ex_value = -1
function_name = ['CG', 'BFGS', 'Newton-CG', 'L-BFGS-B', 'TNC', 'COBYLA', 'SLSQP']
educatives_result = optimize.minimize(educative_quadratic_function, starting_ex_value, options={"disp": True})
if educatives_result.success:
    print(f"value of x = {educatives_result.x} value of y = {educatives_result.fun}")
else:
    print("No minimum value")

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[0], options={"disp": True})
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[1], options={"disp": True})
print(educatives_lowest)

def gradient(x):
    gradient = (4*x) - 3
    return gradient
educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[2], jac=gradient, options={"disp": True})
print(educatives_lowest)

def get_my_scalar(x):
    scalar_value = educative_quadratic_function(x[0])
    return scalar_value

educatives_lowest = minimize(get_my_scalar, starting_ex_value, method=function_name[3])
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[4], options={"disp": True})
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[5], options={"disp": True})
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[6], options={"disp": True})
print(educatives_lowest)
Lines 1–2: We import the minimize
function from the scipy.optimize
module. We use the alias optimize
for scipy.optimize
so that it’s shorter to write throughout the code.
Lines 4–6: We declare a quadratic function, 2*x**2 - 3*x + 4, that returns the corresponding value of y for a given value of x.
Lines 8–14: We set the starting value of starting_ex_value
as -1
. We define a list called function_name
containing all the function names that we’ll pass as arguments to the minimize()
function. These functions have been discussed in the preceding section. The first time we call the optimize.minimize()
function on line 10, we pass it the objective function’s name, educative_quadratic_function
, the initial value of x, starting_ex_value
, and a dictionary of optional arguments options={"disp":True}
. This allows all convergence messages to be displayed. The minimize() function returns an OptimizeResult object whose attributes include a solution array, a boolean flag called success, and a message that contains the reason for termination. The attribute success indicates whether the optimizer exited successfully. If success is equal to True, we display the values of x and y; otherwise, we print No minimum value.
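As a small, self-contained sketch of these return values (the lambda here is just the same quadratic written inline), the result object’s attributes can be inspected directly:

from scipy.optimize import minimize

# Minimal sketch: inspecting the OptimizeResult object returned by minimize().
result = minimize(lambda x: 2 * x[0]**2 - 3 * x[0] + 4, x0=[-1.0])

print(result.x)        # solution array (the minimizing value of x)
print(result.fun)      # objective value at the solution
print(result.success)  # True if the optimizer exited successfully
print(result.message)  # reason the solver terminated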
Lines 16–20: Next, we call the minimize() function twice, passing it an additional keyword argument, method. The function_name list, as mentioned before, contains the names of all the solver methods to be passed as arguments. First, we pass CG (function_name[0]) and then BFGS (function_name[1]).
Lines 22–26: We define a function that calculates the gradient of our quadratic function for any value of x, and pass it to the minimize() function through the jac argument, short for Jacobian, on line 25. The Newton-CG method requires this gradient.
Lines 28–30: We also define a function, get_my_scalar(), which returns a scalar value for the corresponding value of x. The solver passes x as a 1-D array, so we index its first element before evaluating the quadratic function.
Lines 32–42: We pass L-BFGS-B as the method argument to the minimize() function, this time with get_my_scalar() as the objective function. Lastly, we call the minimize() function for the remaining methods, namely, TNC, COBYLA, and SLSQP, and print each result.