The Motivation for Optimization
Learn about optimization, its mathematical framework, and taxonomy.
What is optimization?
Let’s assume a machine learning problem where we have a model $f(x; \theta)$ that maps an input $x$ to a prediction. Here, $\theta$ denotes the parameters of the model, and a loss function $L(\theta)$ measures how far the model’s predictions are from the observed data. The process of finding the optimal parameters $\theta^*$ that minimize the loss is called optimization:

$$\theta^* = \arg\min_{\theta} L(\theta),$$

where $L(\theta)$ is called the objective function and $\theta^*$ is the optimal solution.
Let’s consider a real-world example where we want to create a soda can (cylindrical shape) that has a volume of 500 ml (1 ml = 1 cubic cm) with the least amount of material possible. The material cost is proportional to the surface area of the can, so minimizing material means minimizing surface area.
Mathematically, we can represent the problem above as the following optimization problem:

$$\min_{r, h} \; A(r, h) = 2\pi r^2 + 2\pi r h \quad \text{subject to} \quad \pi r^2 h = 500,$$

where $r$ is the radius and $h$ is the height of the can in centimeters, $2\pi r^2 + 2\pi r h$ is the surface area (the material used), and $\pi r^2 h$ is the volume.

To solve the problem above, we can substitute the constraint $h = \frac{500}{\pi r^2}$ into the objective to obtain a single-variable problem:

$$A(r) = 2\pi r^2 + \frac{1000}{r}.$$

Setting the derivative $A'(r) = 4\pi r - \frac{1000}{r^2}$ to zero gives $r = \left(\frac{250}{\pi}\right)^{1/3} \approx 4.30$ cm and, in turn, $h \approx 8.60$ cm.

The figure given below shows the graph of the objective $A(r)$; the minimum occurs at $r \approx 4.30$ cm.
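As a quick numerical check of this result, the sketch below minimizes the reduced objective $A(r)$ with SciPy’s `minimize_scalar`; the search interval is an arbitrary choice for illustration:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Surface area after substituting h = 500 / (pi * r^2) into A(r, h).
def area(r):
    return 2 * np.pi * r**2 + 1000 / r

res = minimize_scalar(area, bounds=(0.1, 20.0), method="bounded")
r_opt = res.x
h_opt = 500 / (np.pi * r_opt**2)
print(f"r = {r_opt:.2f} cm, h = {h_opt:.2f} cm")  # r ~ 4.30, h ~ 8.60
```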
The effect of constraints on an optimal solution
Constraints shrink the set of feasible solutions and can push the optimum away from the unconstrained one. Let’s take an example where the manufacturing process limits the radius of the can, say $r \le 3$ cm. The minimizer $r \approx 4.30$ cm found above is no longer feasible, so the constrained optimum sits on the boundary at $r = 3$ cm, with $h = \frac{500}{9\pi} \approx 17.68$ cm.
Note: The solutions $r \approx 4.30$ and $r = 3$ result in $A \approx 348.73$ cm² and $A \approx 389.88$ cm², respectively, for the equation $A(r) = 2\pi r^2 + \frac{1000}{r}$. Adding constraints can never improve the optimal objective value.
Local vs. global optimal solution
The solution to an optimization problem—by default, we assume that it is a minimization problem—can be of the following two types:
Local optimum: A solution $x^*$ is said to be a local optimum if $f(x^*) \le f(x)$ for all $x$ such that $\|x - x^*\| < \epsilon$, where $\epsilon > 0$ defines a small neighborhood around $x^*$. In other words, the value of the objective function increases (or stays the same) in all directions around $x^*$.
Global optimum: A solution $x^*$ is said to be a global optimum if it is a local optimum and $f(x^*) \le f(x)$ for all feasible $x$.
Let’s consider the graph below, which plots house prices across the neighborhoods of a city.
The local optima—red—are houses with competitive prices within their immediate neighborhoods but aren’t the cheapest citywide. However, the global optimum—green—is the absolute best deal across all neighborhoods, offering the lowest price while meeting essential criteria.
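The toy sketch below makes this concrete: it runs a gradient-based optimizer from two starting points on a function with two minima (the function itself is an illustrative assumption) and lands in a different optimum depending on where it starts:

```python
from scipy.optimize import minimize

# Illustrative objective with two minima: a deeper (global) one near
# x ~ -1.06 and a shallower (local) one near x ~ 0.93.
def f(x):
    return x[0]**4 - 2 * x[0]**2 + 0.5 * x[0]

# A gradient-based method only rolls "downhill", so the starting point
# determines which optimum it finds.
for x0 in (-1.5, 1.5):
    res = minimize(f, x0=[x0])
    print(f"start {x0:+.1f} -> x* = {res.x[0]:+.3f}, f(x*) = {res.fun:+.3f}")
```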
Types of optimization problems
Based on the form of the objective function and the constraints, optimization problems can be classified into the following types:
Linear programming (LP)
When both the objective function and the constraints are linear, the optimization problem is said to be a linear programming (LP) problem. The general form of LP is given as follows:

$$\min_{x} \; c^T x \quad \text{subject to} \quad Ax \le b, \quad x \ge 0,$$

where $x \in \mathbb{R}^n$ is the vector of decision variables, $c \in \mathbb{R}^n$ is the cost vector, $A \in \mathbb{R}^{m \times n}$ is the constraint matrix, and $b \in \mathbb{R}^m$ is the constraint vector.
For example, let’s consider a diet problem where the task is to find the optimal combination of foods that satisfy the nutritional requirements of a person at a minimum cost. Let’s say we have the following four foods to choose from:
| Food  | Cost per Unit ($) | Calories per Unit | Protein per Unit (g) | Fat per Unit (g) |
|-------|-------------------|-------------------|----------------------|------------------|
| Rice  | 0.12              | 130               | 2.7                  | 0.3              |
| Beans | 0.18              | 120               | 7.5                  | 0.5              |
| Milk  | 0.23              | 150               | 8                    | 8                |
| Bread | 0.05              | 75                | 2                    | 1                |
We want to satisfy the following nutritional requirements per day:

Calories: At least 2000, i.e., $130x_1 + 120x_2 + 150x_3 + 75x_4 \ge 2000$.
Protein: At least 55 g, i.e., $2.7x_1 + 7.5x_2 + 8x_3 + 2x_4 \ge 55$.
Fat: At most 70 g, i.e., $0.3x_1 + 0.5x_2 + 8x_3 + x_4 \le 70$.

In these relations, $x_1$, $x_2$, $x_3$, and $x_4$ are the number of units of rice, beans, milk, and bread consumed per day, with $x_i \ge 0$.

The objective function is the total cost of the foods, which is given by the following:

$$\min_{x} \; 0.12x_1 + 0.18x_2 + 0.23x_3 + 0.05x_4.$$

Here, the coefficients are the per-unit costs taken from the table above.

The constraints can also be written in a matrix form as follows:

$$\begin{bmatrix} 130 & 120 & 150 & 75 \\ 2.7 & 7.5 & 8 & 2 \\ -0.3 & -0.5 & -8 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \ge \begin{bmatrix} 2000 \\ 55 \\ -70 \end{bmatrix},$$

where the fat row has been multiplied by $-1$ to turn the $\le$ constraint into a $\ge$ constraint, so that all rows share the form $Ax \ge b$.
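As a sketch, this diet LP can be solved with SciPy’s `linprog`. Because `linprog` expects constraints in the form $A_{ub} x \le b_{ub}$, the two “at least” rows are negated, mirroring the sign flip used above:

```python
from scipy.optimize import linprog

c = [0.12, 0.18, 0.23, 0.05]       # cost per unit: rice, beans, milk, bread
A_ub = [
    [-130, -120, -150, -75],       # -(calories) <= -2000
    [-2.7, -7.5, -8.0, -2.0],      # -(protein)  <= -55
    [0.3, 0.5, 8.0, 1.0],          # fat         <= 70
]
b_ub = [-2000, -55, 70]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
print(res.x, res.fun)  # optimal units of each food and the minimum daily cost
```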
Quadratic programming (QP)
The optimization problem is a quadratic programming (QP) problem when the objective function is quadratic, but the constraints are linear. The general form of QP is given as follows:

$$\min_{x} \; \frac{1}{2} x^T Q x + c^T x \quad \text{subject to} \quad Ax \le b,$$

where $Q \in \mathbb{R}^{n \times n}$ is a symmetric matrix that defines the quadratic part of the objective, and $c$, $A$, and $b$ are as in the LP case.
As an example, let’s consider a portfolio optimization problem where the task is to find the best allocation of assets—such as stocks, bonds, or cash—that maximizes the expected return and minimizes the risk of a portfolio. Let’s say we have the following three types of assets to choose from:
Stocks that give a high expected return $\mu_1$ but have a higher risk component (or variance) $\sigma_1^2$
Bonds that give a medium expected return $\mu_2$ and have a medium risk component (or variance) $\sigma_2^2$
Cash that gives a low expected return $\mu_3$ and has a low risk component (or variance) $\sigma_3^2$

Considering that we want to assign the fractions $x_1$, $x_2$, and $x_3$ of the total investment to stocks, bonds, and cash, respectively, the allocation must satisfy $x_1 + x_2 + x_3 = 1$ and $x_i \ge 0$. With these constraints, we need to find the asset allocation that yields the minimum risk while achieving at least a minimum expected return $\mu_{\min}$.

Note: $x_1 + x_2 + x_3 = 1$ and $x_i \ge 0$ together imply that $x_i \le 1$.
This can be written as a QP problem as follows:

$$\min_{x} \; x^T \Sigma x$$

subject to

$$\mu^T x \ge \mu_{\min}, \quad x_1 + x_2 + x_3 = 1, \quad x \ge 0,$$

where $\Sigma$ is the covariance matrix of the asset returns (with the variances $\sigma_i^2$ on its diagonal), $\mu = (\mu_1, \mu_2, \mu_3)^T$ is the vector of expected returns, and $\mu_{\min}$ is the minimum acceptable expected return of the portfolio.
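SciPy does not ship a dedicated QP solver, so the sketch below uses the general-purpose SLSQP method instead; the expected returns, variances, and target $\mu_{\min}$ are made-up numbers for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs: expected returns and a diagonal covariance matrix
# for stocks, bonds, and cash (i.e., asset returns assumed uncorrelated).
mu = np.array([0.10, 0.05, 0.02])
Sigma = np.diag([0.08, 0.02, 0.001])
mu_min = 0.06  # required expected portfolio return

def risk(x):
    return x @ Sigma @ x  # portfolio variance, the QP objective

res = minimize(
    risk,
    x0=np.ones(3) / 3,  # start from an equal split
    method="SLSQP",
    bounds=[(0, 1)] * 3,
    constraints=[
        {"type": "eq", "fun": lambda x: x.sum() - 1},        # fully invested
        {"type": "ineq", "fun": lambda x: x @ mu - mu_min},  # return target
    ],
)
print(res.x, res.fun)  # optimal weights and the minimum portfolio variance
```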
Integer linear programming (ILP)
The optimization problem is said to be an integer linear programming (ILP) problem when both the objective function and the constraints are linear, but the search space is restricted to the integers. The general form of ILP is given as follows:

$$\min_{x} \; c^T x \quad \text{subject to} \quad Ax \le b, \quad x \in \mathbb{Z}^n,$$

where $\mathbb{Z}^n$ denotes the set of $n$-dimensional integer vectors, and $c$, $A$, and $b$ are as in the LP case.
As an example, let’s consider the knapsack problem, where the task is to pack a set of $n$ items, the $i$-th of which has a value $v_i$ and a weight $w_i$, into a knapsack with weight capacity $W$ so that the total value of the packed items is maximized:

$$\max_{x} \; \sum_{i=1}^{n} v_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le W, \quad x_i \in \{0, 1\},$$

where $x_i = 1$ if item $i$ is packed and $x_i = 0$ otherwise. The binary restriction on $x_i$ is what makes this an ILP.
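A small knapsack instance can be solved as an ILP with SciPy’s `milp` (available in SciPy 1.9+); the values, weights, and capacity below are made up for illustration:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

values = np.array([60, 100, 120])   # v_i
weights = np.array([10, 20, 30])    # w_i
capacity = 50                       # W

res = milp(
    c=-values,  # milp minimizes, so negate the values to maximize
    constraints=LinearConstraint(weights.reshape(1, -1), ub=capacity),
    integrality=np.ones(3),  # all variables integer-valued
    bounds=Bounds(0, 1),     # with integrality, this forces x_i in {0, 1}
)
print(res.x, -res.fun)  # picks items 2 and 3 for a total value of 220
```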
Convex and non-convex optimization
When the objective function and the constraints are convex, the problem is said to be a convex optimization problem. We will discuss convex functions in detail later in the course, but for now, we can think of a convex function as a smiley/bowl-like structure, as shown in the figure below. In a convex optimization problem, every local optimum is also a global optimum, so there is a single optimal objective value. Both LP and QP (with a positive semidefinite $Q$) are special cases of convex optimization problems.
When either the objective function or the constraints are non-convex, the problem is said to be a non-convex optimization problem. Deep neural network training is a non-convex optimization problem because the loss is a composition of nonlinear activation functions, such as sigmoid and ReLU.
Types of optimization techniques
Based on the nature of the algorithm, optimization techniques can be broadly divided into the following types:
Gradient-free, brute-force, or heuristics: These techniques rely on iterating through the search space in a random or heuristic fashion to find the optimal point that minimizes the objective function while satisfying all the constraints, for example, random search, grid search, and the Nelder-Mead algorithm.
First-order gradient techniques: These techniques use the gradient of the objective function to repeatedly move in a direction along which the objective decreases, for example, gradient descent and its stochastic variants; see the sketch after this list.
Second-order (Newtonian) techniques: These techniques additionally use second-order derivatives, known as the Hessian, which capture the curvature of the objective and help choose both the direction and the size of each step.
Genetic algorithms: Genetic algorithms (GAs) are adaptive heuristic search algorithms that belong to the larger class of evolutionary algorithms. Genetic algorithms are based on the ideas of natural selection and genetics.
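As an example of a first-order technique, the sketch below applies plain gradient descent to the soda-can objective $A(r)$ from earlier; the starting point, step size, and iteration count are arbitrary choices:

```python
import numpy as np

# Gradient descent on A(r) = 2*pi*r^2 + 1000/r.
def grad(r):
    return 4 * np.pi * r - 1000 / r**2  # dA/dr

r = 1.0    # starting point
lr = 0.01  # learning rate (step size)
for _ in range(500):
    r -= lr * grad(r)  # step in the direction of decreasing A
print(f"r = {r:.3f} cm")  # converges to ~4.301, matching the calculus result
```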