Quasi-Newton Methods

Learn how to approximate Newton’s method when Hessians are difficult or expensive to compute.

Quasi-Newton methods are a class of second-order optimization algorithms that are particularly useful when the function’s Hessian is difficult or expensive to compute, or is not available at all.

Consider the update rule of Newton’s method at time step $t$:

$$x_{t+1} = x_t - H(x_t)^{-1}\,\nabla f(x_t)$$

Here, $H(x)$ is the Hessian of the function $f(x)$ at the point $x$. Recall that the complexity of computing the Hessian is of the order $O(m^2)$, as opposed to the gradient, which is of the order $O(m)$. This means that as the dimensionality $m$ of $x$ increases, the cost of computing the Hessian grows quadratically. Furthermore, the update rule in Newton’s method involves the inverse of the Hessian matrix, which can also become computationally expensive to calculate, especially for high-dimensional problems.
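
To make the cost concrete, here is a minimal sketch of a single Newton step in NumPy. The function name `newton_step` and the quadratic test problem are illustrative assumptions; note that `np.linalg.solve` avoids forming the inverse explicitly but still costs $O(m^3)$ for a general $m \times m$ Hessian.

```python
import numpy as np

def newton_step(grad, hess, x):
    """One Newton update: x_new = x - H(x)^{-1} grad f(x).
    np.linalg.solve avoids forming the inverse explicitly,
    but solving the m-by-m system still costs O(m^3) in general."""
    return x - np.linalg.solve(hess(x), grad(x))

# Illustration: for a quadratic f(x) = 0.5 x^T A x - b^T x,
# a single Newton step from any point lands on the exact minimizer.
A = np.array([[3.0, 1.0], [1.0, 2.0]])     # symmetric positive-definite Hessian
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b                 # gradient of the quadratic
hess = lambda x: A                         # Hessian is constant for a quadratic
x_star = newton_step(grad, hess, np.zeros(2))
print(x_star, np.allclose(A @ x_star, b))  # minimizer satisfies A x = b
```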

The key idea behind quasi-Newton methods is to approximate the inverse of the Hessian using only first-order derivative information, i.e., the gradient. This makes these methods more computationally efficient than Newton’s method, especially for problems with many variables. The update rule of quasi-Newton methods takes the form

$$x_{t+1} = x_t - B_t\,\nabla f(x_t)$$

where $B_t$ is an approximation of the inverse Hessian $H(x_t)^{-1}$ that is built up and refined from successive gradient evaluations.
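
As an illustration, the sketch below implements BFGS, the most widely used quasi-Newton method, with a simple backtracking line search. The function names and the Rosenbrock test problem are assumptions made for this example; a production implementation (such as `scipy.optimize.minimize` with `method='BFGS'`) would use a line search satisfying the Wolfe conditions.

```python
import numpy as np

def bfgs(f, grad, x0, n_iters=500, tol=1e-8):
    """Minimal BFGS sketch: maintain B, an approximation of the
    inverse Hessian, using gradient information only."""
    x = np.asarray(x0, dtype=float)
    m = x.size
    B = np.eye(m)                        # initial inverse-Hessian approximation
    g = grad(x)
    for _ in range(n_iters):
        if np.linalg.norm(g) < tol:     # stop once the gradient is tiny
            break
        p = -B @ g                       # quasi-Newton search direction
        # Backtracking (Armijo) line search for a stable step size
        alpha = 1.0
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p) and alpha > 1e-10:
            alpha *= 0.5
        x_new = x + alpha * p
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-12:                   # curvature condition keeps B positive definite
            rho = 1.0 / sy
            I = np.eye(m)
            # Rank-two BFGS update of the inverse-Hessian approximation
            B = (I - rho * np.outer(s, y)) @ B @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# Example: minimize the Rosenbrock function, whose minimum is at [1, 1]
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([
    -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
    200 * (x[1] - x[0]**2),
])
print(bfgs(f, grad, [-1.2, 1.0]))        # converges to approximately [1., 1.]
```

Note how `B` is refined from the gradient differences `y` and steps `s` alone: no second derivatives are ever computed. The rank-two update can be applied in $O(m^2)$ operations, although the compact matrix form above is written for clarity rather than efficiency.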