Steepest Descent

Learn how to find the optimal step size in gradient descent using the method of steepest descent.

The method of steepest descent

So far, gradient descent has used a fixed step size α to control how far we move along the negative gradient at each point. A step size that is too large can make the algorithm unstable (the iterates overshoot the minimum and may diverge), whereas a step size that is too small requires many more iterations to converge.
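To make this trade-off concrete, here is a minimal sketch that runs fixed-step gradient descent on the one-dimensional function f(x) = x², whose minimum is at x = 0. The function, the starting point, and the three step sizes are assumptions chosen only for illustration.

```python
def gradient_descent(grad, x0, alpha, num_steps):
    """Run gradient descent with a fixed step size and return the final iterate."""
    x = x0
    for _ in range(num_steps):
        x = x - alpha * grad(x)  # move against the gradient by a fixed amount
    return x


def grad_f(x):
    """Gradient of the illustrative function f(x) = x**2."""
    return 2.0 * x


# Same starting point and iteration budget, three different step sizes.
x_large = gradient_descent(grad_f, x0=1.0, alpha=1.1, num_steps=20)   # too large
x_small = gradient_descent(grad_f, x0=1.0, alpha=0.01, num_steps=20)  # too small
x_good = gradient_descent(grad_f, x0=1.0, alpha=0.4, num_steps=20)    # reasonable

print(x_large, x_small, x_good)
```

For this function each update multiplies x by (1 − 2α). With α = 1.1 the factor has magnitude greater than 1, so the iterates grow. With α = 0.01 the factor is 0.98, so the iterate is still far from 0 after 20 steps. With α = 0.4 the iterates shrink quickly toward the minimum.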

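Steepest descent works by finding the optimal step size at each iteration of gradient descent. For a convex function f(x), the steepest descent update at iteration t > 0 moves from the current point along the negative gradient, using the step size that minimizes f along that direction.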
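In one common notation (the indexing here is an assumption, since conventions vary between texts), the steepest descent update is:

    α_t = argmin over α ≥ 0 of f(x_{t-1} − α ∇f(x_{t-1}))
    x_t = x_{t-1} − α_t ∇f(x_{t-1})

The sketch below is one possible way to implement this, not a definitive one. It approximates the inner minimization with a golden-section search over the interval [0, alpha_max]. The helper names, the search interval, the tolerances, and the test function f(x) = x₀² + 10x₁² are all hypothetical choices made for illustration.

```python
import math


def golden_section_min(phi, lo, hi, tol=1e-10):
    """Approximate the minimizer of a unimodal 1-D function phi on [lo, hi]."""
    inv_golden = (math.sqrt(5.0) - 1.0) / 2.0  # 1 / golden ratio, about 0.618
    a, b = lo, hi
    c = b - inv_golden * (b - a)
    d = a + inv_golden * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b = d  # the minimizer lies in [a, d]
        else:
            a = c  # the minimizer lies in [c, b]
        c = b - inv_golden * (b - a)
        d = a + inv_golden * (b - a)
    return 0.5 * (a + b)


def steepest_descent(f, grad, x0, num_steps=100, alpha_max=1.0, grad_tol=1e-8):
    """Gradient descent where each step size comes from a 1-D line search.

    Assumes f is convex, so f restricted to the search direction is unimodal,
    and that the best step size lies in [0, alpha_max].
    """
    x = list(x0)
    for _ in range(num_steps):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < grad_tol:
            break  # gradient is (numerically) zero, so stop

        def phi(alpha):
            # f evaluated along the negative gradient direction from x
            return f([xi - alpha * gi for xi, gi in zip(x, g)])

        alpha = golden_section_min(phi, 0.0, alpha_max)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


def f(x):
    """Illustrative convex function with its minimum at (0, 0)."""
    return x[0] ** 2 + 10.0 * x[1] ** 2


def grad_f(x):
    """Gradient of the illustrative function f."""
    return [2.0 * x[0], 20.0 * x[1]]


x_min = steepest_descent(f, grad_f, x0=[5.0, 1.0])
print(x_min, f(x_min))
```

Because the step size is recomputed at every iteration, no α has to be tuned by hand. The cost is several extra evaluations of f per iteration for the line search. For a quadratic function the optimal step size also has a closed form, which would avoid the search entirely.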
