Nesterov Accelerated Gradient (NAG)
Learn how Nesterov Accelerated Gradient (NAG) can be used to escape local optima in non-convex optimization.
What is NAG?
Consider a scenario where a company wants to determine the optimal production rate and selling price for one of its products to maximize profit, where the profit is given by a non-convex objective with several local optima.
NAG is a variant of gradient descent with momentum that improves both the convergence rate and the stability of plain gradient descent. The main idea is to use a look-ahead term: the gradient is evaluated at an anticipated future point rather than at the current point. This way, the algorithm can anticipate the direction of the optimal solution and avoid overshooting or oscillating around it.
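One common way to write the NAG update, with momentum coefficient $\mu$ and learning rate $\eta$ (this particular formulation is a standard one and not taken verbatim from the lesson), is:

$$
v_{t+1} = \mu v_t - \eta \nabla f(\theta_t + \mu v_t), \qquad \theta_{t+1} = \theta_t + v_{t+1}
$$

The sketch below applies this update to a toy one-dimensional non-convex function. The objective `f`, the hyperparameter values, and the function names are illustrative assumptions rather than part of the lesson.

```python
import numpy as np

def f(x):
    # Toy non-convex objective with several local minima (an assumed example)
    return np.sin(3 * x) + 0.1 * x**2

def grad_f(x):
    # Analytical gradient of the toy objective
    return 3 * np.cos(3 * x) + 0.2 * x

def nag(x0, lr=0.05, momentum=0.9, steps=200):
    x, v = x0, 0.0
    for _ in range(steps):
        # Look-ahead: evaluate the gradient at the anticipated future point
        lookahead = x + momentum * v
        v = momentum * v - lr * grad_f(lookahead)
        x = x + v
    return x

x_star = nag(x0=2.5)
print(f"Approximate minimizer: {x_star:.4f}, f(x*) = {f(x_star):.4f}")
```

Because the gradient is taken at the look-ahead point rather than the current point, the velocity is corrected before the step is applied, which is what dampens the overshooting and oscillation described above.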