...


Optimization and Gradient Descent

Learn about the fundamental algorithm behind machine learning training: gradient descent.

In our 2D example, the loss function can be thought of as a parabola-shaped surface that reaches its minimum at a certain pair of $w_1$ and $w_2$. Visually, we have:

To find these weights, the core idea is simply to follow the slope of the curve. Although we don't know the actual shape of the loss, we can calculate the slope at a point and then move in the downhill direction.

You can think of the loss function as a mountain: from our current position, we only know the local slope, not the overall shape.

But what is the slope?

Slope: the derivative of the loss function

In calculus, the slope is the derivative of the function at that point, denoted $\frac{\partial C}{\partial w}$ for the loss $C$ with respect to a weight $w$. The ultimate goal would be to find the global minimum. Minima, whether local or global, have a (nearly) zero derivative, which indicates that we are located at the bottom of the curve.

For now, suppose that we want to minimize the loss function $C$. By calculating the derivative, we take small steps along the slope in an iterative fashion. In this way, we gradually reach the minimum of the curve.
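
To make the iterative procedure concrete, here is a minimal sketch of gradient descent on a toy one-dimensional loss. The specific loss $C(w) = (w - 3)^2$, the learning rate, and the number of steps are illustrative assumptions, not values from the text.

```python
# Minimal sketch: gradient descent on a toy 1D loss C(w) = (w - 3)^2,
# which has its minimum at w = 3. All values below are illustrative.

def loss(w):
    return (w - 3.0) ** 2

def gradient(w):
    # Analytic derivative of the toy loss: dC/dw = 2 * (w - 3)
    return 2.0 * (w - 3.0)

w = 0.0              # arbitrary starting point
learning_rate = 0.1  # size of each small step along the slope

for step in range(50):
    w -= learning_rate * gradient(w)  # move in the downhill direction

print(w, loss(w))    # w ends up close to 3, where the loss is (nearly) zero
```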

The same principle can be extended to many dimensions $N$. Although this is very difficult to visualize, the maths is here to help us.
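
In vector form, the same idea is usually written as an update rule on the whole weight vector. As a sketch, with $\lambda$ denoting the step size (the learning rate, a symbol not used in the text above) and $\nabla_{\mathbf{W}} C$ the gradient of the loss:

$$
\mathbf{W}_{t+1} = \mathbf{W}_t - \lambda \, \nabla_{\mathbf{W}} C(\mathbf{W}_t)
$$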

Keep in mind that the minimum we reach this way is not always the global minimum; gradient descent can settle in a local minimum.

Computing the gradient of a loss function

The question is: how do we compute the derivative (or gradient) of the loss with respect to the weights? In simple cases, such as the two-dimensional one, we can compute the analytical form with calculus.

Since our loss function is $C = (f(x_i, \mathbf{W}) - y_i)^2$, where the classifier $f$ is $f = w_1 x + w_2$ ...
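
As a sketch of where the calculus leads, applying the chain rule to the loss and classifier as stated above, for a single example $(x_i, y_i)$, gives:

$$
\frac{\partial C}{\partial w_1} = 2\,\big(f(x_i, \mathbf{W}) - y_i\big)\, x_i,
\qquad
\frac{\partial C}{\partial w_2} = 2\,\big(f(x_i, \mathbf{W}) - y_i\big)
$$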