Use Gradient Descent to Update Weights
Learn how to design the function that we'll pass to the gradient descent algorithm to update the weights in our network.
The calculus behind error minimization
To do gradient descent, we need to work out the slope of the error function with respect to the weights. This requires calculus. Calculus is simply a mathematically precise way of working out how something changes when something else does. For example, we could calculate how the length of a spring changes as the force used to stretch it changes. Here, we’re interested in how the error function depends on the link weights inside a neural network. Another way of asking this is, “How sensitive is the error to changes in the link weights?”
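That sensitivity question can be probed numerically before doing any calculus. The sketch below assumes a made-up one-weight "network" (output = sigmoid(w · x), squared error against a target); the input, target, and starting weight are all invented for illustration. A finite-difference quotient approximates the slope of the error with respect to the weight.

```python
# Numerically probing "how sensitive is the error to a change in a weight?"
# This tiny one-weight "network" is invented for illustration:
#   output = sigmoid(w * x),  error = (target - output) ** 2
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def network_error(w, x=1.5, target=0.25):
    output = sigmoid(w * x)
    return (target - output) ** 2

w = 0.8     # arbitrary current weight
h = 1e-6    # small nudge to the weight

# slope ~ (change in error) / (change in weight)
slope = (network_error(w + h) - network_error(w - h)) / (2 * h)
print(slope)  # a positive slope here: nudging w upward increases the error
```

A positive slope tells us to decrease the weight; a negative slope tells us to increase it. That is the entire signal gradient descent needs.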
Let’s start with a picture, because that helps us visualize what we’re trying to achieve.
The graph is just like the one we saw before. We’re not doing anything different. This time, the function we’re trying to minimize is the neural network’s error. The parameter we’re trying to refine is a network link weight. In this simple example, we’ve only shown one weight, but we know neural networks will have many more.
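To make the one-weight picture concrete, here is a minimal sketch of gradient descent on a made-up quadratic error curve. E(w) = (w − 2)² simply stands in for a network's error, and its slope dE/dw = 2(w − 2) is known exactly, so we can watch the update rule walk downhill.

```python
# Gradient descent on a single weight, using an invented error curve
# E(w) = (w - 2)**2 as a stand-in for a network's error.

def error(w):
    return (w - 2.0) ** 2

def slope(w):
    return 2.0 * (w - 2.0)   # exact derivative of the toy error curve

w = 0.0              # arbitrary starting weight
learning_rate = 0.1

for _ in range(100):
    w -= learning_rate * slope(w)   # step downhill, against the slope

print(round(w, 4))   # converges toward the minimum at w = 2
```

The update `w -= learning_rate * slope(w)` is the whole algorithm: the sign of the slope picks the direction, and the learning rate controls the step size.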
The next diagram shows two link weights, and this time the error function is a three-dimensional surface that varies as the two link weights vary. We can see we’re trying to minimize the error, which is now more like a mountainous landscape with a valley.
It’s harder to visualize the error surface as a function of many more parameters, but the idea of using gradient descent to find the minimum is still the same.
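The same update rule carries over unchanged to any number of weights. The sketch below assumes a toy quadratic error surface over three weights, E(w) = Σ(wᵢ − targetᵢ)², with an invented target vector; a real network's surface is far bumpier, but every weight still steps downhill along its own component of the gradient.

```python
import numpy as np

# A made-up quadratic error surface over several weights at once:
#   E(w) = sum((w - target)**2), gradient = 2 * (w - target)
target = np.array([0.5, -1.0, 2.0])   # invented minimum location

def error(w):
    return np.sum((w - target) ** 2)

def gradient(w):
    return 2.0 * (w - target)

w = np.zeros(3)      # arbitrary starting weights
learning_rate = 0.1

for _ in range(200):
    w -= learning_rate * gradient(w)  # all weights step downhill together

print(np.round(w, 4))  # approaches the target vector
```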
Let’s mathematically write out what we want: the slope of the error E with respect to a link weight w_jk,

∂E / ∂w_jk
That is, how does the error change as the weight ...