Apply Backpropagation

Learn how to calculate the gradients of the weights by applying backpropagation.

Three-layered network

The three-layered network in the form of a computational graph is as follows:

All the variables in the preceding graph are matrices, with the exception of the loss $L$. The $\sigma$ symbol represents the sigmoid. We combine the softmax and the cross-entropy loss into one operation. We need a name for that operation, so we temporarily call it "softmax and loss" (SML). Finally, we give the names $a$ and $b$ to the outputs of the two matrix multiplications.
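To make the SML operation concrete, here is a minimal NumPy sketch of softmax fused with cross-entropy. The function name `sml`, the row-wise max subtraction for numerical stability, and the averaging of the loss over the rows of $b$ are our own illustrative choices, not details from the original chapters.

```python
import numpy as np

def sml(b, y):
    """Softmax and loss (SML): softmax over the logits b, followed by
    cross-entropy against the one-hot labels y. Returns the scalar loss."""
    # Subtract the row-wise max before exponentiating for numerical stability
    exponentials = np.exp(b - b.max(axis=1, keepdims=True))
    softmax = exponentials / exponentials.sum(axis=1, keepdims=True)
    # Average the cross-entropy over the rows (one row per example)
    return -np.sum(y * np.log(softmax)) / b.shape[0]
```

Fusing the two steps pays off during backpropagation: the gradient of the combined operation with respect to $b$ has a much simpler form than the two separate local gradients.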

The diagram shown earlier represents the same neural network that we designed and built in the previous two chapters. Let’s follow its operations from left to right:

  • $x$ and $w_1$ get multiplied, producing $a$, which is passed through the sigmoid to produce the hidden layer $h$.
  • Then, $h$ is multiplied by $w_2$, producing $b$, which is passed through the softmax and the loss function, producing the loss $L$ (see the sketch after this list).
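Putting the two bullet points together, here is a hedged NumPy sketch of the whole forward pass. The function names and the absence of bias terms are assumptions made for illustration; the network built in the previous chapters may differ in those details.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sml(b, y):
    # Softmax and loss (SML), as sketched above
    exponentials = np.exp(b - b.max(axis=1, keepdims=True))
    softmax = exponentials / exponentials.sum(axis=1, keepdims=True)
    return -np.sum(y * np.log(softmax)) / b.shape[0]

def forward(x, w1, w2, y):
    a = np.matmul(x, w1)   # first matrix multiplication
    h = sigmoid(a)         # hidden layer
    b = np.matmul(h, w2)   # second matrix multiplication
    return sml(b, y)       # the loss L
```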

We learned backpropagation so we can calculate the gradients of $L$ with respect to $w_1$ and $w_2$. To apply the chain rule, we need the local gradients on the paths back from $L$ to $w_1$ and $w_2$.

$\sigma'$, the local gradient of the sigmoid, is $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$ ...
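Here is a sketch of the corresponding backward pass, under the same assumptions as the forward-pass sketch above. It relies on two standard local gradients: the gradient of the fused softmax-and-cross-entropy with respect to its input $b$ is $(\mathrm{softmax}(b) - y)/n$ when the loss is averaged over $n$ rows, and the sigmoid's derivative can be written in terms of its output as $h\,(1 - h)$, where $h = \sigma(a)$.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradients(x, w1, w2, y):
    # Forward pass, keeping the intermediate values a, h, and b
    a = np.matmul(x, w1)
    h = sigmoid(a)
    b = np.matmul(h, w2)
    exponentials = np.exp(b - b.max(axis=1, keepdims=True))
    softmax = exponentials / exponentials.sum(axis=1, keepdims=True)

    # Backward pass: chain the local gradients from L back to w1 and w2.
    # Local gradient of SML with respect to b (loss averaged over n rows):
    dL_db = (softmax - y) / x.shape[0]
    # b = h @ w2, so the gradient of L with respect to w2 is:
    dL_dw2 = np.matmul(h.T, dL_db)
    # Continue back through h and the sigmoid (sigma' = h * (1 - h)):
    dL_dh = np.matmul(dL_db, w2.T)
    dL_da = dL_dh * h * (1 - h)
    # a = x @ w1, so the gradient of L with respect to w1 is:
    dL_dw1 = np.matmul(x.T, dL_da)
    return dL_dw1, dL_dw2
```

A quick way to sanity-check a sketch like this is a numerical gradient check: perturb one entry of $w_1$ or $w_2$ by a small amount, recompute the loss, and compare the finite difference with the corresponding entry of the returned gradient.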