The Case for Backpropagation

Learn about the fundamentals of backpropagation in neural networks.

Recap and follow-up on our neural network

Before we jump into backpropagation, let’s quickly review what we’ve learned about neural networks so far.

In the previous chapter, we wrote a functioning neural network, or at least half of it. The network’s prediction code is done: it passes data through the model, and produces labels. However, that process also requires a set of weights, and we haven’t written the code that finds those weights yet. We’ll do it in this chapter by implementing the train() function of our neural network.

In the early years of neural networks, training was tough. AI experts even questioned whether neural networks could be trained at all. The answer came in the early 1970s, when researchers found a way to calculate the gradient of a network’s loss with respect to its weights, using an algorithm called backpropagation, or backprop for short.

There is a chance that we’ll never have to implement backpropagation in a real-life project. Modern ML libraries already come with their own ready-made implementations. However, it is still important that we get an intuitive sense of how backpropagation works so that we are well equipped to deal with its subtle consequences.

In this chapter, we’ll see how backpropagation works, and we’ll implement it for our neural network. We’ll also learn to initialize the network’s weights so that they work well with backprop. At the end of this chapter, we’ll take the network through its first run. Will it beat the perceptron’s accuracy? If so, by how much?

The need for backpropagation

We’ll explain backpropagation from the ground up soon. But first, let’s see why people use backpropagation.
