Multilayer Perceptrons and Backpropagation
Learn the basics of neural networks and the backpropagation algorithm.
While large-scale research funding for neural networks declined after the publication of Perceptrons and did not recover until the 1980s, researchers still recognized that these models had value, particularly when assembled into multilayer networks composed of several perceptron units per layer. Indeed, when the mathematical form of the output function (that is, the output of the model) was relaxed to take on many forms (such as a linear function or a sigmoid), these networks could solve both regression and classification problems, with theoretical results showing that three-layer networks could effectively approximate any function.
Renewed interest in neural networks came with the popularization of the backpropagation algorithm, which, while discovered in the 1960s, was not widely applied to neural networks until the 1980s, following several studies highlighting its usefulness for learning the weights in these models.
The insight behind the backpropagation technique is that we can use the chain rule from calculus to efficiently compute the derivative of a loss function with respect to each parameter of the network, and, combined with a learning rule such as gradient descent, this provides a scalable way to train multilayer networks.
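To make this concrete, here is a minimal sketch in Python for a one-parameter model, where the prediction is sigmoid(w * x) and the error is squared. All of the numbers (x, target, w) are hypothetical, chosen only for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data point and parameter (not from the text).
x, target, w = 2.0, 1.0, 0.5

# Forward pass: compute and store each intermediate value.
z = w * x                  # weighted input
p = sigmoid(z)             # prediction
loss = (p - target) ** 2   # squared error

# Backward pass: multiply the local derivatives along the chain.
dloss_dp = 2.0 * (p - target)   # d(loss)/d(prediction)
dp_dz = p * (1.0 - p)           # derivative of the sigmoid
dz_dw = x                       # d(z)/d(w)
dloss_dw = dloss_dp * dp_dz * dz_dw

# Sanity check against a finite-difference estimate.
eps = 1e-6
numeric = ((sigmoid((w + eps) * x) - target) ** 2
           - (sigmoid((w - eps) * x) - target) ** 2) / (2 * eps)
print(dloss_dw, numeric)  # the two estimates should agree closely
```

Note that the backward pass never recomputes anything expensive: each factor is a cheap local derivative of one step of the forward pass, which is what makes the technique scale to networks with many layers and parameters.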
Let’s illustrate backpropagation with an example: consider a small feedforward network in which a layer of input units connects to a layer of hidden units, which in turn connect to an output unit. Furthermore, the value of each unit is computed by applying a nonlinear function (such as the sigmoid) to the weighted sum of the values of the units feeding into it, as sketched below.
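The following is a sketch of that forward pass for a hypothetical 2-3-1 network (two inputs, three sigmoid hidden units, one sigmoid output); the weights and inputs are made-up illustrative values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

inputs = [0.5, -1.0]       # hypothetical input values
W1 = [[0.1, 0.8],          # weights into hidden unit 1
      [0.4, -0.2],         # weights into hidden unit 2
      [-0.3, 0.9]]         # weights into hidden unit 3
W2 = [0.3, -0.5, 0.7]      # weights into the output unit

# Each hidden unit: sigmoid of the weighted sum of the inputs.
hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in W1]
# The output unit does the same over the hidden values.
output = sigmoid(sum(w * h for w, h in zip(W2, hidden)))
print(hidden, output)
```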
We also need a notion of when the network is performing well or badly at its task. A straightforward error function to use here is a squared loss:

$$E = \frac{1}{2}(\hat{y} - y)^2$$

where $y$ is the target output for a training example and $\hat{y}$ is the output the network actually produces (the factor of $\frac{1}{2}$ is a common convention that keeps the derivative clean).
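Putting the pieces together, the following hedged sketch runs a forward pass through the same hypothetical 2-3-1 network, evaluates the squared loss above, and backpropagates it to obtain the gradient of $E$ with respect to every weight (again, all numbers are illustrative assumptions):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

inputs, target = [0.5, -1.0], 1.0            # hypothetical example
W1 = [[0.1, 0.8], [0.4, -0.2], [-0.3, 0.9]]  # input -> hidden weights
W2 = [0.3, -0.5, 0.7]                        # hidden -> output weights

# Forward pass.
hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in W1]
y_hat = sigmoid(sum(w * h for w, h in zip(W2, hidden)))
loss = 0.5 * (y_hat - target) ** 2

# Backward pass via the chain rule.
# dE/d(output pre-activation): (y_hat - target) times the sigmoid derivative.
delta_out = (y_hat - target) * y_hat * (1.0 - y_hat)
grad_W2 = [delta_out * h for h in hidden]    # dE/dW2
# Each hidden unit receives delta_out scaled by its outgoing weight,
# times its own sigmoid derivative.
delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
grad_W1 = [[d * x for x in inputs] for d in delta_hidden]  # dE/dW1

print(loss, grad_W2, grad_W1)
```

Subtracting a small multiple of each gradient from the corresponding weight (gradient descent) would nudge the network toward a lower loss, which is exactly the learning rule alluded to earlier.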