Assembling a Neural Network from Perceptrons

Learn how to design a neural network by assembling perceptrons.

Recap

The first part of this course (from Chapter 1, How Machine Learning Works, to Chapter 8, The Perceptron) was all about the perceptron. The second part (from Chapter 9, Designing the Network, to Chapter 15, Let's Do Development) explains more advanced concepts in machine learning, and the most important idea in this chapter is the neural network. Neural networks are far more powerful than perceptrons. In the Where Perceptrons Fail lesson, we learned that perceptrons require linearly separable data. By contrast, neural networks can deal with messier data, like photos of real-world objects.

Even on a simple dataset like MNIST, our perceptron was scraping by, making almost one mistake every ten characters. With a neural network, we can aim for an error rate an order of magnitude lower. In this part of the chapter, we'll build an MNIST classifier that reaches 99% accuracy, or one error every 100 characters.

Now let’s design a neural network that classifies MNIST digits.

Assemble perceptrons

Let's see how to build a neural network, starting with the perceptron that we already have. As a reminder, the perceptron is a weighted sum of the inputs, followed by a sigmoid.
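Here is a minimal NumPy sketch of that computation. The names sigmoid, forward, X, and w are illustrative assumptions, not taken verbatim from this lesson:

```python
import numpy as np

def sigmoid(z):
    # Squash each value into the (0, 1) interval
    return 1 / (1 + np.exp(-z))

def forward(X, w):
    # The perceptron: a weighted sum of the inputs, followed by a sigmoid
    return sigmoid(np.matmul(X, w))

# A single example with three inputs and one output
x = np.array([[0.5, 0.2, 0.1]])
w = np.array([[0.4], [0.3], [0.9]])
print(forward(x, w))  # one value between 0 and 1
```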

In the first part of this chapter, we did not just use the perceptron as it is; we combined perceptrons in two different ways. First, we trained the perceptron with many MNIST images at once. Second, we used ten perceptrons to classify the ten possible digits. We have already compared those two operations to stacking and parallelizing perceptrons, respectively.

To be clear, we did not literally stack and parallelize perceptrons. Instead, we used matrices to get a similar result. Our perceptron's input was a matrix with one row per image, and our perceptron's output was a matrix with ten columns, one per class. The stacking and parallelizing metaphors are just convenient shortcuts to describe those matrix-based calculations.
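To make those shapes concrete, here is a sketch with MNIST-like dimensions (60,000 images of 28 × 28 = 784 pixels each). The random data is a stand-in for real images, and the exact sizes are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Stand-in data with MNIST-like dimensions
X = np.random.rand(60000, 784)  # input: one row per image, one column per pixel
w = np.random.rand(784, 10)     # weights: one column per class

y_hat = sigmoid(np.matmul(X, w))
print(y_hat.shape)  # (60000, 10): one row per image, one column per class
```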

Now we’ll take these extended (stacked and parallelized) perceptrons, and use them as building blocks for a neural network.

Chain perceptrons

We can build a neural network by chaining two perceptrons, as shown below:

As we can see, each perceptron has its own weights and its own sigmoid operation, but the outputs of the first perceptron are also the inputs of the second. To avoid confusion, the letter h is used for the value and d ...
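Here is a minimal sketch of a forward pass through two chained perceptrons. The weight names w1 and w2 are assumptions; h matches the letter used above for the intermediate value:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X, w1, w2):
    # First perceptron: its output h doubles as the second one's input
    h = sigmoid(np.matmul(X, w1))
    # Second perceptron: takes h and produces the network's final output
    y_hat = sigmoid(np.matmul(h, w2))
    return y_hat
```

With MNIST-sized matrices, X would be (60000, 784), w1 would be (784, n_hidden), and w2 would be (n_hidden, 10), where n_hidden is whatever number of intermediate values we choose.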
