Loss Function

Find out how loss functions for regression and classification differ, and which loss function is best suited to the MNIST classifier.

Mean squared error vs. binary cross-entropy loss

In the last section, we developed a neural network to classify images of hand-written digits. Even though we intentionally designed the network to be simple, it worked remarkably well, achieving an accuracy of about 87% on the MNIST test dataset.

Here we’ll explore some refinements that can help improve a network’s performance, starting with the loss function. The right choice of loss depends on the kind of output a network is designed to produce. Some neural networks are designed to produce a continuous range of output values. For example, we might want a network predicting temperatures to be able to output any value in the range 0 to 100 degrees centigrade.
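For continuous outputs like these, mean squared error is the usual measure of how far predictions fall from the true values: the average of the squared differences. The sketch below illustrates the idea with NumPy; the function name and the sample temperature values are purely illustrative, not taken from the network built in the previous section.

```python
import numpy as np

def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.mean((predictions - targets) ** 2)

# Hypothetical temperature predictions (degrees centigrade) vs. true values.
predicted = [21.5, 30.2, 18.9, 25.0]
actual = [20.0, 31.0, 19.5, 27.0]

print(mean_squared_error(predicted, actual))  # 1.8125
```

Because the differences are squared, large errors are penalised much more heavily than small ones, and the result is always non-negative.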

Some networks are designed to ...