Preparing Data

Learn how to prepare data for a neural network: handling the dataset, rescaling the input variables, and standardizing them.

Revise neural network

We spent a few chapters building a neural network, and a few more investigating its finer points. In this chapter, we’ll come down to the wire and aim for 99% accuracy on MNIST. To get there, we’ll follow an iterative process that is the ML equivalent of software development. In fact, we can call it development.

Like software development, ML development is too extensive a topic to fit in this chapter or this course. It involves people with different skills, from mathematicians to engineers. Even the engineering part of the job is more than just building a neural network and tuning it. Real-life ML systems are often complicated pipelines composed of multiple algorithms and services. To make things harder, ML development is often an art as well as a science: it requires plenty of experience, educated guesses, and luck.

Let’s start practicing. We’ll look at ML development in a nutshell, focusing on three activities:

  1. First, we’ll prepare our dataset for the neural network that will solve our problem. For example, we’ll rescale the input variables to make them more network-friendly.
  2. Second, we’ll move into the development cycle. At each step, we’ll try to improve the network’s accuracy by tuning its hyperparameters: the learning rate (lr), the batch size, and so on.
  3. Finally, at the end of the process, we’ll put the network to a final test.
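To make the first activity concrete, here is a minimal sketch of one common way to rescale inputs: standardization, which shifts and scales the data so that it has mean 0 and standard deviation 1. The function name `standardize`, the variable names, and the random stand-in data are assumptions made for illustration; they are not taken from this course's code. The sketch computes the statistics on the training set only and reuses them for the other sets, so no information from the validation data leaks into the preparation step.

```python
import numpy as np


def standardize(training_set, *other_sets):
    """Rescale all sets to mean 0 and standard deviation 1.

    The mean and standard deviation come from the training set only,
    and the same two numbers are applied to every other set.
    """
    average = np.average(training_set)
    standard_deviation = np.std(training_set)
    return tuple((data - average) / standard_deviation
                 for data in (training_set, *other_sets))


# Hypothetical stand-in for image data: pixel values from 0 to 255,
# one row per example. Real data would be loaded from disk instead.
rng = np.random.default_rng(seed=0)
X_train_raw = rng.integers(0, 256, size=(600, 784)).astype(np.float64)
X_validation_raw = rng.integers(0, 256, size=(100, 784)).astype(np.float64)

X_train, X_validation = standardize(X_train_raw, X_validation_raw)

print("Training mean:", X_train.mean())
print("Training standard deviation:", X_train.std())
```

After this step, the training inputs are centered around zero with a spread of about one. The validation inputs end up close to that range, but not exactly on it, because they were rescaled with the training set's statistics.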

Along the way, remember the testing strategy explained in The Zen of Testing. We have three sets of examples: training, validation, and testing. We won’t touch the test set for now; we’ll ignore it until the final test at the end of the process. Instead, during the development cycle, we’ll use the validation set to measure the network’s performance. In fact, the validation set is sometimes called the dev set because it’s used during development.
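As a rough illustration of the three-set strategy, the sketch below carves a block of held-out examples into a validation set and a test set. The function name `split_validation_test`, the half-and-half split, and the tiny stand-in arrays are assumptions for the sake of the example; other split ratios are equally valid.

```python
import numpy as np


def split_validation_test(X, Y):
    """Split held-out examples in half: first half validation, second half test."""
    middle = X.shape[0] // 2
    X_validation, X_test = np.split(X, [middle])
    Y_validation, Y_test = np.split(Y, [middle])
    return X_validation, Y_validation, X_test, Y_test


# Hypothetical held-out data: 10 examples with 4 input variables each,
# plus one label per example.
X_held_out = np.arange(40, dtype=np.float64).reshape(10, 4)
Y_held_out = np.arange(10).reshape(10, 1)

X_validation, Y_validation, X_test, Y_test = split_validation_test(
    X_held_out, Y_held_out)

# Use X_validation and Y_validation during the development cycle.
# Set X_test and Y_test aside until the final test.
print(X_validation.shape, X_test.shape)
```

Splitting once, up front, makes it harder to peek at the test set by accident while tuning.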
