Upgrade the Learner

Prepare data by adding more dimensions and upgrading the algorithm according to the updated dataset.

We'll cover the following...

Prepare data
Upgrade prediction
Upgrade the loss
Upgrade the gradient

After a mathematical detour, we can return to the work at hand. We want to upgrade our learning program to deal with multiple input variables. Let’s make a plan of action so that we do not get lost in the process:

First, we’ll load and prepare the multidimensional data, to feed them to the learning algorithm.
After preparing the data, we’ll upgrade all the functions in our code to use the new model. We’ll switch from a line to a more generic weighted sum, as mentioned in Adding More Dimensions.

Prepare data

ML may seem to be all about building amazing AIs. In reality, a large part of the job is preparing data for the learning algorithm. To do that, let’s start from the file that contains our dataset:

Each row in $X$ is an example, and each column is an input variable.

If we load the file with loadtxt(), as we did before, we’ll get a NumPy array for each column:

import numpy as np
x1, x2, x3, y = np.loadtxt("pizza_3_vars.txt", skiprows=1, unpack=True)
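
To make the effect of `skiprows` and `unpack` concrete, here is a minimal sketch that runs without the dataset file. It assumes a header line followed by whitespace-separated numeric columns; the in-memory `sample` and its values are made up for illustration and are not the contents of pizza_3_vars.txt.

```python
import io
import numpy as np

# Hypothetical stand-in for the dataset file: a header line plus
# whitespace-separated columns. The values are invented for illustration.
sample = io.StringIO(
    "x1 x2 x3 y\n"
    "1 2 3 10\n"
    "4 5 6 20\n"
    "7 8 9 30\n"
)

# skiprows=1 skips the header line; unpack=True returns one array per column
x1, x2, x3, y = np.loadtxt(sample, skiprows=1, unpack=True)

print(x1)  # first column: [1. 4. 7.]
print(y)   # last column:  [10. 20. 30.]
```

Without `unpack=True`, `loadtxt()` would instead return a single two-dimensional array with one row per line of the file.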

Arrays are NumPy’s distinctive feature. They are very flexible objects that can represent anything from a scalar (a single number) to a multidimensional structure. However, that same flexibility makes arrays somewhat hard to grasp at first. Next, we’ll see how to mold those four arrays into the $X$ and $Y$ variables. We recommend keeping NumPy’s documentation handy when doing this.

To determine the dimensions of an array, we can read its shape attribute:

x1.shape # => (30,)

All four columns have 30 elements, one for each example in pizza_3_vars.txt. That dangling comma is NumPy’s way of saying that these arrays have just one dimension.
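
The following sketch compares the shape of a one-dimensional array with that of a two-dimensional one. The arrays `a` and `m` are placeholders filled with zeros, not the pizza data.

```python
import numpy as np

a = np.zeros(30)        # one-dimensional array with 30 elements
m = np.zeros((30, 3))   # two-dimensional array: 30 rows, 3 columns

print(a.shape)  # (30,)
print(m.shape)  # (30, 3)

# The number of entries in the shape tuple is the number of dimensions
print(len(a.shape), a.ndim)  # 1 1
print(len(m.shape), m.ndim)  # 2 2

# (30,) is a one-element tuple; without the comma, (30) is just the integer 30
print(type((30,)), type((30)))
```

The shape is an ordinary Python tuple, so the trailing comma is simply how Python writes a tuple with a single element.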

Let’s build the $X$ matrix by joining the first three arrays together:

X = np.column_stack((x1, x2, x3))
X.shape # => (30, 3)
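
As a small illustration of what `column_stack()` does, the sketch below joins three short one-dimensional arrays. The values are hypothetical, chosen so that rows and columns are easy to tell apart.

```python
import numpy as np

# Hypothetical input variables, four examples each
x1 = np.array([1.0, 4.0, 7.0, 2.0])
x2 = np.array([2.0, 5.0, 8.0, 4.0])
x3 = np.array([3.0, 6.0, 9.0, 6.0])

# Each one-dimensional array becomes one column of the result
X = np.column_stack((x1, x2, x3))

print(X.shape)  # (4, 3): four examples, three input variables
print(X[0])     # first example (a row):           [1. 2. 3.]
print(X[:, 0])  # first input variable (a column): [1. 4. 7. 2.]
```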

Here are the first two rows of $X$ :

X[:2] # => array([[13., 26., 9.], [2., 14., 6.]])

NumPy’s indexes are powerful, and sometimes confusing. The notation $[:2]$ in this code is a shortcut for $[0:2]$, which means the rows with indexes from $0$ to $1$ ($2$ excluded), that is, the first two rows.
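
Here is a short sketch of that slicing notation on a hypothetical $4 \times 3$ matrix filled with the integers from 0 to 11:

```python
import numpy as np

X = np.arange(12).reshape(4, 3)  # hypothetical matrix: rows [0 1 2], [3 4 5], ...

print(X[:2])     # rows 0 and 1
print(X[0:2])    # the same rows, with the start index spelled out
print(X[:2, 1])  # column 1 of the first two rows: [1 4]
```

A second index after the comma selects columns, so the same notation can pick out part of a row as well as a range of rows.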

Now that we have taken care of $X$, let’s look at $y$, which still has the one-dimensional shape $(30,)$.

A useful rule of thumb is to avoid mixing NumPy matrices and one-dimensional arrays. Code that involves both can behave in surprising ways. For this reason, as soon as we have a one-dimensional array, it’s better to reshape it into a matrix with the reshape() function:

Y = y.reshape(-1, 1)
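
One example of the surprising behavior mentioned above, sketched with a hypothetical three-element array: subtracting a one-dimensional array from a one-column matrix does not fail, but silently broadcasts into a square matrix.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])  # one-dimensional, shape (3,)
Y = y.reshape(-1, 1)           # one-column matrix, shape (3, 1)

# Operands with the same shape are combined element by element
print((Y - Y).shape)  # (3, 1)

# Mixing a (3, 1) matrix with a (3,) array broadcasts to a 3x3 matrix
# instead of raising an error
print((Y - y).shape)  # (3, 3)
```

This is only one possible pitfall, but it shows why such a mix-up can go unnoticed: the code runs and the result simply has the wrong shape.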

The reshape() function takes the dimensions of the new array. If one dimension is -1, NumPy sets it to whatever makes the other dimensions fit. So the preceding line reshapes $y$ into a matrix $Y$ with $1$ column and as many rows as it takes to hold the current elements. The result is a $(30, 1)$ matrix.
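
The sketch below checks that shape and shows how NumPy infers the -1 dimension. The array `y` here is a hypothetical stand-in with 30 elements, not the labels loaded from the file.

```python
import numpy as np

y = np.arange(30.0)   # hypothetical labels, shape (30,)

Y = y.reshape(-1, 1)  # NumPy infers 30 rows for the single column
print(Y.shape)        # (30, 1)

print(y.reshape(30, 1).shape)  # (30, 1): the same result, stated explicitly
print(y.reshape(-1, 2).shape)  # (15, 2): 30 elements split into 2 columns

# A shape that cannot hold exactly 30 elements raises a ValueError
try:
    y.reshape(-1, 4)
except ValueError as err:
    print("cannot reshape:", err)
```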
