Upgrade the Learner

Prepare the data by adding more dimensions, then upgrade the algorithm to match the updated dataset.

After a mathematical detour, we can return to the work at hand. We want to upgrade our learning program to deal with multiple input variables. Let’s make a plan of action so that we don’t get lost in the process:

  1. First, we’ll load and prepare the multidimensional data, to feed them to the learning algorithm.
  2. After preparing the data, we’ll upgrade all the functions in our code to use the new model. We’ll switch from a line to a more generic weighted sum, as mentioned in Adding More Dimensions.
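As a rough preview of where step 2 is heading, here is a minimal sketch of a weighted sum over several input variables. The function name predict, the weights in w, and the input values are assumptions made for illustration, and the sketch leaves out any bias term for brevity.

```python
import numpy as np

def predict(X, w):
    # Weighted sum: for each row of X, multiply every input by its weight
    # and add the results together.
    return np.matmul(X, w)

# Hypothetical inputs (two examples, three variables) and hypothetical weights
X = np.array([[13.0, 26.0, 9.0],
              [2.0, 14.0, 6.0]])
w = np.array([1.0, 0.5, 2.0])

y_hat = predict(X, w)
print(y_hat)  # one prediction per row: [44. 21.]
```

With a single input variable and a single weight, this reduces to the slope term of a line.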

Prepare data

It would be nice if ML were all about building amazing AIs. In reality, a large part of the job is preparing data for the learning algorithm. To do that, let’s start from the file that contains our dataset:

In the previous chapters, this file had two columns, which we loaded into two arrays with NumPy’s loadtxt(). Now that we have multiple input variables, X needs to become a matrix like this:

pizza_3_vars.txt

Each row in X is an example, and each column is an input variable.

If we load the file with loadtxt(), as we did before, we’ll get a NumPy array for each column:

import numpy as np
x1, x2, x3, y = np.loadtxt("pizza_3_vars.txt", skiprows=1, unpack=True)
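To see what skiprows and unpack do without needing the data file, here is a small sketch that reads from an in-memory text buffer instead. The header names and the numbers in the buffer are made up for illustration and are not the contents of pizza_3_vars.txt.

```python
import io
import numpy as np

# Hypothetical stand-in for a data file: one header line, then four columns
sample = io.StringIO(
    "col_a col_b col_c target\n"
    "13 26 9 44\n"
    "2 14 6 23\n"
    "14 20 3 28\n"
)

# skiprows=1 skips the header line; unpack=True returns one array per column
x1, x2, x3, y = np.loadtxt(sample, skiprows=1, unpack=True)

print(x1)  # first column:  [13.  2. 14.]
print(y)   # last column:   [44. 23. 28.]
```

Without unpack=True, loadtxt() would return a single two-dimensional array with one row per line of the file.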

Arrays are NumPy’s distinctive feature. They are very flexible objects that can represent anything from a scalar (a single number) to a multidimensional structure. However, that same flexibility makes arrays somewhat hard to grasp at first. We’ll see how to mold those four arrays into the X and Y variables. We recommend keeping NumPy’s documentation handy while doing this.

To determine the dimensions of an array, we can use its shape attribute:

x1.shape # => (30,)

All four columns have 30 elements, one for each example in pizza_3_vars.txt. That dangling comma is NumPy’s way of indicating that these arrays have just one dimension.
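To make the meaning of shape more concrete, the following sketch compares arrays with zero, one, and two dimensions. The variable names and values are hypothetical and chosen only for illustration.

```python
import numpy as np

scalar = np.array(5.0)                 # zero dimensions
vector = np.array([1.0, 2.0, 3.0])     # one dimension
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])   # two dimensions

print(scalar.shape)  # ()
print(vector.shape)  # (3,)
print(matrix.shape)  # (2, 3)

# ndim reports the number of dimensions, which is the length of shape
print(vector.ndim, matrix.ndim)  # 1 2
```

The shape is an ordinary Python tuple, and a one-element tuple is always written with a trailing comma.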

Let’s build the X matrix by joining the first three arrays together:

X = np.column_stack((x1, x2, x3))
X.shape # => (30, 3)
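The same call can be tried on a smaller scale. In this sketch, the arrays a and b are hypothetical columns with three examples each. It shows that column_stack() turns each one-dimensional array into one column of the result.

```python
import numpy as np

# Hypothetical one-dimensional columns, three examples each
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

stacked = np.column_stack((a, b))

print(stacked.shape)   # (3, 2): three rows (examples), two columns (variables)
print(stacked[:, 0])   # the first column is the original array a
print(stacked[1])      # the second row pairs the second element of a and b
```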

Here are the first two rows of X:

X[:2] # => array([[13., 26., 9.], [2., 14., 6.]])

NumPy’s indexes are powerful, and sometimes confusing. The notation [:2] in this code is a shortcut for ...
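As a small, hedged illustration of row slicing, the sketch below applies the same notation to a hypothetical 4-by-3 matrix named M. It relies only on standard Python slice semantics, where an omitted start index defaults to the beginning.

```python
import numpy as np

# Hypothetical matrix with four rows and three columns
M = np.arange(12).reshape(4, 3)

first_two = M[:2]    # start index omitted: rows 0 and 1
same_rows = M[0:2]   # start index written out: the same two rows

print(first_two)
print(np.array_equal(first_two, same_rows))  # True
```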
