Put It All Together
Test the final gradient descent algorithm on the hyperspatial dataset
Checklist
Let’s check if we have everything in order:
- We wrote the code that prepares the data.
- We upgraded predict().
- We concluded that there is no need to upgrade loss().
- We upgraded gradient().
We can finally apply all those changes to our learning program:
import numpy as np

# computing the predictions
def predict(X, w):
    return np.matmul(X, w)

# calculating the loss
def loss(X, Y, w):
    return np.average((predict(X, w) - Y) ** 2)

# evaluating the gradient
def gradient(X, Y, w):
    return 2 * np.matmul(X.T, (predict(X, w) - Y)) / X.shape[0]

# performing the training phase for our classifier
def train(X, Y, iterations, lr):
    w = np.zeros((X.shape[1], 1))
    for i in range(iterations):
        print("Iteration %4d => Loss: %.20f" % (i, loss(X, Y, w)))
        w -= gradient(X, Y, w) * lr
    return w

# loading the data, then training the classifier for 100,000 iterations
x1, x2, x3, y = np.loadtxt("pizza_3_vars.txt", skiprows=1, unpack=True)
X = np.column_stack((x1, x2, x3))
Y = y.reshape(-1, 1)
w = train(X, Y, iterations=100000, lr=0.001)
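For reference, here is a compact summary, not spelled out in the lesson itself, of what the three functions compute in matrix notation, where m is the number of examples (the rows of X):

$$
\hat{y} = X w, \qquad
L(w) = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2, \qquad
\nabla L(w) = \frac{2}{m} X^{\mathsf{T}} (X w - y)
$$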
The code above is very similar to the code from the previous chapter. Aside from the part that loads and prepares the data, we have changed just three lines. Also note that the functions are generic: not only can they process the three-variable pizza dataset, they would work just as well with any number of input variables.
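To make that claim concrete, here is a small sketch that is not part of the original program: it reuses the very same train() on synthetic data with five input variables. The random data and the "true" weights are made up purely for illustration.

import numpy as np

# Sketch only: synthetic data with 5 input variables instead of 3.
# Assumes predict(), loss(), gradient(), and train() from above are defined.
np.random.seed(1234)
X = np.random.rand(100, 5)                  # 100 examples, 5 input variables
true_w = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
Y = np.matmul(X, true_w)                    # noiseless labels, no bias

# The same train() defined above, completely unchanged.
# (As with the pizza run, it prints the loss at every iteration.)
w = train(X, Y, iterations=20000, lr=0.01)
print(w)                                    # should land close to true_w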
After running the pizza program, here’s what we get:
Iteration 0 => Loss: 1333.56666666666660603369
Iteration 1 => Loss: 151.14311361881479456315
Iteration 2 => Loss: 64.99460808656147037254
…
Iteration 99999 => Loss: 6.89576133146784187034
The loss decreases at each iteration, and that’s a hint that the program is indeed learning. However, our job is not quite done yet. Remember that we removed the bias parameter at the beginning of this discussion to make things easier, and we know that we should not expect good predictions without the bias. Fortunately, putting the bias back is easier than we might think.
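As an aside, here is how the trained weights would be used on a new example. The input values below are made up, and, as just noted, the numbers won’t be trustworthy until the bias is back:

# Sketch only: one new example with made-up values for the three input columns.
new_example = np.array([[20.0, 14.0, 5.0]])   # shape (1, 3), same columns as X
print(predict(new_example, w))                # prediction for this one example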
Remove bias
So far, we have implemented this prediction formula:
...