Gradient Descent: The Batch Update
Update the parameters using batch gradient descent.
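In batch gradient descent, the gradient is computed over the entire training set and the parameters are updated once per epoch. With learning rate $\eta$, the update is:

$$
W \leftarrow W - \eta \frac{\partial L}{\partial W}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}
$$

For the sigmoid perceptron with binary cross-entropy loss used on this page, these derivatives reduce to $X^{T}(\hat{y} - y)$ and $\sum_i (\hat{y}_i - y_i)$, which is exactly what the gradient() function in the code below computes.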
Exploratory data analysis
We have two features, X1 and X2, and a label. There are ten data points, and each label is either 0 or 1. Our goal is to find a decision boundary that separates the two classes.
Let's look at the data.
Note: Remember, we can only apply the perceptron algorithm if the data is linearly separable. To check this, we plot the points on a graph and visualize the data.
Draw the data points on the graph to see whether the two classes can be separated by a straight line.
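A minimal plotting sketch, assuming the same ten points and labels that appear in the training code further down this page:

```python
import numpy as np
import matplotlib.pyplot as plt

# The same ten points and labels used in the training code below
X = np.array([[2.78, 2.55], [1.46, 2.36], [3.39, 4.40], [1.38, 1.85], [3.06, 3.00],
              [7.62, 2.75], [5.33, 2.08], [6.92, 1.77], [8.67, -0.24], [7.67, 3.50]])
Y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

plt.figure()
plt.scatter(X[Y == 0, 0], X[Y == 0, 1], label="class 0")  # points labeled 0
plt.scatter(X[Y == 1, 0], X[Y == 1, 1], label="class 1")  # points labeled 1
plt.xlabel("X1")
plt.ylabel("X2")
plt.legend()
plt.show()
```

In this dataset, the class-0 points all have X1 below about 3.4 and the class-1 points all have X1 above about 5.3, so a straight line can separate the two classes and the perceptron is applicable.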
Coding the perceptron training rule
```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(z):
    """The sigmoid activation function on the input z"""
    return 1 / (1 + np.exp(-z))

def forward_propagation(X, W, b):
    """Computes the forward propagation operation of a perceptron and
    returns the output after applying the sigmoid activation function"""
    weighted_sum = np.dot(X, W) + b  # calculate the weighted sum of X and W
    prediction = sigmoid(weighted_sum)  # apply the sigmoid activation function
    return prediction

def calculate_error(y, y_predicted):
    """Computes the binary cross entropy error"""
    loss = np.sum(- y * np.log(y_predicted) - (1 - y) * np.log(1 - y_predicted))  # calculate error
    return loss

def gradient(X, Y, Y_predicted):
    """Gradient of weights and bias"""
    Error = Y_predicted - Y  # calculate error
    dW = np.dot(X.T, Error)  # derivative of error w.r.t. weights, i.e., (output - target) * x
    db = np.sum(Error)  # derivative of error w.r.t. bias
    return dW, db  # return derivatives of weights and bias

def update_parameters(W, b, dW, db, learning_rate):
    """Updating the weights and bias value"""
    W = W - learning_rate * dW  # update weights
    b = b - learning_rate * db  # update bias
    return W, b  # return weights and bias

def train(X, Y, learning_rate, W, b, epochs, losses):
    """Training the perceptron using batch update"""
    for i in range(epochs):  # loop over the total epochs
        Y_predicted = forward_propagation(X, W, b)  # compute forward pass
        losses[i, 0] = calculate_error(Y, Y_predicted)  # calculate error
        dW, db = gradient(X, Y, Y_predicted)  # calculate gradient
        W, b = update_parameters(W, b, dW, db, learning_rate)  # update parameters
    return W, b, losses

# Initialize parameters
# features
X = np.array([[2.78, 2.55],
              [1.46, 2.36],
              [3.39, 4.40],
              [1.38, 1.85],
              [3.06, 3.00],
              [7.62, 2.75],
              [5.33, 2.08],
              [6.92, 1.77],
              [8.67, -0.24],
              [7.67, 3.50]])
Y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # target labels
weights = np.array([0.0, 0.0])  # weights of perceptron
bias = 0.0  # bias value
epochs = 10000  # total epochs
learning_rate = 0.01  # learning rate
losses = np.zeros((epochs, 1))  # array to store the loss per epoch

print("Before training")
print("weights:", weights, "bias:", bias)
print("Target labels:", Y)

W, b, losses = train(X, Y, learning_rate, weights, bias, epochs, losses)

# Evaluating the performance
plt.figure()
plt.plot(losses)
plt.xlabel("EPOCHS")
plt.ylabel("Loss value")
plt.savefig('output/legend.png')  # save the figure before showing it
plt.show()

print("\nAfter training")
print("weights:", W, "bias:", b)

# Predict values
A2 = forward_propagation(X, W, b)
pred = (A2 > 0.5) * 1
print("Predicted labels:", pred)
```
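Once the training loop finishes, the learned W and b can classify a new point by running forward_propagation and thresholding the sigmoid output at 0.5. Here is a small sketch meant to be run after the code above; the point [2.0, 3.0] is made up purely for illustration:

```python
# Hypothetical new point, chosen only for illustration
new_point = np.array([2.0, 3.0])
probability = forward_propagation(new_point, W, b)  # sigmoid output between 0 and 1
predicted_label = int(probability > 0.5)            # threshold at 0.5
print("probability:", probability, "predicted label:", predicted_label)
```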
Explanation
Initializing the parameters
The table summarizes the initialized parameters:
| Variables | Definition |
| --- | --- |
| X | An input feature array of size 10 * 2 |
| Y |