Stochastic Gradient Descent (SGD)

Learn how to apply the threshold function and calculate errors.

Naming and initializations

The Python implementation of the simple perceptron is shown in the code below. The program starts by defining the training problem (the training dataset) in the feature array X and the desired label vector Y. Each column of the matrix X holds the feature vector of one sample, so the columns together represent all the training samples. We then introduce and initialize some variables, specifically the number of input nodes Ni and the number of output nodes No. The weight matrix to the output nodes, wo, is initialized randomly, and dwo holds the changes (gradients) of the weights. Lastly, do is the delta term of the output node.
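As a minimal sketch of the naming and initialization step described above, the setup might look like the following. The dataset (a logical OR problem with four samples), the seeded random generator, and the shape chosen for do are assumptions made for illustration; only the names X, Y, Ni, No, wo, dwo, and do come from the text.

```python
import numpy as np

# Training data (assumed example: logical OR).
# Each column of X is the feature vector of one sample.
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]])
# Desired labels, one per sample (column).
Y = np.array([[0, 1, 1, 1]])

# Model specification
Ni = X.shape[0]  # number of input nodes (features per sample)
No = Y.shape[0]  # number of output nodes

# Parameter and array initialization
rng = np.random.default_rng(0)      # seeded only to make the sketch reproducible
wo = rng.standard_normal((No, Ni))  # weights to the output nodes, random start
dwo = np.zeros(wo.shape)            # weight changes (gradients), same shape as wo
do = np.zeros((No, X.shape[1]))     # delta term of the output node (assumed: one per sample)
```

Reading Ni and No from the array shapes, rather than hard-coding them, keeps the weight matrix consistent with the data if the training set changes.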
