Prepare Data: Random Initial Weights
Learn how to initialize the weights randomly and how to avoid the problems that arise from poor initial weight selection.
Random initialization of weights
The same argument applies here as with the inputs and outputs. We should avoid large initial weights because they push large signals into the activation function, leading to the saturation we just talked about and a reduced ability to learn better weights.
We could choose the initial weights randomly and uniformly from a small range, say -1.0 to +1.0. That would be a much better idea than using a very large range, say -1000 to +1000.
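As a quick illustration, here is a minimal sketch of drawing such weights with NumPy. The layer sizes are made-up values chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical layer sizes, purely for illustration
input_nodes = 3
hidden_nodes = 4

# Weights drawn uniformly from -1.0 to +1.0:
# one row per hidden node, one column per input node
weights_input_hidden = rng.uniform(-1.0, 1.0, size=(hidden_nodes, input_nodes))
print(weights_input_hidden)
```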
Mathematicians and computer scientists have done the math to work out rules of thumb for setting the random initial weights for specific network shapes and activation functions.
We won't go into the details of that, but the core idea is this: if many signals feed into a node, as they do in a neural network, and those signals are already well behaved, not too large and sensibly distributed, then the weights should help keep them well behaved as they are combined and passed through the activation function. A sketch of one such rule of thumb follows.
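One widely used rule of this kind, which the text above does not spell out, is to sample each weight from a zero-centred normal distribution whose standard deviation is the inverse of the square root of the number of incoming links to a node. A minimal sketch, assuming NumPy and made-up layer sizes:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical layer sizes, purely for illustration
input_nodes = 3
hidden_nodes = 4

# Rule of thumb: standard deviation of 1 / sqrt(number of incoming links),
# so the combined signal into each node stays in a comfortable range
# for the activation function.
std_dev = 1.0 / np.sqrt(input_nodes)
weights_input_hidden = rng.normal(0.0, std_dev, size=(hidden_nodes, input_nodes))
print(weights_input_hidden)
```

The more links feed into a node, the smaller each individual weight becomes, which keeps the summed signal from growing large enough to saturate the activation function.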