Set Up a Learning Rate for Training a Classifier
Learn how a learning rate is used while training a classifier.
Fixing the problem
Unfortunately, we can’t completely rely on the very last training example. This is an important idea in machine learning. To improve the results, we moderate the updates. This means that instead of jumping enthusiastically to each new value, we take only a fraction of the change. That way, we move in the direction that the training example suggests, but we do so cautiously and keep some of the previous value, which we potentially arrived at through many training iterations. We saw this idea of moderating our refinements before with the simpler miles-to-kilometers predictor, where we nudged the parameter by just a fraction of the actual error.
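The moderation idea can be sketched for the miles-to-kilometers predictor mentioned above. This is a minimal illustration, not the text's own code: the model form `km = c * miles`, the constant names, and the starting values are all assumptions made here for demonstration.

```python
# A sketch of moderated updates for a miles-to-kilometers predictor
# of the assumed form km = c * miles. All names and values here are
# illustrative, not taken from the lesson.

TRUE_FACTOR = 1.609344   # actual miles-to-kilometers conversion
L = 0.5                  # learning rate: keep only half of each suggested change

c = 0.5                  # initial guess for the parameter
miles = 100.0            # one training example: 100 miles

for step in range(10):
    predicted = c * miles
    target = TRUE_FACTOR * miles
    error = target - predicted
    # The full correction would be error / miles; moderating by L means
    # we move toward the suggestion while keeping some of the old value.
    c += L * (error / miles)
    print(f"step {step}: c = {c:.6f}, error = {error:.3f}")
```

Because each step closes only half of the remaining gap, the parameter approaches the true factor smoothly instead of jumping straight to whatever the last example suggests.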
Moderation has another very powerful and useful side effect. When the training data itself can’t be trusted to be perfectly true and contains errors or noise, both of which are normal in real-world measurements, using moderation can dampen the impact of those errors or noise. It smooths them out. Let’s rerun our calculation, but this time we’ll add a moderation into the update formula:
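One plausible form of the moderated update, assuming (as in the unmoderated version of this derivation) that A is the classifier's slope parameter, E the error on the current example, x that example's input, and L the moderating factor:

```latex
\Delta A = L \cdot \frac{E}{x}
```

Without L, the update \(\Delta A = E/x\) would make the slope fit the latest example exactly; multiplying by a fraction L retains part of the previously learned value.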
The moderating factor is often called a learning rate, and we’ve called it L. Let’s pick L = 0.5 as a reasonable fraction just to get started. It means we only update half as much as we would have done without moderation.
Running through it again, we have an initial ...