Prepare Data: Inputs and Outputs
Learn how to prepare the training data, initial random weights, and design the outputs.
Not all attempts at using neural networks will work well. Many of the reasons they don't can be addressed by thinking about the training data and the initial weights, and by designing a good output scheme. Let's look at each of these in turn.
Prepare inputs
Let’s have a look at the diagram of the sigmoid activation function below. We can see that if the inputs are large, the activation function gets very flat.
A very flat activation function is problematic because we use the gradient to learn new weights. Look back at the expression for the weight changes: it depends on the gradient of the activation function. A small gradient limits the network's ability to learn; when this happens, the network is said to be saturated. ...
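To make the flattening concrete, here is a small sketch (the function names are illustrative, not from the text) that evaluates the sigmoid's gradient at a few input values. The gradient peaks at 0.25 for an input of zero and collapses toward zero as the input grows, which is exactly the saturation effect described above.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_gradient(x):
    """Derivative of the sigmoid: sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Gradient shrinks rapidly as the input magnitude grows
for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"input = {x:5.1f}  gradient = {sigmoid_gradient(x):.6f}")
```

Because the weight-update expression multiplies by this gradient, an input of 10 yields an update roughly 5,000 times smaller than an input of 0, so learning effectively stalls. This is why inputs are typically rescaled into a small range before training.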