Initialize the Weights
Discover why it is important to initialize weights carefully and avoid dead neurons while building a neural network.
Fearful symmetry
Here is one rule to keep in mind: never initialize all the weights in a neural network with the same value. The reason for that recommendation is subtle, and comes from the matrix multiplications in the network. For example, take a first matrix whose entries are all different, and a second matrix whose cells all hold the same value, say 0.5:

$$
\begin{pmatrix}1 & 2\\ 3 & 4\end{pmatrix}
\begin{pmatrix}0.5 & 0.5\\ 0.5 & 0.5\end{pmatrix}
=
\begin{pmatrix}1.5 & 1.5\\ 3.5 & 3.5\end{pmatrix}
$$
We don’t need to remember the details of matrix multiplication (we can review them in the Multiplying matrices section). The interesting detail in this example is that even though the numbers in the first matrix are all different, the result has two identical columns, because of the uniformity of the second matrix. In general, if the second matrix in the multiplication has the same value in every cell, then every row of the result will contain a single repeated value.
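We can check this effect numerically. The sketch below (the specific numbers are illustrative) multiplies a matrix with all-different entries by a matrix whose cells are all the same value, and shows that the columns of the result come out identical:

```python
import numpy as np

# First matrix: all entries different.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Second matrix: the same value in every cell,
# like a layer whose weights were all initialized identically.
W = np.full((2, 2), 0.5)

result = A @ W
print(result)
# Each row holds a single repeated value, so the columns are identical:
# [[1.5 1.5]
#  [3.5 3.5]]
```

Each entry of a row is the dot product of that row of `A` with one column of `W`, and every column of `W` is the same, so every entry in the row is the same.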
Now imagine that the first and second matrices are, respectively, the inputs and the first layer’s weights of a neural network. After the multiplication, the resulting matrix passes through a sigmoid and becomes the hidden layer h. Because h has the same values in each row, all the hidden nodes of the network carry the same value. In other words, when we initialize all the weights with the same value, we force every hidden node to compute the same output. Since identical nodes also receive identical gradient updates during training, they stay identical forever, and the layer behaves as if it contained a single node.