Choose the Right Weights Iteratively

Derive a simplified expression for error differentiation using the sigmoid function to find the right weights.

Differentiate the error

Choosing the right weights directly is too difficult. An alternative approach is to iteratively improve the weights by descending the error function and taking small steps. Each step is in the direction of the greatest downward slope from our current position.
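The idea of taking small downhill steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-weight error surface `error`, the step size `learning_rate`, and the number of steps are all made up for the example and are not taken from the lesson.

```python
def error(w):
    # Hypothetical one-weight error surface with its minimum at w = 3.
    return (w - 3.0) ** 2


def slope(w, h=1e-6):
    # Central finite-difference estimate of dE/dw at the current position.
    return (error(w + h) - error(w - h)) / (2 * h)


def descend(w, learning_rate=0.1, steps=50):
    # Repeatedly take a small step against the slope, i.e. downhill.
    for _ in range(steps):
        w -= learning_rate * slope(w)
    return w


w_final = descend(w=0.0)
print(w_final)  # ends up close to the minimum of the assumed surface
```

A smaller `learning_rate` takes more steps to get near the minimum, while a much larger one can overshoot it, which is why the steps are kept small.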

This means that the error function didn’t need to sum all the output nodes in the first place. The reason is that the output of a node only depends on the connected links and hence their weights. This fact is sometimes glossed over, and sometimes the error function is simply stated without an explanation.

Here is our simpler expression:

$$\frac{\partial E}{\partial w_{jk}} = \frac{\partial}{\partial w_{jk}}(t_k - o_k)^2$$
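The claim that the sum over output nodes can be dropped can be checked numerically. The sketch below assumes a single layer of sigmoid output nodes and uses made-up values for the incoming signals, weights, and targets. It estimates the slope with respect to one weight twice: once using the error summed over every output node, and once using only the term for output node k.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def outputs(weights, incoming):
    # weights[j][k] is the link from incoming node j to output node k.
    n_out = len(weights[0])
    return [
        sigmoid(sum(weights[j][k] * incoming[j] for j in range(len(incoming))))
        for k in range(n_out)
    ]


def full_error(weights, incoming, targets):
    # Squared error summed over all output nodes.
    return sum((t - o) ** 2 for t, o in zip(targets, outputs(weights, incoming)))


def single_error(weights, incoming, targets, k):
    # Squared error of output node k only.
    return (targets[k] - outputs(weights, incoming)[k]) ** 2


def slope_wrt_weight(fn, weights, j, k, h=1e-6):
    # Central finite difference with respect to the single weight w_jk.
    up = [row[:] for row in weights]
    down = [row[:] for row in weights]
    up[j][k] += h
    down[j][k] -= h
    return (fn(up) - fn(down)) / (2 * h)


# Illustrative values only (assumptions, not from the lesson).
incoming = [0.4, 0.9]
weights = [[0.3, -0.2, 0.5], [0.1, 0.8, -0.6]]
targets = [0.9, 0.1, 0.5]
j, k = 1, 2

grad_full = slope_wrt_weight(lambda w: full_error(w, incoming, targets), weights, j, k)
grad_single = slope_wrt_weight(lambda w: single_error(w, incoming, targets, k), weights, j, k)
print(grad_full, grad_single)  # the two estimates agree up to rounding
```

The other output nodes do not depend on the chosen weight, so their error terms contribute nothing to the slope.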

Now, we will do a bit of calculus.

That $t_k$ part is a constant, so it doesn’t vary as $w_{jk}$ varies. This means $t_k$ isn’t a function of $w_{jk}$. If we think about it, it would be really strange if the truth examples providing the target values changed depending on the weights. That leaves the $o_k$ part, which we know depends on $w_{jk}$ because the weights are used to feed the signal forward to become the outputs $o_k$.

We’ll use the chain rule to break this differentiation task into more manageable pieces:

$$\frac{\partial E}{\partial w_{jk}} = \frac{\partial E}{\partial o_k} \cdot \frac{\partial o_k}{\partial w_{jk}}$$ ...
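The chain-rule split can be sanity-checked numerically. The sketch below is a rough check under stated assumptions: one sigmoid output node, with made-up values for the incoming signal `o_j`, the contribution from the other links `other_input`, the target `t_k`, and the weight `w_jk`. Each factor is estimated with a finite difference, and their product is compared with a direct estimate of the slope of the error with respect to the weight.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def output(w_jk, o_j=0.9, other_input=0.2):
    # o_k as a function of one weight; o_j and other_input are assumed values.
    return sigmoid(w_jk * o_j + other_input)


def error_from_output(o_k, t_k=0.5):
    # E as a function of the output alone; t_k is a constant target.
    return (t_k - o_k) ** 2


def derivative(fn, x, h=1e-6):
    # Central finite-difference estimate of d fn / dx.
    return (fn(x + h) - fn(x - h)) / (2 * h)


w_jk = -0.6
o_k = output(w_jk)

dE_do = derivative(error_from_output, o_k)  # first factor: dE/do_k
do_dw = derivative(output, w_jk)            # second factor: do_k/dw_jk
chain_rule_product = dE_do * do_dw

# Direct estimate of dE/dw_jk through the composed function.
dE_dw_direct = derivative(lambda w: error_from_output(output(w)), w_jk)

print(chain_rule_product, dE_dw_direct)  # the two estimates agree closely
```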
