Appendix D: Unstable Learning

Learn about unstable learning and gradient descent for adversarial training.

Is gradient descent suitable for training GANs?

When training neural networks, we use gradient descent to follow a path down a loss function toward the combination of learnable parameters that minimizes the error. This is a very well-researched area, and today's techniques are sophisticated. The Adam optimiser is a good example.
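As a reminder of what "following a path down a loss function" means, here is a minimal sketch of plain gradient descent on a one-parameter loss. The loss `(w - 3)^2`, the learning rate, and the step count are illustrative assumptions, not values from this appendix.

```python
def loss(w):
    # Hypothetical loss with a single minimum at w = 3.
    return (w - 3.0) ** 2


def loss_gradient(w):
    # Derivative of the loss with respect to w.
    return 2.0 * (w - 3.0)


def gradient_descent(w_start, learning_rate=0.1, steps=100):
    # Repeatedly step against the gradient, i.e. downhill on the loss.
    w = w_start
    for _ in range(steps):
        w -= learning_rate * loss_gradient(w)
    return w


w_final = gradient_descent(w_start=0.0)
print(w_final, loss(w_final))
```

With a single objective like this, every step reduces the loss and the parameter settles at the minimum.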

The dynamics of a GAN are different from a simple neural network. The generator and discriminator networks are trying to achieve opposing objectives. There are parallels between a GAN and adversarial games where one player is trying to maximize an objective while the other is trying to minimize it, each undoing the benefit of the opponent’s previous move.

Is the gradient descent method suitable for such adversarial games? This might seem like an unnecessary question, but the answer is rather interesting.

Simple adversarial example

The following is a very simple objective function:

$$f = x \cdot y$$

One player has control over the values of $x$ and is trying to maximize the objective $f$. A second player has control over $y$ and is trying to minimize the objective $f$.
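To make the two roles concrete, the following sketch plays one move for each player by hand. The starting point and the moves themselves are arbitrary choices for illustration.

```python
def f(x, y):
    # The shared objective: the x-player wants it large, the y-player wants it small.
    return x * y


x, y = 1.0, 1.0
start = f(x, y)

# The x-player raises x while y is positive, which increases f.
x = 2.0
after_x_move = f(x, y)

# The y-player responds by making y negative, which pushes f below where it started.
y = -1.0
after_y_move = f(x, y)

print(start, after_x_move, after_y_move)
```

Each move helps the player who makes it, and the reply then undoes that gain.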

Let’s visualize this function to get a feel for it. The following picture shows a surface plot of $f = x \cdot y$ from three different angles.

We can see that the surface of $f = x \cdot y$ is a saddle. That means that along one direction the values rise then fall, but in another direction, the values fall then rise.
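The saddle shape can also be checked numerically without a plot. The sketch below samples $f$ along the two diagonals $y = x$ and $y = -x$; the choice of these two directions and of the sample points is an assumption made for illustration.

```python
def f(x, y):
    return x * y


samples = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Along y = x the function equals t^2: values fall to zero, then rise.
along_diagonal = [f(t, t) for t in samples]

# Along y = -x the function equals -t^2: values rise to zero, then fall.
along_anti_diagonal = [f(t, -t) for t in samples]

print(along_diagonal)
print(along_anti_diagonal)
```

The origin is the lowest point along one diagonal and the highest point along the other, which is what makes it a saddle.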

The following picture shows the same function from above, using colors to indicate the values of $f$. Also marked are the gradient directions, which point the way $f$ increases.
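The gradient behind those arrows is simple to write down: $\partial f / \partial x = y$ and $\partial f / \partial y = x$. The sketch below compares that analytic gradient with a central finite-difference estimate. The helper names, the test point, and the step size `h` are illustrative assumptions.

```python
def f(x, y):
    return x * y


def gradient(x, y):
    # Analytic gradient of f = x * y: (df/dx, df/dy) = (y, x).
    return (y, x)


def numerical_gradient(x, y, h=1e-6):
    # Central finite differences, used only as a sanity check.
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (df_dx, df_dy)


point = (2.0, -3.0)
analytic = gradient(*point)
estimated = numerical_gradient(*point)
at_origin = gradient(0.0, 0.0)

print(analytic, estimated, at_origin)
```

The gradient is zero only at the origin, the center of the saddle.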

If we used our intuition to find a good solution to this adversarial game, we would probably say the best answer is the middle of that saddle at $(x, y) = (0, 0)$. At this point, if one player sets $x = 0$, the second player can’t affect the value of $f$, no matter what value of $y$ is chosen. The same applies if $y = 0$: no value of $x$ can change the value of $f$. The actual value of ...