Optimizations and Learning Rate

Explore different optimization methods and how to adjust the learning rate.

Here, we will only discuss gradient-based optimization methods, which are the ones most commonly used in GANs. Different gradient methods have their own strengths and weaknesses; there is no universal optimization method that suits every problem, so we should choose the one that fits the practical problem at hand.

Types of optimization methods

Let’s have a look at some now:

  1. SGD (calling optim.SGD with momentum=0 and nesterov=False): It is fast and works well for shallow networks. However, it can be very slow for deeper networks and may not even converge for them:

$$\theta_{t+1} = \theta_t - \eta \nabla J(\theta_t)$$

In this equation, $\theta_t$ denotes the parameters at iteration step $t$, $\eta$ is the learning rate, and $\nabla J$ is the gradient of the objective function $J$.
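As a rough sketch, this is how plain SGD can be configured in PyTorch; the toy model, data, and learning rate below are illustrative assumptions, not values from the text:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Toy model and objective (illustrative only).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
criterion = nn.MSELoss()

# momentum=0 and nesterov=False reduce optim.SGD to the vanilla update:
# theta_{t+1} = theta_t - eta * grad J(theta_t)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0, nesterov=False)

x = torch.randn(64, 10)  # toy inputs
y = torch.randn(64, 1)   # toy targets

for step in range(100):
    optimizer.zero_grad()           # clear gradients from the previous step
    loss = criterion(model(x), y)   # forward pass; the loss plays the role of J
    loss.backward()                 # compute the gradient of J w.r.t. the parameters
    optimizer.step()                # apply theta <- theta - lr * gradient
```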

  2. Momentum (calling optim.SGD with a momentum argument larger than 0 and nesterov=False): It is one of the most commonly used optimization methods. It combines the update of the previous step with the gradient at the current step so that it takes a smoother trajectory than SGD. The training speed of Momentum is often ...