Progressive GAN
Understand the workings of the progressive GAN and how it can be implemented using TensorFlow 2.0.
GANs are powerful systems for generating high-quality samples, examples of which we have seen in the previous sections. Different works have utilized this adversarial setup to generate samples from distributions like CIFAR-10, celeb_a, LSUN-bedrooms, and so on (we covered examples using MNIST for explanation purposes). Some works, like Lap-GANs, focused on generating higher-resolution output samples, but they lacked perceived output quality and introduced additional training challenges. Progressive GANs (also called Pro-GANs or PG-GANs) were presented by Karras et al. in their work titled “Progressive Growing of GANs for Improved Quality, Stability, and Variation.”
The method presented in this work not only mitigated many of the challenges present in earlier works but also offered a remarkably simple solution to the problem of generating high-quality output samples. The paper made a number of impactful contributions, some of which we'll cover in detail in the following subsections.
The overall method
The software engineering approach to tough technical problems is often to break them down into simpler, more granular tasks. Pro-GANs likewise tackle the complex problem of generating high-resolution samples by breaking it into smaller, simpler sub-problems. The major issue with high-resolution images is the huge number of modes, or details, such images contain, which makes it very easy for the discriminator to tell generated samples apart from real data (a perceived-quality issue). This, along with the memory requirements of large images, makes building a generator with enough capacity to train well on such datasets a very tough task.
To tackle these issues, Karras et al. presented a method that grows both the generator and discriminator models as training progresses from lower to higher resolutions.
Generating higher-resolution images step by step is not an entirely new idea. Many prior works used similar techniques, but the method proposed in this work is most similar to the layer-wise training of autoencoders.
The system learns by starting with lower-resolution samples and a generator-discriminator pair set up as mirror images of each other (architecture-wise). At a lower resolution (say, 4×4), training is faster and more stable because there is far less information for the networks to capture. Layers are then added incrementally to both models, doubling the resolution step by step until megapixel (1024×1024) outputs are reached.
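To make this mirrored, resolution-doubling structure concrete, the following is a minimal Keras sketch with hypothetical helper functions. It only illustrates the symmetric up/downsampling pattern; the actual Pro-GAN architecture additionally uses techniques such as equalized learning rates and pixelwise normalization, and it reuses the already-trained lower-resolution layers when it grows.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(target_resolution, latent_dim=512, filters=128):
    """Toy generator: starts at 4x4 and doubles the spatial
    resolution with each block until target_resolution is reached."""
    z = tf.keras.Input(shape=(latent_dim,))
    x = layers.Dense(4 * 4 * filters)(z)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.Reshape((4, 4, filters))(x)
    res = 4
    while res < target_resolution:
        x = layers.UpSampling2D()(x)                  # 2x spatial growth
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
        res *= 2
    rgb = layers.Conv2D(3, 1, activation="tanh")(x)   # "to-RGB" head
    return tf.keras.Model(z, rgb)

def build_discriminator(input_resolution, filters=128):
    """Mirror image of the generator: halves the resolution until 4x4."""
    img = tf.keras.Input(shape=(input_resolution, input_resolution, 3))
    x = layers.Conv2D(filters, 1)(img)                # "from-RGB" head
    res = input_resolution
    while res > 4:
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
        x = layers.AveragePooling2D()(x)              # 2x spatial shrink
        res //= 2
    x = layers.Flatten()(x)
    return tf.keras.Model(img, layers.Dense(1)(x))    # real/fake score

# Each training stage uses a deeper, higher-resolution pair:
g8, d8 = build_generator(8), build_discriminator(8)
g16, d16 = build_generator(16), build_discriminator(16)
```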
Despite these improvements, the training time and compute requirements of Pro-GANs are huge: generating such megapixel outputs can take up to a week of training, even on multiple GPUs.
In the following subsections, we'll explore the important contributions and implementation-level details. To keep the compute requirements in check, we'll cover component-level details but use a pre-trained model from TensorFlow Hub (instead of training one from scratch). This will enable us to focus on the important details and leverage pre-built blocks as required.
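As a preview of this workflow, a pre-trained Pro-GAN generator can be loaded directly from TensorFlow Hub. The snippet below assumes the publicly available progan-128 module (trained to produce 128×128 face images) and that its default signature accepts a batch of 512-dimensional latent vectors:

```python
import tensorflow as tf
import tensorflow_hub as hub

# Load the pre-trained Pro-GAN generator from TensorFlow Hub.
progan = hub.load("https://tfhub.dev/google/progan-128/1").signatures["default"]

latents = tf.random.normal([4, 512])   # a batch of random latent vectors
images = progan(latents)["default"]    # -> (4, 128, 128, 3), values in [0, 1]
print(images.shape)
```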
Progressive growth and smooth fade-in
Pro-GANs were introduced as networks that increase the resolution step by step by adding layers to the generator and discriminator models. But how does that actually work? The following is a step-by-step explanation, with a small code sketch of the fade-in after the steps:
The generator and discriminator models start with a resolution of 4×4, and both are first trained at this low spatial resolution.
...
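To make the fade-in concrete, here is a minimal sketch (a hypothetical helper, not the paper's exact implementation) of how the output of a newly added higher-resolution block can be blended with the upsampled output of the previous stage. The weight alpha is ramped linearly from 0 to 1 over the transition, so the new layers are faded in smoothly instead of shocking the already-trained network:

```python
import tensorflow as tf

def faded_rgb(old_rgb, new_rgb, alpha):
    """Blend outputs during a growth transition.

    old_rgb: output of the previous (stable) low-resolution to-RGB head
    new_rgb: output of the newly added higher-resolution to-RGB head
    alpha:   fade-in weight, ramped from 0 to 1 during training
    """
    # Upsample the old output to the new resolution, then mix:
    # at alpha=0 only the old path contributes; at alpha=1 the
    # new layers have fully faded in.
    old_up = tf.image.resize(old_rgb, tf.shape(new_rgb)[1:3],
                             method="nearest")
    return alpha * new_rgb + (1.0 - alpha) * old_up
```

The same blending is applied in reverse on the discriminator side, so both networks transition to the new resolution gradually.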