What We’ve Learned So Far

Explore the influence of GANs on machine learning, model architectures, loss functions, and training techniques.

The GAN framework has revolutionized the fields of machine learning and deep learning. Throughout the chapters in this course, we implemented many models in many domains that were part of this revolution, including images, text, and audio.

Generative models

We learned about deep learning and generative models in general, and their applications in AI. We covered many topics, including GANs, autoregressive models, variational autoencoders, and reversible flow models.

We described in detail the building blocks of GANs, including their strengths and limitations. We also learned how to visualize their results and evaluate them qualitatively and quantitatively.

Architectures

We learned how conditional GANs (CGANs) work and how to implement them. We also learned how to set up and train GANs, which allows us to control the characteristics of GAN outputs, such as vector arithmetic in latent space.

We learned how to implement several model architectures, including DCGAN, ResNet, and U-Net.

Loss functions

We learned about multiple loss functions, the problems they claim to solve, and how to implement them. We looked at the original GAN loss, the Wasserstein GAN (WGAN), the Wasserstein GAN with gradient penalty (WGAN-GP), the Least Squares GAN (LSGAN), and the Relativistic GAN (RGAN).

We also looked into adding terms to the GAN loss function, including feature matching loss, reconstruction loss, and VGG loss.

Tricks of the trade

We learned how to train GANs and circumvent the many challenges that are involved in doing so. We learned how to first identify problems with GANs and then solve them using tricks of the trade.

Implementations

We learned to implement several models in different domains. In the image domain, we implemented pix2pix, pix2pixHD, and progressive growing of GANs. In the text domain, we implemented natural language generation with GANs. In the audio domain, we implemented sound enhancement GANs.

Combining the image and text domains, we implemented text-to-image synthesis. Finally, we also looked into identifying GAN samples.

We have done a lot, but there is still a lot to do. GANs are an ongoing field of research, and there are many questions that still need to be answered! So let’s look at them.

Get hands-on with 1200+ tech skills courses.