About the Course

Get introduced to the course contents, intended audience, and overall structure.

Generative adversarial networks (GANs) have revolutionized the fields of machine learning and deep learning. This course will be our first step toward understanding GAN architectures and tackling the challenges involved in training them.

This course opens with an introduction to deep learning and generative models, and their applications in artificial intelligence (AI). We will then learn how to build, evaluate, and improve our first GAN with the help of easy-to-follow examples. The next few chapters will guide us through training a GAN model to produce and improve high-resolution images. We will also learn how to implement conditional GANs that give us the ability to control the characteristics of GAN outputs. We will further build on our knowledge by exploring a new training methodology for progressive growing of GANs. Moving on, we’ll gain insights into state-of-the-art models in image synthesis, speech enhancement, and natural language generation using GANs. Finally, we’ll learn how to tell GAN-generated samples apart from real data using TequilaGAN.

By the end of this course, we will be well-versed with the latest advancements in the GAN framework using various examples and datasets, and we will have the skills needed to implement GAN architectures for several tasks and domains, including computer vision, natural language processing (NLP), and audio processing.

Who this course is for

This course is for machine learning practitioners, deep learning researchers, and AI enthusiasts who are looking for a perfect mix of theory and hands-on content in order to implement GANs using Keras. A working knowledge of Python is expected.

Prerequisites

This course will give an overview of data analysis in Python. It will take us through the main libraries of Python’s data science stack. It will explain how to use various Python tools to analyze, visualize, and process data effectively, and we will learn about the importance of using GPUs in deep learning. Prior experience with Python development, including the relevant software and hardware tooling, is expected.

What this course covers

  • Deep Learning Basics and Environment Setup: This chapter contains essential knowledge for building and training deep learning models, including GANs. We will also learn how to set up our deep learning Python and Keras environments for upcoming projects. Finally, we will learn about the importance of using GPUs in deep learning and how to choose the platform that best suits us.

  • Introduction to Generative Models: This chapter covers the basics of generative models, including GANs, variational autoencoders, autoregressive models, and reversible flow models. We will learn about state-of-the-art applications that use GANs. We will learn the building blocks of GANs, along with their strengths and limitations.

  • Implementing Our First GAN: This chapter explains the basics of implementing and training a GAN for image synthesis. We will learn how to implement the generator and discriminator in a GAN, how to implement our loss function and use it to train our GAN models, and how to visualize the samples from our first GAN. We will focus on the well-known CIFAR-10 dataset of 60,000 32×32 color images in 10 classes, including dogs and cats.

  • Evaluating Our First GAN: This chapter covers how to use quantitative and qualitative methods to evaluate the quality and variety of the GAN samples we produced in the previous chapter. We will learn about the challenges involved in evaluating GAN samples, how to implement metrics for image quality, and how to use the birthday paradox to evaluate sample variety.

  • Improving Our First GAN: This chapter explains the main challenges in training and understanding GANs and how to solve them. We will learn about vanishing gradients, mode collapse, training instability, and other challenges. We will learn how to solve the challenges that arise when training GANs by using tricks of the trade and improving our GAN architecture and loss function. We will learn about multiple deep learning model architectures that have been successful with the GAN framework. Furthermore, we will learn how to improve our first GAN by implementing new loss functions and algorithms. We will continue to focus on the CIFAR-10 dataset.

  • Synthesizing and Manipulating Images with GANs: This chapter explains how to implement pix2pixHD, a method for high-resolution (such as 2048×1024) photo-realistic image-to-image translation. It can be used to turn semantic label maps into photo-realistic images or to synthesize portraits from face label maps. We will use the Cityscapes dataset, which focuses on semantic understanding of urban street scenes.

  • Progressive Growing of GANs: This chapter explains how to implement the progressive growing of GANs framework, a new training methodology in which the generator and discriminator are trained progressively. Starting from a low resolution, we will add new layers that model increasingly fine details as training progresses. This speeds up the training process and stabilizes it, allowing us to produce images of unprecedented quality. We will focus on the CelebFaces Attributes (CelebA) dataset, a face attributes dataset with over 200,000 celebrity images.

  • Generation of Discrete Sequences Using GANs: This chapter covers the implementation of adversarial generation of natural language, a model capable of generating sentences in multiple languages from context-free and probabilistic context-free grammars. We will learn how to implement a model that generates sequences character by character and a model that generates sentences word by word. We will focus on the Google 1-billion-word dataset.

  • Text-to-Image Synthesis with GANs: This chapter explains how to implement generative adversarial text-to-image synthesis: a model that generates plausible images from detailed text descriptions. We will learn about matching-aware discriminators, interpolations in embedding space, and vector arithmetic. We will focus on the Oxford-102 Flowers dataset.

  • Speech Enhancement with GANs: This chapter covers the implementation of a speech enhancement GAN, a framework for audio denoising and speech enhancement using GANs. We will learn how to train the model with multiple speakers and noise conditions. We will also learn how to evaluate the model qualitatively and quantitatively. We will focus on the WSJ dataset and a noise dataset.

  • TequilaGAN—Identifying GAN Samples: This chapter explains how to implement TequilaGAN. We will learn about the underlying characteristics of GAN-generated data and how to use them to differentiate real data from fake data. We will implement strategies to easily identify fake samples that have been generated with the GAN framework. One strategy is based on the statistical analysis and comparison of raw pixel values and features extracted from them. The other strategy learns formal specifications from real data and shows that fake samples violate them. We will focus on the MNIST dataset of handwritten digits, CIFAR-10, and a dataset of Bach Chorales.

  • What’s Next in GANs: This chapter covers recent advances and open questions that relate to GANs. We start with a summary of this course and what it has covered, from the simplest to the state-of-the-art GANs. Then, we address important open questions related to GANs. We also consider the artistic use of GANs in the visual and sonic arts. Finally, we take a look at new and yet-to-be-explored domains with GANs.
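The loss-function work in Improving Our First GAN starts from the standard minimax objective. For reference, the original formulation and the widely used non-saturating generator loss are:

```latex
% Minimax GAN objective: D maximizes, G minimizes
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
+ \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Non-saturating generator loss: instead of minimizing
% log(1 - D(G(z))), G maximizes log D(G(z)), which provides
% stronger gradients early in training when D easily rejects fakes
L_G = -\,\mathbb{E}_{z \sim p_z}\bigl[\log D(G(z))\bigr]
```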
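To give a flavor of the hands-on material, here are a few self-contained sketches of ideas from the chapters above. First, the birthday-paradox variety check from Evaluating Our First GAN: if a generator’s effective support contains roughly N distinct images, duplicates become likely once a batch of about √N samples is drawn, and inverting this gives a rough support estimate. This is a minimal pure-Python sketch under a uniform-support assumption; the real procedure inspects batches of GAN samples for visual near-duplicates.

```python
import math

def collision_probability(support_size: int, sample_size: int) -> float:
    """Probability that a batch of `sample_size` draws from a uniform
    distribution over `support_size` outcomes contains a duplicate."""
    p_no_collision = 1.0
    for i in range(sample_size):
        p_no_collision *= (support_size - i) / support_size
    return 1.0 - p_no_collision

def estimated_support_size(sample_size: int) -> float:
    """Rough birthday-paradox inversion: if batches of `sample_size`
    contain a duplicate about half the time, the support is roughly
    sample_size^2 / (2 ln 2)."""
    return sample_size ** 2 / (2 * math.log(2))

# The classic sanity check: with 23 "birthdays" drawn from 365 days,
# a collision is already more likely than not.
print(round(collision_probability(365, 23), 3))  # ≈ 0.507
```

In practice, one samples batches from the trained GAN, flags duplicate-looking pairs (by eye or with a nearest-neighbor check), and applies the inversion above to the batch size at which duplicates start appearing.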
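The layer fade-in at the heart of Progressive Growing of GANs can be illustrated by the blending rule alone: while a new, higher-resolution layer is introduced, its output is linearly mixed with the upsampled output of the previous stage, with a weight alpha that ramps from 0 to 1. A minimal sketch in pure Python; the nearest-neighbor upsampling and images-as-nested-lists representation are simplifications of the real convolutional setup.

```python
def upsample_2x(image):
    """Nearest-neighbor 2x upsampling of a 2-D image given as nested lists."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]  # repeat each pixel twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def fade_in(old_low_res, new_high_res, alpha):
    """Blend the upsampled old output with the new layer's output.
    alpha ramps from 0 to 1 over the fade-in phase."""
    up = upsample_2x(old_low_res)
    return [
        [(1 - alpha) * u + alpha * n for u, n in zip(up_row, new_row)]
        for up_row, new_row in zip(up, new_high_res)
    ]

old = [[0.0, 1.0],
       [1.0, 0.0]]                   # 2x2 output of the previous stage
new = [[0.5] * 4 for _ in range(4)]  # 4x4 output of the new layer

print(fade_in(old, new, 0.0)[0])  # alpha=0: purely the upsampled old stage
print(fade_in(old, new, 1.0)[0])  # alpha=1: purely the new layer
```

Ramping alpha slowly lets the network ease the new layer in, which is what stabilizes training as the resolution doubles.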
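Finally, the first TequilaGAN strategy, statistical comparison of raw pixel values, can be sketched with a two-sample Kolmogorov–Smirnov statistic. The "real" and "fake" pixel samples below are synthetic stand-ins: real image pixels saturate at exactly 0 and 1, while a sigmoid-output generator only approaches those extremes, which is one telltale difference in the raw-value distributions.

```python
import bisect
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between
    the empirical CDFs of the two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is <= x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, v) - ecdf(b, v)) for v in sorted(set(a) | set(b)))

random.seed(1)
# "Real" pixels: 30% saturated at exactly 0.0 or 1.0, the rest uniform.
real = [random.choice([0.0, 1.0]) if random.random() < 0.3 else random.random()
        for _ in range(2000)]
# "Fake" pixels: clipped Gaussian that never quite reaches 0 or 1.
fake = [min(max(random.gauss(0.5, 0.25), 0.02), 0.98) for _ in range(2000)]

print(round(ks_statistic(real, real[:1000]), 3))  # same distribution: near 0
print(round(ks_statistic(real, fake), 3))         # clearly larger gap
```

The chapter applies this kind of comparison to real datasets and GAN outputs; the statistic here is one standard choice for the distributional gap, not the only one the method uses.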