Introduction to the Course

Get a brief introduction to the course.

With continuously evolving research and development, generative adversarial networks (GANs) are the next big thing in deep learning. This course highlights the key improvements of GANs over traditional generative models and shows you how to get the most out of GANs with the help of hands-on examples.

This course will help you understand how GAN architectures work using PyTorch. You will get familiar with one of the most flexible deep learning toolkits and use it to transform ideas into actual working code. You will apply GAN models to areas such as computer vision, multimedia, and natural language processing using a sample-generation methodology.

Target audience

This course is for machine learning practitioners and deep learning researchers looking to get hands-on guidance on implementing GAN models using PyTorch.

Prerequisites

Following are the prerequisites for this course:

  • Basic knowledge of Python and PyTorch.

  • A rudimentary understanding of GANs and deep learning will be helpful, but it is by no means necessary.

Course contents

The course is divided into the following sections:

  1. Generative Adversarial Networks Fundamentals: This section introduces the new features of PyTorch. You will also learn how to build a simple GAN with NumPy to generate sine signals.

  2. Best Practices in Model Design and Training: This section looks at the overall design of the model architecture and the steps to follow when choosing the required convolutional operations.

  3. Building Our First GAN with PyTorch: This section introduces a classic and well-performing GAN model, called DCGAN, for generating 2D images. You will also be introduced to the architecture of DCGANs and learn how to train and evaluate them. Following this, you will learn how to use a DCGAN to generate hand-written digits and human faces and take a look at adversarial learning with auto-encoders. You will also be shown how to efficiently organize your source code for easy adjustments and extensions. (A minimal GAN training sketch in PyTorch follows this list.)

  4. Generating Images Based on Label Information: This section shows how to use a CGAN to generate images based on a given label and how to implement adversarial learning with auto-encoders.

  5. Image-to-Image Translation and Its Applications: This section shows how to use pixel-wise label information to perform image-to-image translation with pix2pix and how to translate high-resolution images with pix2pixHD. You will also learn how to flexibly design model architectures to accomplish your goals, including generating larger images and transferring textures between different types of images.

  6. Image Restoration with GANs: This section shows how to perform image super-resolution with SRGAN to generate high-resolution images from low-resolution ones and how to use a data prefetcher to speed up data loading and increase your GPU’s efficiency during training. You will also learn how to train a GAN model to perform image inpainting and fill in the missing parts of an image.

  7. Training Our GANs to Break Different Models: This section looks into the fundamentals of adversarial examples and how to attack and confuse a CNN model with FGSM (Fast Gradient Sign Method); a short FGSM sketch is given after this list. After this, you will look at how to use the accimage library to speed up your image loading even more and train a GAN model to generate adversarial examples and fool the image classifier.

  8. Image Generation from Description Text: This section provides basic knowledge of word embeddings and how they are used in the NLP field. You will also learn how to design a text-to-image GAN model to generate images based on one sentence of description text.

  9. Sequence Synthesis with GANs: This section covers commonly used techniques in the NLP field, such as RNNs and LSTMs. You will also learn some of the basic concepts of reinforcement learning and see how it differs from supervised learning (such as SGD-based CNNs). You will also learn how to use SEGAN to remove background noise and enhance the quality of speech audio.

  10. Reconstructing 3D Models with GANs: This section shows how 3D objects are represented in computer graphics (CG). You will also look at the fundamental concepts of CG, including camera and projection matrices. You will then learn how to construct a 3D-GAN model with 3D convolutions and train it to generate 3D objects.
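
To give a rough preview of the kind of code this course builds toward, here is a minimal GAN training sketch in PyTorch. It is a simplified illustration of the generator/discriminator training loop, not the course's DCGAN implementation: the fully connected `Generator` and `Discriminator` classes, the layer sizes, and the random stand-in data are all placeholder assumptions.

```python
import torch
import torch.nn as nn

# A minimal generator: maps a latent noise vector to a flat "image" vector.
class Generator(nn.Module):
    def __init__(self, latent_dim=100, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),  # outputs scaled to [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

# A minimal discriminator: maps an "image" vector to a real/fake probability.
class Discriminator(nn.Module):
    def __init__(self, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

latent_dim = 100
G, D = Generator(latent_dim), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
criterion = nn.BCELoss()

# One illustrative training step on a random batch standing in for real data.
real = torch.rand(32, 784) * 2 - 1          # placeholder "real" samples in [-1, 1]
z = torch.randn(32, latent_dim)
fake = G(z)

# Discriminator step: real samples labeled 1, generated samples labeled 0.
d_loss = (criterion(D(real), torch.ones(32, 1)) +
          criterion(D(fake.detach()), torch.zeros(32, 1)))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: try to make the discriminator label fakes as real.
g_loss = criterion(D(fake), torch.ones(32, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```

In the course itself, the fully connected layers are replaced by convolutional architectures such as DCGAN, but the alternating discriminator/generator updates shown above stay the same.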
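The FGSM attack mentioned in Section 7 can be summarized in a few lines: perturb the input a small step along the sign of the loss gradient so the classifier's loss increases. The sketch below is a minimal illustration under assumed inputs; the `fgsm_attack` helper, the stand-in classifier, and the `epsilon` value are illustrative choices, not the course's code.

```python
import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon=0.03):
    """Perturb input x along the sign of the loss gradient (FGSM)."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that maximizes the loss, then keep pixels valid.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()

# Usage with a placeholder classifier on random image-like data.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
images = torch.rand(4, 1, 28, 28)
labels = torch.randint(0, 10, (4,))
adv_images = fgsm_attack(model, images, labels)
```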