What is a Generative Adversarial Network (GAN)?

Overview of Generative Adversarial Networks

Generative Adversarial Networks (GANs) are one of the most widely used classes of generative models. Generative models create new data instances that resemble the training data. In short, GANs let us create content that does not appear in the training set.

Given a training set, a GAN learns to generate new data with the same statistics as the training set. For example, a GAN trained on pictures of human faces can create new photographs of human faces, even though the new faces do not belong to any real person.

Components of GANs

GANs have the following components:

  • Generator
  • Discriminator

Generator

The generator learns to produce the target output. It starts with random noise and creates images that look like the original images from the dataset. These images are later used to train the discriminator. Generators are typically deconvolutional (transposed-convolutional) neural networks.

Discriminator

As the name suggests, the discriminator acts like a critic that distinguishes between the original images from the training dataset and the images generated by the generator. Discriminators are typically convolutional neural networks.
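To make the two roles concrete, here is a minimal sketch that replaces images with single numbers and the neural networks with one-line functions. The function names, the parameter tuples `theta_g` and `theta_d`, and all numeric values are hypothetical choices for illustration, not part of any library.

```python
import math
import random


def generator(z, theta_g):
    """Map a noise value z to a synthetic sample (a 1-D stand-in for an image)."""
    scale, shift = theta_g
    return scale * z + shift


def discriminator(x, theta_d):
    """Return the estimated probability that x came from the training data."""
    weight, bias = theta_d
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


random.seed(0)
theta_g = (1.0, 0.0)   # hypothetical generator parameters
theta_d = (0.5, -1.0)  # hypothetical discriminator parameters

z = random.gauss(0.0, 1.0)            # random noise fed to the generator
fake = generator(z, theta_g)          # generated sample
score = discriminator(fake, theta_d)  # probability in (0, 1) that the sample is real
```

A real GAN would use deep networks in place of these two functions, but the interface is the same: noise goes into the generator, and a probability comes out of the discriminator.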

Images of human faces created using a GAN

Work process of GANs

A GAN pairs a generator with a discriminator. The two networks are trained against each other until the discriminator can no longer distinguish the generated images from real ones. In other words, the generator tries to fool the discriminator, and the discriminator tries to keep from being fooled.

Initially, we train the discriminator with a known dataset. We present it with samples from the training dataset until it achieves acceptable accuracy. Then, the generator trains to fool the discriminator. Typically, the generator is fed randomized input sampled from a predefined latent space, which can be thought of as random noise. A point in the latent space has no meaning on its own, but the generator learns to map samples from this space to new images. The discriminator then evaluates the output of the generator.
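The process above can be sketched as a loop that alternates one discriminator update with one generator update. This is a toy under stated assumptions: images are replaced by single numbers drawn from a Gaussian centered at 4, the generator is the linear map `a * z + b`, the discriminator is a logistic model, and the gradients are written out by hand. All names, the learning rate, and the data distribution are hypothetical. A model this small is not expected to converge cleanly, so the sketch only shows the mechanics of the two updates.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def generate(theta_g, z):
    a, b = theta_g
    return a * z + b


def discriminator_step(theta_d, x_real, x_fake, lr):
    """One gradient step that lowers -[log D(x_real) + log(1 - D(x_fake))]."""
    w, c = theta_d
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
    grad_c = (d_real - 1.0) + d_fake
    return (w - lr * grad_w, c - lr * grad_c)


def generator_step(theta_g, theta_d, z, lr):
    """One gradient step that lowers log(1 - D(G(z))), i.e. fools D more."""
    a, b = theta_g
    w, c = theta_d
    d_fake = sigmoid(w * generate(theta_g, z) + c)
    grad_b = -d_fake * w   # derivative of log(1 - D(G(z))) with respect to b
    grad_a = grad_b * z    # chain rule through G(z) = a * z + b
    return (a - lr * grad_a, b - lr * grad_b)


def train(steps=200, lr=0.05, seed=0):
    rng = random.Random(seed)
    theta_g, theta_d = (1.0, 0.0), (0.0, 0.0)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # sample from the "training data"
        x_fake = generate(theta_g, rng.gauss(0.0, 1.0))
        theta_d = discriminator_step(theta_d, x_real, x_fake, lr)
        theta_g = generator_step(theta_g, theta_d, rng.gauss(0.0, 1.0), lr)
    return theta_g, theta_d
```

Calling `train()` returns the final generator and discriminator parameters. In practice, both updates would be computed by an automatic differentiation framework over mini-batches of images.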

The working of GANs

GANs rely on backpropagation in both networks to minimize their errors. This helps the generator produce better images while the discriminator becomes more skilled at flagging synthetic images.

We represent the generative network as $G(z;\theta_g)$, where $z \sim p(z)$ and $p(z)$ is the prior defined on the input noise; the generator induces a distribution $p_g$ over data $x$. The discriminator, on the other hand, is represented as $D(x;\theta_d)$, where $\theta_d$ denotes the parameters of the multilayer perceptron network. The objective function that combines both $G$ and $D$ is given below:
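Assuming the standard minimax formulation from the original GAN paper, and writing $p_{data}$ (a symbol not defined above) for the distribution of the training data, the objective can be written as:

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{data}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The first term rewards the discriminator for assigning high probability to real samples, and the second rewards it for assigning low probability to generated samples.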

The generator is trained to minimize this objective function, which pushes it to generate more and more accurate images. The discriminator is trained to maximize it, so that it learns to distinguish between the images generated by the generator and the images from the training dataset.
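As a rough numeric illustration, assuming the standard minimax form of the objective, the value can be estimated from a handful of discriminator outputs. The function name `value_function` and the sample probabilities are hypothetical.

```python
import math


def value_function(d_real, d_fake):
    """Estimate V(D, G) by averaging over discriminator outputs.

    d_real: D(x) for samples x from the training data
    d_fake: D(G(z)) for generated samples
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Discriminator separates real from generated samples well: value close to 0.
confident = value_function([0.99, 0.98], [0.01, 0.02])

# Discriminator is fully fooled and outputs 0.5 everywhere: value is -log(4).
fooled = value_function([0.5, 0.5], [0.5, 0.5])
```

The discriminator tries to push this value up toward 0, while the generator tries to push it down. When the discriminator can do no better than guessing, the value settles at $-\log 4$.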

A simple yet powerful extension to GANs is the conditional GAN. We apply a condition $c$, such as a class label, to the inputs of both $G$ and $D$. Many state-of-the-art models use this technique to achieve their application goals.
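One common way to apply the condition is to concatenate it to the inputs of both networks. The sketch below assumes that approach, with the condition given as a one-hot class label. The helper names and vector sizes are hypothetical.

```python
import random


def one_hot(label, num_classes):
    """Encode a class label as a one-hot condition vector c."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(z, c):
    """Conditioned generator input: the noise vector followed by the condition."""
    return z + c


def discriminator_input(x, c):
    """Conditioned discriminator input: the sample followed by the condition."""
    return x + c


rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(4)]  # noise vector of length 4
c = one_hot(2, 3)                            # condition: class 2 out of 3
g_in = generator_input(z, c)                 # vector of length 7 fed to G
```

Because the discriminator sees the same condition, it can penalize a generated sample that looks realistic but does not match the requested class.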

Applications

GANs are used in a number of applications:

  • Data generation: Several studies use GANs to fulfill data needs.
  • Cartoon generation: GANs are also used to generate new anime characters.
  • Face aging: GANs are used to create images that show how a person might look at an older age.
  • Day-to-Night: Images taken in daylight can be converted to portray the same scene at night.

State-of-the-art GANs

In this section, we'll discuss some well-known state-of-the-art GAN models.

  • StyleGAN: StyleGAN generates highly detailed images of $1024 \times 1024$ pixels. It builds the image up in levels: starting from the image at the previous level, StyleGAN adds more and more detail.
  • Pix2Pix: Pix2Pix performs image-to-image translation. An input image is translated into a detailed image that is comparable with the images in the training dataset.

Although originally proposed as a form of a generative model for unsupervised learning, GANs have also proven useful for supervised learning and reinforcement learning.

Read “What is supervised and unsupervised learning?” to learn more.

Copyright ©2024 Educative, Inc. All rights reserved