Generative Adversarial Networks (GANs) have become an increasingly popular topic in AI thanks to their ability to generate high-quality data across various domains, from images to music and beyond. Among the many GAN variants, the vanilla GAN stands out as the fundamental architecture on which most others are built. Here, we will explore how a vanilla GAN works and implement it from scratch.
At its core, a vanilla GAN consists of two neural networks: a generator and a discriminator.
The generator aims to produce synthetic data samples that resemble the real data, while the discriminator learns to distinguish these generated samples from genuine ones. The two networks are trained against each other, so progress in one forces the other to improve.
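Formally, this adversarial setup can be summarized by the minimax objective introduced in the original GAN paper, where the discriminator D tries to maximize and the generator G tries to minimize the value function:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$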
Here is a step-by-step implementation of vanilla GAN:
Import the necessary libraries for creating and visualizing the GAN:
import os
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import keras
from keras.models import Model, Sequential
from keras.layers import Dense, BatchNormalization, Reshape, Dropout, LeakyReLU, Input, Flatten
from keras.optimizers import Adam
from keras.datasets import mnist
from keras.utils import plot_model
Define the required parameters for the model:
# Number of epochs for training
epochs = 20000
# Shape of the MNIST images (28x28 pixels with 1 channel)
mnist_shape = (28, 28, 1)
# Batch size for training
batch_size = 128
# Shape of the noise input to the generator (100-dimensional vector)
noise_shape = (100,)
# Interval for saving generated images and models during training
save_every = 1000
Define and build the generator network:
def build_generator(noise_shape, mnist_shape):
    # Define input layer for the generator model
    noise = Input(shape=noise_shape)
    # Fully connected layer with 256 units
    x = Dense(256)(noise)
    x = LeakyReLU(alpha=0.2)(x)
    x = BatchNormalization(momentum=0.8)(x)
    # Fully connected layer with 512 units
    x = Dense(512)(x)
    x = LeakyReLU(alpha=0.2)(x)
    x = BatchNormalization(momentum=0.8)(x)
    # Fully connected layer with 1024 units
    x = Dense(1024)(x)
    x = LeakyReLU(alpha=0.2)(x)
    x = BatchNormalization(momentum=0.8)(x)
    # Output layer with units equal to the number of pixels in the output image
    # Use tanh activation to ensure pixel values are in the range [-1, 1]
    x = Dense(np.prod(mnist_shape), activation='tanh')(x)
    # Reshape output to match the desired image shape
    x = Reshape(mnist_shape)(x)
    # Create and return the generator model
    return Model(noise, x)

# Call build_generator to create the generator model (G)
G = build_generator(noise_shape, mnist_shape)
G.summary()
Define and build the discriminator network:
def build_discriminator(mnist_shape):
    # Define input layer for the discriminator model
    input_img = Input(shape=mnist_shape)
    # Flatten the input image
    x = Flatten()(input_img)
    # First fully connected layer with 512 units
    x = Dense(512)(x)
    x = LeakyReLU(alpha=0.2)(x)
    # Second fully connected layer with 256 units
    x = Dense(256)(x)
    x = LeakyReLU(alpha=0.2)(x)
    # Output layer with 1 unit and sigmoid activation for binary classification
    x = Dense(1, activation='sigmoid')(x)
    # Create and return the discriminator model
    return Model(input_img, x)

# Call build_discriminator to create the discriminator model (D)
D = build_discriminator(mnist_shape)
D.summary()
Compile the generator and discriminator networks:
# Compile the generator (G) model
# Use the Adam optimizer with learning rate 0.0002 and beta_1 of 0.5
# Use binary crossentropy as the loss function
G.compile(optimizer=Adam(0.0002, 0.5), loss='binary_crossentropy')

# Compile the discriminator (D) model
# Use the Adam optimizer with learning rate 0.0002 and beta_1 of 0.5
# Use binary crossentropy as the loss function and accuracy as the metric to monitor
D.compile(optimizer=Adam(0.0002, 0.5), loss='binary_crossentropy', metrics=['accuracy'])
Combine the generator and discriminator and build the GAN:
# Define input layer for noise
noise_input = Input(shape=noise_shape)
# Generate an image from noise using generator G
generated_image = G(noise_input)
# Freeze the discriminator's weights while training the combined model
D.trainable = False
# Classify the generated image using discriminator D
validity = D(generated_image)
# Combine generator G and discriminator D into a single model
# This model takes noise as input and outputs the discriminator's prediction for the generated image
D_G_model = Model(noise_input, validity)
# Compile the combined model
D_G_model.compile(optimizer=Adam(0.0002, 0.5), loss='binary_crossentropy')
D_G_model.summary()
Load the dataset, preprocess it, and train the model:
# Load the MNIST dataset and extract the training images (X_train)
(X_train, _), (_, _) = mnist.load_data()
print(X_train.shape)

# Center and normalize the pixel values of the images
# Scale pixel values to the range [-1, 1] to match the generator's tanh output
X_train = (X_train.astype('float32') - 127.5) / 127.5

# Expand the dimensions of the training data to include a channel dimension
X_train = np.expand_dims(X_train, axis=3)

# The mean and standard deviation of the training data
# Useful for verifying that the data is properly centered and normalized
print(np.mean(X_train), np.std(X_train))

# The discriminator is trained on half a batch of real and half a batch of fake images
half_batch = batch_size // 2

# Train the GAN
for epoch in range(epochs):
    # ---- Train the discriminator ----
    # Sample real images from the training data
    indices = np.random.randint(0, X_train.shape[0], half_batch)
    images = X_train[indices]
    # Train the discriminator on real images (label 1)
    d_real_loss = D.train_on_batch(images, np.ones((half_batch, 1)))

    # Generate fake images using random noise as input to the generator
    noise = np.random.uniform(0, 1, (half_batch, noise_shape[0]))
    noise_images = G.predict(noise)
    # Train the discriminator on fake images (label 0)
    d_fake_loss = D.train_on_batch(noise_images, np.zeros((half_batch, 1)))

    # Compute the average discriminator loss
    d_loss = np.add(d_real_loss, d_fake_loss) / 2

    # ---- Train the generator ----
    # Generate noise for the full batch
    noise = np.random.uniform(0, 1, (batch_size, noise_shape[0]))
    # Train the combined model with "real" labels so the generator learns to fool the discriminator
    g_loss = D_G_model.train_on_batch(noise, np.ones((batch_size, 1)))

    # Periodically report the training progress
    if epoch % save_every == 0:
        print('Epoch: {}, D_Loss: {}, D_Acc: {}, G_Loss: {}'.format(epoch, d_loss[0], d_loss[1], g_loss))
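The save_every parameter is described as an interval for saving generated images and models, but the loop above only prints metrics. Below is a minimal sketch of a helper that could be called inside the if epoch % save_every == 0: block; the sample_images name and the gan_images output directory are our own assumptions, not part of the original code.

# A minimal sketch (assumed helper) for saving a grid of generated digits at each save interval
def sample_images(epoch, rows=5, cols=5, out_dir='gan_images'):
    # Create the output directory if it does not exist
    os.makedirs(out_dir, exist_ok=True)
    # Sample noise and generate a batch of images with the trained generator
    noise = np.random.uniform(0, 1, (rows * cols, noise_shape[0]))
    generated = G.predict(noise)
    # Rescale images from [-1, 1] to [0, 1] for display
    generated = 0.5 * generated + 0.5
    # Plot the generated digits in a rows x cols grid and save to disk
    fig, axes = plt.subplots(rows, cols, figsize=(5, 5))
    for i, ax in enumerate(axes.flat):
        ax.imshow(generated[i, :, :, 0], cmap='gray')
        ax.axis('off')
    fig.savefig(os.path.join(out_dir, 'mnist_{}.png'.format(epoch)))
    plt.close(fig)

Calling sample_images(epoch) alongside the print statement would write a grid of generated digits to disk every save_every epochs, which makes it easy to track how the samples improve during training.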
Test the generator by passing random noise and checking what it generates:
# Generate a single image from random noise
noise = np.random.uniform(0, 1, (1, noise_shape[0]))
image = G.predict(noise)
# Visualize the generated image
plt.imshow(image[0, :, :, 0], cmap='gray')
We can view the results of the model by running the following live app:
The code won’t run to completion here because training the vanilla GAN for 20,000 epochs is impractical without a GPU. However, the output can be observed in prerun mode in the Jupyter Notebook below.
Implementing a vanilla GAN from scratch provides a deep understanding of the underlying concepts of generative modeling. By following the steps outlined above, we can begin our journey into the fascinating world of GANs. Experimenting with different architectures, loss functions, and training strategies will further sharpen our ability to build compelling generative models.