Improving the Baseline Model

Improve the baseline model using the matching-aware discriminator approach in GANs.

In this example, we improve the baseline model without modifying its architecture. Instead, the authors propose changing the optimization problem so that the discriminator also sees mismatched pairs of text embeddings and images. This approach is called the matching-aware discriminator, and it is designed to separate the two sources of error in this task.

Training enhancement

In the baseline setup, the discriminator sees only real images paired with their matching text and synthetic images paired with arbitrary text. It must therefore implicitly separate two sources of error: realistic-looking images that do not match the text description, and unrealistic images regardless of the text.

Empirical validation

To address this, the authors explicitly provide the discriminator with pairs of real images and mismatched texts, and find empirically that this helps training. We’ll provide a slice of the training loop with the modifications needed for the matching-aware approach.
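
To make the idea concrete, here is a minimal sketch of a per-sample discriminator objective with the extra mismatched-pair term. It assumes the discriminator outputs a probability in (0, 1), and it assumes the two "should be rejected" cases (real image with wrong text, generated image with right text) are averaged with equal weight. The function name, the argument names, and the `eps` constant are ours for illustration and do not come from the lesson's code.

```python
import math


def matching_aware_d_objective(s_real, s_wrong, s_fake, eps=1e-8):
    """Discriminator objective for one sample (the discriminator maximizes it).

    s_real:  score for (real image, matching text)
    s_wrong: score for (real image, mismatched text)
    s_fake:  score for (generated image, matching text)
    All scores are assumed to be probabilities in (0, 1).
    """
    # Reward accepting the real, correctly described image.
    accept_term = math.log(s_real + eps)
    # Reward rejecting both failure cases; equal weighting is an assumption.
    reject_term = 0.5 * (
        math.log(1.0 - s_wrong + eps) + math.log(1.0 - s_fake + eps)
    )
    return accept_term + reject_term


# A discriminator fooled by mismatched text scores worse than one that rejects it.
fooled = matching_aware_d_objective(s_real=0.9, s_wrong=0.9, s_fake=0.1)
careful = matching_aware_d_objective(s_real=0.9, s_wrong=0.1, s_fake=0.1)
print(fooled < careful)
```

Without the `s_wrong` term, the two calls above would give the same value, which is why the baseline discriminator receives no direct signal about whether a real image matches its text.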

Training

To train the model, we perform the following steps:

  1. We build the model output, including a variable for the shuffled input:
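
As a framework-agnostic sketch of what this step might look like, the snippet below builds the three discriminator outputs from one batch. It is not the lesson's original listing: `roll_batch`, `build_outputs`, and the `generator` and `discriminator` callables are hypothetical stand-ins, and we assume mismatched pairs are created by rotating the text embeddings within the batch.

```python
def roll_batch(batch, shift=1):
    """Rotate a list so that item i receives the entry from position i - shift."""
    if not batch:
        return []
    shift %= len(batch)
    return batch[-shift:] + batch[:-shift]


def build_outputs(images, embeddings, noise, generator, discriminator):
    """Return discriminator outputs for real, mismatched, and generated pairs."""
    # "Shuffled" input: each image is paired with a neighbor's text embedding.
    wrong_embeddings = roll_batch(embeddings)
    fake_images = [generator(z, e) for z, e in zip(noise, embeddings)]

    d_real = [discriminator(x, e) for x, e in zip(images, embeddings)]
    d_wrong = [discriminator(x, e) for x, e in zip(images, wrong_embeddings)]
    d_fake = [discriminator(x, e) for x, e in zip(fake_images, embeddings)]
    return d_real, d_wrong, d_fake


# Toy stand-ins that only record which inputs were paired together.
toy_generator = lambda z, e: ("fake", e)
toy_discriminator = lambda x, e: (x, e)

d_real, d_wrong, d_fake = build_outputs(
    images=["img0", "img1", "img2"],
    embeddings=["t0", "t1", "t2"],
    noise=[0.1, 0.2, 0.3],
    generator=toy_generator,
    discriminator=toy_discriminator,
)
print(d_wrong)
```

Rotating the batch produces a true mismatch only when neighboring samples have different descriptions, so in practice the batch would need to be drawn so that adjacent captions differ.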
