Quantitative Methods

Learn about quantitative GAN assessment, which goes beyond the objective function's measure of player performance to evaluate image quality, diversity, and classification behavior.

The objective function used in GANs is a quantitative measure that provides information about the performance of each player, the discriminator and the generator, in the GAN game. For example, in the original GAN objective function, the discriminator's output on fake data indicates both how well the generator is fooling the discriminator and how well the discriminator can identify fake data. Although this information is useful because it tells us about the status of the minimax game and how close it is to equilibrium, it provides absolutely no information about the images themselves.

In this context, researchers in the GAN community have developed quantitative methods that can be used to measure image quality, variety, and conformance to specifications.

In this section, we will address a few such measures, including the following:

  • The inception score

  • The Fréchet inception distance

  • Precision, recall, and the F1 score

The inception score

The inception score is a heuristics-based method for measuring the quality and diversity of fake images. It uses a pretrained image-classification neural network called the Inception network, and it was first proposed in the paper "Improved Techniques for Training GANs."

The main novelties in Inception networks are the 1×1 convolution layers, global average pooling, and the parallel application of multiple convolutions (1×1, 3×3, and 5×5) alongside a 3×3 max pooling layer.

Whereas the 1×1 convolution layers compute a weighted combination of all the input channels at each spatial position, the global average pooling layer improves robustness to spatial translation and replaces dense layers, thus decreasing the number of parameters in the model while achieving better results.
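To make these two operations concrete, here is a minimal NumPy sketch (with made-up shapes and random weights, not an actual Inception network) showing that a 1×1 convolution is just a per-position channel mixing and that global average pooling collapses each channel map to a single number:

```python
import numpy as np

# Hypothetical shapes for illustration: a feature map with
# H x W spatial positions and C_in channels.
H, W, C_in, C_out = 4, 4, 8, 3
rng = np.random.default_rng(0)
features = rng.standard_normal((H, W, C_in))

# A 1x1 convolution is a weighted combination of the input
# channels, applied independently at every spatial position.
weights = rng.standard_normal((C_in, C_out))
conv1x1 = features @ weights          # shape: (H, W, C_out)

# Global average pooling collapses each channel's spatial map to a
# single number, replacing a dense layer and its many parameters.
pooled = conv1x1.mean(axis=(0, 1))    # shape: (C_out,)

print(conv1x1.shape, pooled.shape)
```

Because no spatial neighborhood is involved, the 1×1 convolution reduces to a matrix multiplication over the channel axis, which is why it is cheap yet expressive.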

To compute the inception score, we generate thousands of fake images, feed them through the Inception network that was trained for image classification, and analyze the softmax outputs of the network.

  • Image quality: Under the assumption that the Inception network produces low-entropy outputs when given images from the true distribution of the data it was trained on (e.g., a unimodal output with high probability on a single class), the entropy of the network's outputs on fake data can be used as a proxy for fake image quality.

  • Image variety: Under the assumption that the fake samples come from the same true distribution of the data that the inception network was trained on and that the inception network has a high level of accuracy in its predictions, we can compare the distribution of labels on the real data to the distribution of labels on the fake data to evaluate class variety. Note that this does not imply that there is variety within each class.
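The procedure above can be sketched numerically. The snippet below is illustrative only: it assumes we already have an (N, K) array of softmax outputs from the classifier, rather than running real images through an actual Inception network:

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Compute the inception score from an (N, K) array of softmax
    outputs, one row per generated image.

    Low entropy per row (confident predictions) and high entropy in
    the marginal (varied classes) both push the score up.
    """
    probs = np.asarray(probs, dtype=np.float64)
    marginal = probs.mean(axis=0)  # p(y): average over all images
    # Per-image KL divergence between p(y|x) and the marginal p(y).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)),
                axis=1)
    return float(np.exp(kl.mean()))

# Confident and diverse predictions -> high score
# (the upper bound is the number of classes).
sharp = np.eye(3)                  # three images, each a different class
# Uncertain predictions -> score near 1.
flat = np.full((3, 3), 1.0 / 3.0)

print(inception_score(sharp))      # close to 3
print(inception_score(flat))       # close to 1
```

In practice, libraries compute this over thousands of samples, often averaged across several splits, but the arithmetic is exactly this exponentiated mean KL divergence.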

These two criteria are combined, and the inception score is represented in the following equation:
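Concretely, as defined in "Improved Techniques for Training GANs," the inception score is the exponentiated expected KL divergence between the conditional label distribution and the marginal label distribution:

```latex
\mathrm{IS}(G) = \exp\Bigl( \mathbb{E}_{\mathbf{x} \sim p_g}\bigl[ D_{\mathrm{KL}}\bigl( p(y \mid \mathbf{x}) \,\|\, p(y) \bigr) \bigr] \Bigr)
```

Here $p_g$ is the generator's distribution, $p(y \mid \mathbf{x})$ is the Inception network's softmax output for a fake image $\mathbf{x}$, and $p(y)$ is the marginal label distribution averaged over all generated images.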
