Why Use Generative Models?

Understand the motivation behind using generative models.

Why do we need generative models in the first place? What value do they provide in practical applications? To answer these questions, let’s briefly discuss the topics that form the basis of generation using deep learning.

The promise of deep learning

Many of the models we’ll survey in the course are deep, multi-level neural networks. The last 15 years have seen a renaissance in the development of deep learning models for image classification, natural language processing and understanding, and reinforcement learning. These advances were enabled by breakthroughs in traditional challenges in tuning and optimizing very complex models, combined with access to larger datasets, distributed computational power in the cloud, and frameworks such as TensorFlow that make it easier to prototype and reproduce research.

Building a better digit classifier

A classic benchmark problem in machine learning and computer vision is classifying which handwritten digit from 0 to 9 is represented in a pixelated image from the MNIST dataset (LeCun, Yann, Corinna Cortes, and Chris Burges. “MNIST Handwritten Digit Database.” http://yann.lecun.com/exdb/mnist/). A major breakthrough in this problem occurred in 2006, when researchers at the University of Toronto and the National University of Singapore discovered a way to train deep neural networks to perform this task (Hinton, G., S. Osindero, and Y.-W. Teh. 2006. “A Fast Learning Algorithm for Deep Belief Nets.” www.cs.toronto.edu/~fritz/absps/ncfast.pdf).

One of their critical observations was that instead of training a network to directly predict the most likely digit (Y) given an image (X), it was more effective to first train a network that could generate images and then classify them as a second step. This approach improved upon past attempts and led to what are known as the restricted Boltzmann machine and deep Boltzmann machine models, which can generate new MNIST digit images.
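The generative building block of those models, the restricted Boltzmann machine, can be sketched in a few dozen lines. The following is a minimal NumPy illustration (not the authors' original implementation) of a binary RBM trained with one-step contrastive divergence (CD-1) on toy binary patterns; all variable names and the toy dataset are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal binary restricted Boltzmann machine trained with CD-1 (illustrative sketch)."""

    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible-unit biases
        self.b_h = np.zeros(n_hidden)   # hidden-unit biases
        self.lr = lr

    def sample_h(self, v):
        # Probability and a binary sample of the hidden units given the visible units
        p = sigmoid(v @ self.W + self.b_h)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        # Probability and a binary sample of the visible units given the hidden units
        p = sigmoid(h @ self.W.T + self.b_v)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data
        ph0, h0 = self.sample_h(v0)
        # Negative phase: one Gibbs step reconstructs the visible units
        pv1, v1 = self.sample_v(h0)
        ph1, _ = self.sample_h(v1)
        # CD-1 gradient approximation: data statistics minus reconstruction statistics
        n = len(v0)
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (ph0 - ph1).mean(axis=0)
        return np.mean((v0 - pv1) ** 2)  # mean squared reconstruction error

# Toy data: two repeated binary patterns standing in for simple "images"
data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 25, dtype=float)
rbm = RBM(n_visible=6, n_hidden=4)
errs = [rbm.cd1_step(data) for _ in range(200)]
```

After training, the reconstruction error drops well below its initial value, showing that the model has learned to regenerate the patterns it was trained on; in Hinton et al.'s work, stacks of such machines learned to generate MNIST digits.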

Generating images

A challenge in generating images such as the Portrait of Edmond Belamy with the approach used for the MNIST dataset is that images frequently have no labels (such as a digit); rather, we want to map a space of random numbers into a set of artificial images using a latent vector, Z.

A further constraint is that we want to promote diversity in these images. If we input numbers within a certain range, we would like them to generate different outputs, and we would like to be able to tune the resulting image features. For this purpose, variational autoencoders (VAEs) were developed to generate diverse and photorealistic images.
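The generation step of a VAE can be sketched briefly: sample a latent vector Z from a standard normal prior and pass it through a decoder to obtain pixel intensities. The sketch below uses NumPy with randomly initialized (untrained) decoder weights purely to show the shapes and the reparameterization trick; in a real VAE the decoder is a learned neural network and the weight names here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim, image_dim = 2, 28 * 28  # e.g., a 2-D latent space and 28x28 images

# Hypothetical single-layer decoder weights; in a real VAE these are learned
W_dec = rng.normal(0, 0.1, size=(latent_dim, image_dim))
b_dec = np.zeros(image_dim)

def decode(z):
    """Map latent vectors z to pixel intensities in (0, 1) via a sigmoid output layer."""
    return 1.0 / (1.0 + np.exp(-(z @ W_dec + b_dec)))

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps; in a real framework this keeps sampling differentiable."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# To generate new images, draw z from the prior N(0, I) and decode it
z = rng.standard_normal((5, latent_dim))
images = decode(z)  # five artificial "images", one per latent vector
```

Because different draws of Z land in different regions of the latent space, each decoded output differs, which is exactly the diversity property described above; moving Z smoothly through the latent space tunes the resulting image features.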

Here are some sample images from a VAE (sources: https://jaan.io/images/variational-autoencoder-faces.jpg and https://miro.medium.com/v2/1%2AjcCjbdnN4uEowuHfBoqITQ.jpeg):
