Applications of AI
Learn about the different uses of generative AI.
In New York City in October 2018, the international auction house Christie’s sold the Portrait of Edmond Belamy, a canvas produced not by a human painter but by a machine learning algorithm developed by the Paris-based collective Obvious, for $432,500.
Portraiture (the process of painting a picture or taking a photograph of a person) is far from the only area in which machine learning has demonstrated astonishing results. Indeed, if you have paid attention to the news in the last few years, you have likely seen many stories about the ground-breaking results of modern AI systems applied to diverse problems, from the hard sciences to digital art. Deep neural network models, such as the one created by Obvious, can now classify X-ray images of human anatomy on the level of trained physicians, translate text between languages, and defeat human players at complex board games.
Discriminative and generative models
These other examples of AI differ in an important way from the model that generated the Portrait of Edmond Belamy. In all of those applications, the model is presented with a set of inputs (data such as English text, X-ray images, or the positions on a game board) paired with a target output, such as the next word in a translated sentence, the diagnostic classification of an X-ray, or the next move in a game. This is probably the kind of AI model you are most familiar with from prior experience of predictive modeling, a statistical technique for predicting future behavior. Such models are broadly known as discriminative models; their purpose is to create a mapping between a set of input variables and a target output. The target output could be a set of discrete classes (such as which word in the English language appears next in a translation) or a continuous outcome (such as the expected amount of money a customer will spend in an online store over the next 12 months).
This kind of model, in which data is labeled or scored, represents only half the capabilities of modern machine learning. Another class of algorithms, such as the one that generated the artificial portrait sold at Christie’s, doesn’t compute a score or label from input variables but rather generates new data. Unlike in discriminative models, the input variables are often vectors of numbers that aren’t related to real-world values at all and are frequently generated at random. This kind of model, known as a generative model, can produce complex outputs such as text, music, or images from random noise, and it is the topic of this course.
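To make the distinction concrete, here is a minimal, untrained sketch in TensorFlow 2 (the framework we’ll use throughout this course). The layer sizes and the flattened 784-pixel “image” are illustrative assumptions, not part of any specific model discussed above:

```python
import tensorflow as tf

# Discriminative model: maps an input (here a flattened 28x28 image,
# i.e., 784 pixels) to class probabilities; it learns p(label | data).
discriminative = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Generative model: maps a random noise vector to a new synthetic data
# sample; the input has no real-world meaning at all.
generative = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(64,)),
    tf.keras.layers.Dense(784, activation="sigmoid"),
])

noise = tf.random.normal([1, 64])  # randomly generated input vector
fake_sample = generative(noise)    # complex output produced from noise
```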
Even if you didn’t know it at the time, you have probably seen other instances of generative models in the news alongside the discriminative examples given earlier. A prominent example is deepfakes, which are videos in which one person’s face has been systematically replaced with another’s by using a neural network to remap the facial features of the source onto the target footage.
You may also have seen stories about AI models that generate fake news. Researchers at the firm OpenAI were initially reluctant to release these models to the public due to concerns that they could be used to create propaganda and misinformation.
In these and other applications, such as Google’s voice assistant Duplex, which can make a restaurant reservation by dynamically holding a conversation with a human in real time, generative models are producing convincingly human-like output.
Style transfer, in which the artistic style of one image is applied to the content of another, is a further prominent application of generative adversarial networks (GANs).
These models can handle complex information in a variety of domains: creating photorealistic images, applying stylistic filters to pictures, synthesizing sound, generating conversational text, and even learning rules for optimally playing video games.
Implementing generative models
While generative models could theoretically be implemented using a wide variety of machine learning algorithms, in practice, they are usually built with deep neural networks, which are well suited to capturing complex variations in data such as images or language.
We’ll focus on implementing these deep generative models for many different applications using TensorFlow 2. TensorFlow 2 is a framework, written in C++ with APIs in the Python programming language, for developing and deploying deep learning models. It was open-sourced by Google in 2015 and has become one of the most popular libraries for the research and deployment of neural network models.
With the 2.0 release, much of the boilerplate code (sections of code that are repeated in multiple places with little to no variation) that characterized development in earlier versions of the library was replaced with high-level abstractions, allowing us to focus on the model rather than the plumbing of the computations. The 2.0 release also made eager execution the default mode, allowing network computations to be run on demand, which will be an important benefit for implementing some of our models.
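As a small illustration of what eager execution means in practice, operations in TensorFlow 2 run immediately and return concrete values, with no session or graph compilation step required:

```python
import tensorflow as tf

# Eager execution is the default in TensorFlow 2: operations run
# immediately and return concrete values, with no session required.
x = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])
y = tf.matmul(x, x)
print(y.numpy())  # [[ 7. 10.]
                  #  [15. 22.]]
```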
Indeed, as we will describe in more detail later, the surge of research into deep learning using large neural network models since 2006 has produced a wide variety of generative modeling approaches. The first of these was the restricted Boltzmann machine, which can be stacked in multiple layers to create a deep belief network. Later innovations included variational autoencoders (VAEs), which can efficiently generate complex data samples from random numbers.
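As a small preview of the VAE machinery we’ll build later, the sketch below shows the reparameterization trick that lets a VAE turn random numbers into latent codes while remaining trainable. The zero-valued mean and log-variance are placeholder assumptions standing in for the outputs of a trained encoder:

```python
import tensorflow as tf

# Placeholder encoder outputs (assumptions for illustration): a trained
# VAE encoder would predict these from an input data sample.
mean = tf.zeros([1, 2])
log_var = tf.zeros([1, 2])

# Reparameterization trick: sample standard Gaussian noise, then shift
# and scale it by the predicted mean and variance. The sampling stays
# differentiable, so the whole model can be trained by backpropagation.
eps = tf.random.normal(tf.shape(mean))
z = mean + tf.exp(0.5 * log_var) * eps  # z ~ N(mean, exp(log_var))
```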
We’ll also cover in detail the algorithm used to create the Portrait of Edmond Belamy: the GAN. Conceptually, the GAN model creates a competition between two neural networks.
One, termed the generator, produces realistic (or, in the case of the experiments by Obvious, artistic) images by starting from a set of random numbers and applying a learned mathematical transformation.
The second network, known as the discriminator, attempts to classify whether a picture comes from a set of real-world images or whether it was created by the generator.
In a sense, the generator is like an art student, producing new paintings from brushstrokes and creative inspiration, while the discriminator acts like a teacher, grading whether the student has produced work comparable to the paintings they are attempting to mimic. As the generator becomes better at fooling the discriminator, its output becomes closer and closer to the historical examples it is designed to copy.
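The following sketch shows how this two-network competition looks in TensorFlow 2. It is a minimal illustration, not the model Obvious used: the layer sizes, the flattened 28x28 images, and the learning rates are all illustrative assumptions.

```python
import tensorflow as tf

latent_dim = 64  # size of the random input vector (an illustrative choice)

# Generator: transforms random noise into a synthetic 28x28 "image"
# (flattened to 784 pixels), playing the art student in the analogy.
generator = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    tf.keras.layers.Dense(28 * 28, activation="sigmoid"),
])

# Discriminator: scores how likely an image is to be real rather than
# generated, playing the role of the teacher.
discriminator = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(28 * 28,)),
    tf.keras.layers.Dense(1),  # a single logit: real vs. fake
])

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images):  # real_images: a [batch, 784] tensor
    noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        real_logits = discriminator(real_images, training=True)
        fake_logits = discriminator(fake_images, training=True)
        # Discriminator loss: label real images 1 and generated images 0.
        d_loss = (bce(tf.ones_like(real_logits), real_logits)
                  + bce(tf.zeros_like(fake_logits), fake_logits))
        # Generator loss: fool the discriminator into labeling fakes as real.
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
```

Note how the two losses pull in opposite directions: the discriminator is rewarded for telling real from fake, while the generator is rewarded only when its fakes are scored as real.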
Another key innovation in generative models is in the domain of natural language data. By representing the complex interrelationships between words in a sentence in a computationally scalable way, the transformer network and the Bidirectional Encoder Representations from Transformers (BERT) model built on top of it provide powerful building blocks for generating textual data in applications such as chatbots.
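To give a flavor of how such building blocks are exposed in practice, the snippet below samples text from GPT-2, a decoder-style generative transformer, via the Hugging Face transformers library. Both the library and the model choice are assumptions for illustration (they are not part of this course’s TensorFlow setup, and BERT itself is an encoder, so free-form generation demos typically use a decoder-style model instead):

```python
# Assumes the Hugging Face transformers package is installed
# (pip install transformers); GPT-2 stands in as a generative transformer.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Generative models can", max_length=20)
print(result[0]["generated_text"])
```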
We’ll also learn how models such as GANs and VAEs can be used to generate not just images or text but also sets of rules that allow game-playing networks developed with reinforcement learning algorithms to process and navigate their environment more efficiently; in essence, they learn to learn. Generative modeling is a huge and constantly growing field of research, so we’ll cover these topics one at a time.