
Creating the Network from TensorFlow 2


Learn how to prepare and build the variational autoencoder model.


Now that we’ve downloaded the CIFAR-10 dataset, split it into test and training data, and reshaped and rescaled it, we are ready to build our VAE model. We’ll use the same Model API from the Keras module in TensorFlow 2. The TensorFlow documentation contains an example of how to implement a VAE using convolutional networks, and we’ll build on this code example. However, for our purposes, we’ll implement simpler VAE networks using MLP layers based on the original VAE paper, “Auto-Encoding Variational Bayes” (Kingma, Diederik P., and Max Welling, 2013, https://arxiv.org/abs/1312.6114), and show how we adapt the TensorFlow example to also allow for IAF modules in decoding.
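As a quick recap of that preparation step, here is a minimal sketch; the variable names and exact steps are assumptions standing in for the previous lesson's code, but the key points are flattening each image to a vector and rescaling pixels to the [0, 1] range:

import tensorflow as tf

# Load CIFAR-10; images arrive as (N, 32, 32, 3) uint8 arrays with values in 0-255.
(x_train, _), (x_test, _) = tf.keras.datasets.cifar10.load_data()

# Flatten each image into a 3,072-dimensional vector for the MLP layers and
# rescale the pixels to [0, 1], which the decoder's sigmoid output expects.
x_train = x_train.reshape(-1, 32 * 32 * 3).astype('float32') / 255.0
x_test = x_test.reshape(-1, 32 * 32 * 3).astype('float32') / 255.0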

In the original article, the authors propose two kinds of models for use in the VAE, both MLP feedforward networks: Gaussian and Bernoulli. These names reflect the probability distributions used for the outputs of the networks' final layers.

Bernoulli MLP

The Bernoulli MLP can be used as the network decoder, generating the simulated image $x$ from the latent vector $z$. The formula for the Bernoulli MLP is:

$$\log p(x \mid z) = \sum_{i=1}^{D} x_i \log y_i + (1 - x_i) \log(1 - y_i)$$

$$y = f_{\sigma}(W_2 \tanh(W_1 z + b_1) + b_2)$$

Where the first line is the cross-entropy function we use to determine whether the network generates an approximation of the original image in reconstruction, while $y$ is the output of a feedforward network with two layers: a $\tanh$ transformation followed by a sigmoidal function $f_{\sigma}$ to scale the output between $0$ and $1$. Recall that this scaling is why we had to normalize the CIFAR-10 pixels from their original values.

We can easily create this Bernoulli MLP network using the Keras API:

class BernoulliMLP(tf.keras.Model):
    def __init__(self, input_shape, name='BernoulliMLP', hidden_dim=10, latent_dim=10, **kwargs):
        super().__init__(name=name, **kwargs)
        self._h = tf.keras.layers.Dense(hidden_dim, activation='tanh')
        self._y = tf.keras.layers.Dense(latent_dim, activation='sigmoid')

    def call(self, x):
        return self._y(self._h(x)), None, None
  • Line 4: We define a dense layer with hidden_dim neurons and a hyperbolic tangent (tanh) activation function.

  • Line 5: We define the output layer with latent_dim neurons and sigmoid activation function.

  • Lines 7–8: We define the call method which implements the forward pass of the model. It takes an input x and passes it through the hidden layer _h followed by the output layer _y. It returns three values:

    • The output of the output layer _y, representing the reconstructed data.

    • None, indicating that no mean vector is produced.

    • None, indicating that no log variance vector is produced.

We just need to specify the dimensions of the single hidden layer and the latent output $z$, and we then specify the forward pass as a composition of these two layers. Note that call returns a tuple of three values, with the last two set to None. This is because, in our end model, we could use either the BernoulliMLP or the GaussianMLP as the decoder, and the GaussianMLP returns three values, as we'll see below. The BernoulliMLP utilizes a binary output and cross-entropy loss, so it can use just the single output, but we want the return signatures for the two decoders to match.
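To see how this shared return signature is used, here is a minimal sketch of instantiating the decoder and calling it on a dummy batch. The latent size of 64 and hidden size of 128 are illustrative assumptions, and latent_dim here sets the size of the output layer, which would be 3,072 for flattened CIFAR-10 images:

import tensorflow as tf

# Illustrative sizes: 64-dimensional latent input, 3,072-dimensional (32*32*3) output.
decoder = BernoulliMLP(input_shape=(64,), hidden_dim=128, latent_dim=3072)

# A dummy batch of latent vectors standing in for the encoder's output.
z = tf.random.normal((16, 64))

# Every decoder returns a three-tuple; BernoulliMLP leaves the mean and
# log-variance slots as None so its signature matches GaussianMLP's.
x_hat, mu, logvar = decoder(z)
print(x_hat.shape)  # (16, 3072)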

Gaussian MLP

The second network type proposed by the authors in the original VAE paper was a Gaussian MLP, whose formulas are as follows:

$$\log p(x \mid z) = \log \mathcal{N}(x; \mu, \sigma^2 I)$$

$$\mu = W_4 h + b_4$$

$$\log \sigma^2 = W_5 h + b_5$$

$$h = \tanh(W_3 z + b_3)$$

This network can be used as either the encoder (generating the latent vector $z$) or the decoder (generating the simulated image $x$) in the network. The equations above assume that it is used as the decoder, and for the encoder, we just switch the $x$ and $z$ variables. As you can see, this network has two types of layers, a hidden layer given by a $\tanh$ transformation of the input, and two output layers, each given by linear transformations of the hidden ...