
Creating the Network from TensorFlow 2


Learn how to prepare and build the variational autoencoder model.


Now that we’ve downloaded the CIFAR-10 dataset, split it into test and training data, and reshaped and rescaled it, we are ready to build our VAE model. We’ll use the same Model API from the Keras module in TensorFlow 2. The TensorFlow documentation contains an example of how to implement a VAE using convolutional networks, and we’ll build on this code example. However, for our purposes, we’ll implement simpler VAE networks using MLP layers based on the original VAE paper, “Auto-Encoding Variational Bayes” (Kingma, Diederik P., and Max Welling, 2013, https://arxiv.org/abs/1312.6114), and show how we adapt the TensorFlow example to also allow for IAF modules in decoding.
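As a quick recap of that preparation step, here is a minimal sketch; the variable names and exact steps are assumptions standing in for the previous lesson's code, but the key points are flattening each image to a vector and rescaling pixels to the [0, 1] range:

import tensorflow as tf

# Load CIFAR-10; images arrive as (N, 32, 32, 3) uint8 arrays with values in 0-255.
(x_train, _), (x_test, _) = tf.keras.datasets.cifar10.load_data()

# Flatten each image into a 3,072-dimensional vector for the MLP layers and
# rescale the pixels to [0, 1], which the decoder's sigmoid output expects.
x_train = x_train.reshape(-1, 32 * 32 * 3).astype('float32') / 255.0
x_test = x_test.reshape(-1, 32 * 32 * 3).astype('float32') / 255.0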

In the original article, the authors propose two kinds of models for use in the VAE, both MLP feedforward networks: Gaussian and Bernoulli. These names reflect the probability distributions used for the outputs of the networks' final layers.

Bernoulli MLP

The Bernoulli MLP can be used as the network decoder, generating the simulated image $x$ from the latent vector $z$. The formula for the Bernoulli MLP is:

$$\log p(x \mid z) = \sum_{i=1}^{D} x_i \log y_i + (1 - x_i) \log(1 - y_i)$$

$$y = f_{\sigma}(W_2 \tanh(W_1 z + b_1) + b_2)$$

Where the first line is the cross-entropy function we use to determine whether the network generates an approximation of the original image in reconstruction, while $y$ is the output of a feedforward network with two layers: a $\tanh$ transformation followed by a sigmoidal function $f_{\sigma}$ to scale the output between $0$ and $1$. Recall that this scaling is why we had to normalize the CIFAR-10 pixels from their original values.

We can easily create this Bernoulli MLP network using the Keras API:

class BernoulliMLP(tf.keras.Model):
    def __init__(self, input_shape, name='BernoulliMLP', hidden_dim=10, latent_dim=10, **kwargs):
        super().__init__(name=name, **kwargs)
        self._h = tf.keras.layers.Dense(hidden_dim, activation='tanh')
        self._y = tf.keras.layers.Dense(latent_dim, activation='sigmoid')

    def call(self, x):
        return self._y(self._h(x)), None, None
  • Line 4: We define a dense layer with hidden_dim neurons and a hyperbolic tangent (tanh) activation function.

  • Line 5: We define the output layer with latent_dim neurons and sigmoid activation function.

  • Lines 7–8: We define the call method which implements the forward pass of the model. It takes an input x and passes it through the hidden layer _h followed by the output layer _y. It returns three values:

    • The output of the output layer _y, representing the reconstructed data.

    • None, indicating that no mean vector is produced.

    • None, indicating that no log variance vector is produced.

We just need to specify the dimensions of the single hidden layer and the latent output $z$, and we then specify the forward pass as a composition of these two layers. Note that call returns a tuple of three values, with the last two set to None. This is because, in our end model, we could use either the BernoulliMLP or the GaussianMLP as the decoder, and the GaussianMLP returns three values, as we'll see below. The BernoulliMLP utilizes a binary output and cross-entropy loss, so it can use just the single output, but we want the return signatures for the two decoders to match.
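To see how this shared return signature is used, here is a minimal sketch of instantiating the decoder and calling it on a dummy batch. The latent size of 64 and hidden size of 128 are illustrative assumptions, and latent_dim here sets the size of the output layer, which would be 3,072 for flattened CIFAR-10 images:

import tensorflow as tf

# Illustrative sizes: 64-dimensional latent input, 3,072-dimensional (32*32*3) output.
decoder = BernoulliMLP(input_shape=(64,), hidden_dim=128, latent_dim=3072)

# A dummy batch of latent vectors standing in for the encoder's output.
z = tf.random.normal((16, 64))

# Every decoder returns a three-tuple; BernoulliMLP leaves the mean and
# log-variance slots as None so its signature matches GaussianMLP's.
x_hat, mu, logvar = decoder(z)
print(x_hat.shape)  # (16, 3072)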

Gaussian MLP

The second network type proposed by the authors in the original VAE paper was a Gaussian MLP, whose formulas are as follows:

$$\log p(x \mid z) = \log \mathcal{N}(x; \mu, \sigma^2 I)$$

$$\mu = W_4 h + b_4$$

$$\log \sigma^2 = W_5 h + b_5$$

$$h = \tanh(W_3 z + b_3)$$

This network can be used as either the encoder (generating the latent vector $z$) or the decoder (generating the simulated image $x$) in the network. The equations above assume that it is used as the decoder, and for the encoder, we just switch the $x$ and $z$ variables. As you can see, this network has two types of layers, a hidden layer given by a $\tanh$ transformation of the input, and two output layers, each given by linear transformations of the hidden ...