Implementation of pix2pix

Understand how to implement the pix2pix model effectively, incorporating a PatchGAN discriminator and a U-Net generator with skip connections to enhance image synthesis.


The implementation of pix2pix introduces two novelties with respect to the previous lessons: a PatchGAN discriminator and a U-Net generator. Unlike discriminator architectures that output a single value per image, the PatchGAN discriminator outputs multiple values per image, one for each patch of the input. The U-Net architecture is used in the generator and adds skip connections between encoder and decoder layers, where the connected pairs of layers are increasingly far from each other in the network.
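To make "multiple values per image" concrete, the following sketch computes how many values a PatchGAN-style discriminator would output for a square input. The layer stack used here (five 4x4 convolutions with strides 2, 2, 2, 1, 1 and padding 1) is an assumption borrowed from a commonly used PatchGAN configuration, not necessarily the one used in this lesson, and the names `conv_output_size`, `patchgan_output_size`, and `receptive_field` are hypothetical helpers.

```python
# Assumed layer stack: (kernel, stride, padding) for each convolution.
# This is an illustrative configuration, not taken from the lesson's code.
PATCHGAN_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 1), (4, 1, 1)]


def conv_output_size(size, kernel, stride, padding):
    """Spatial size after one convolution, using floor division."""
    return (size + 2 * padding - kernel) // stride + 1


def patchgan_output_size(image_size, layers=PATCHGAN_LAYERS):
    """Side length of the discriminator's output grid for a square image."""
    size = image_size
    for kernel, stride, padding in layers:
        size = conv_output_size(size, kernel, stride, padding)
    return size


def receptive_field(layers=PATCHGAN_LAYERS):
    """Side length of the input patch that influences one output value."""
    field = 1
    # Walk backward from the output toward the input.
    for kernel, stride, _ in reversed(layers):
        field = (field - 1) * stride + kernel
    return field


grid = patchgan_output_size(256)
print(grid, grid * grid, receptive_field())  # 30 900 70
```

Under these assumptions, a 256x256 image yields a 30x30 grid of 900 values, and each value judges a 70x70 patch of the input.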

The layers used in pix2pix models can be grouped into two blocks. The encoding block is a stack of a convolution, batch normalization, and a Leaky ReLU nonlinearity. The decoding block is a stack of an upsampled convolution, batch normalization, and a ReLU nonlinearity. The decoding block also has a skip connection that concatenates a skip input to the layer's output.

The imports used in modeling pix2pix do not offer us any novelty. Let’s skip them this time and look at the custom layers that are designed to make the code more readable.

Custom layers

In the pix2pix implementation, many code blocks are repeated, so we wrap them in individual methods for ease of use. As described earlier, we need an encoding block and a decoding block. The novelty is the skip input in the decoding block, which is concatenated with the layer output.
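The lesson's own layer code is not reproduced here. As a stand-in, the sketch below traces tensor shapes through the two blocks using plain tuples of the form (height, width, channels). It is a shape-level illustration only: it assumes stride-2 encoding, 2x upsampling in decoding, and illustrative filter counts, and the names `encoder_block`, `decoder_block`, and `unet_shapes` are hypothetical.

```python
def encoder_block(shape, filters):
    """Assumed stride-2 convolution: halves height and width."""
    height, width, _ = shape
    return (height // 2, width // 2, filters)


def decoder_block(shape, skip_shape, filters):
    """Upsample by 2, convolve to `filters` channels, concatenate the skip input."""
    height, width, _ = shape
    up_height, up_width = height * 2, width * 2
    if (up_height, up_width) != skip_shape[:2]:
        raise ValueError("skip input must match the upsampled spatial size")
    # Concatenation along the channel axis adds the channel counts.
    return (up_height, up_width, filters + skip_shape[2])


def unet_shapes(input_shape, encoder_filters):
    """Trace shapes through an encoder and a mirrored decoder with skips."""
    x = input_shape
    skips = []
    for filters in encoder_filters:
        x = encoder_block(x, filters)
        skips.append(x)
    skips.pop()  # the deepest output feeds the decoder; it is not a skip input
    # The first decoder block pairs with the last remaining encoder output,
    # so later decoder blocks connect to increasingly distant encoder blocks.
    for skip in reversed(skips):
        x = decoder_block(x, skip, skip[2])
    return x


print(unet_shapes((256, 256, 3), [64, 128, 256, 512]))  # (128, 128, 128)
```

In this trace, each decoder output has twice the channels of its convolution because of the concatenated skip input. A final upsampling layer, omitted here, would restore the original 256x256 resolution.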
