Model Design Cheat Sheet

Understand the model design cheat sheet and the different design choices it covers.

Now, we will provide an overview of the choices we can make when designing the architecture of GAN models, and of deep learning models in general. It is perfectly fine to borrow model architectures directly from papers. However, it is also important to know how to adjust a model, or create a brand-new model from scratch, according to the practical problem at hand. Other factors, such as GPU memory capacity and expected training time, should also be considered when we design our models.

We will talk about the following:

  • Overall model architecture design

  • Choosing a convolution operation method

  • Choosing a downsampling operation method

Overall model architecture design

There are mainly two different design processes for deep learning models. They are suited to different scenarios, and we should get comfortable with both:

  • Design the whole network directly, especially for shallow networks. We can add/remove any layer in our network with ease. With this approach, we can easily notice any bottlenecks in our network (for example, which layer needs more/fewer neurons), which is extremely important when we are designing models that will run on mobile devices.

  • Design a small block/cell (containing several layers or operations) and repeat it several times to form the whole network. This process is very popular in very deep networks, especially in neural architecture search (NAS). It is a bit harder to spot weak spots in such a model because all we can do is adjust the block, train the whole network for hours, and see whether our adjustments lead to higher performance.

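The block-based process above can be sketched in a few lines of NumPy. This is only an illustration (the function names are made up, not from any framework): one block of layers is defined once, and the whole network is formed by repeating it, so adjusting the design means adjusting `block` rather than every individual layer.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def block(x, w1, w2):
    # One block/cell: two linear layers with ReLU activations.
    return relu(relu(x @ w1) @ w2)

def build_params(dim, num_blocks, rng):
    # Each repeated block gets its own weights, but the same shape.
    return [(rng.standard_normal((dim, dim)) * 0.1,
             rng.standard_normal((dim, dim)) * 0.1)
            for _ in range(num_blocks)]

def forward(x, params):
    # The whole network is just the same block structure repeated.
    for w1, w2 in params:
        x = block(x, w1, w2)
    return x

rng = np.random.default_rng(0)
params = build_params(dim=8, num_blocks=4, rng=rng)
x = rng.standard_normal((2, 8))
y = forward(x, params)
print(y.shape)  # (2, 8)
```

Making the network deeper is now a one-parameter change (`num_blocks`), which is exactly why this process scales to very deep networks, at the cost of coarser control over individual layers.
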
U-Net-shaped and ResNet-shaped networks are designed via a block-based approach and use skip connections to connect non-adjacent layers. There are two different forms of data flow in neural networks:

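As a minimal NumPy illustration of the skip connections mentioned above (function names here are hypothetical), a ResNet-style connection adds a non-adjacent layer's output back in, while a U-Net-style connection concatenates features from a non-adjacent layer:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_skip(x, w1, w2):
    # ResNet-style: add the block's input back to its output, so the
    # skip path carries data around the inner layers unchanged.
    return x + relu(x @ w1) @ w2

def concat_skip(x, earlier_features):
    # U-Net-style: concatenate features from a non-adjacent layer
    # along the feature axis instead of adding them.
    return np.concatenate([x, earlier_features], axis=-1)

rng = np.random.default_rng(0)
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
x = rng.standard_normal((2, 8))

print(residual_skip(x, w1, w2).shape)  # (2, 8): same shape as the input
print(concat_skip(x, x).shape)         # (2, 16): feature dimension doubles
```

Note the practical difference: addition keeps the feature dimension fixed, whereas concatenation grows it, so the layers that follow must be sized accordingly.
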