Shortcut

Understand how shortcuts can improve performance in large models.

Chapter Goals:

  • Learn about mapping functions and identity mapping
  • Understand the purpose of a shortcut for residual learning
  • Implement a wrapper for the pre-activation function that also returns the shortcut

A. Mapping functions

To understand the intuition behind ResNet, we need to discuss its building blocks. Each ResNet building block takes in an input, $\mathbf{x}$, and produces some output $\mathcal{H}(\mathbf{x})$, where $\mathcal{H}$ represents the block's mapping function. The mapping function, $\mathcal{H}$, is a mathematical representation of the block itself; it takes in an input and, using the weights within the block, produces an output.

[Image: block $B$ mapping input $\mathbf{x}$ to output $\mathcal{H}(\mathbf{x})$]

The image above shows the input and output for block $B$ with mapping function $\mathcal{H}$.

For now, it suffices to know that a block is just a stack of convolution layers (we’ll discuss the inner workings of a block in the next two chapters).
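As a rough sketch of this idea, the code below treats a block as a callable mapping function: it takes an input tensor $\mathbf{x}$ and returns $\mathcal{H}(\mathbf{x})$ by passing it through a small stack of convolution layers. The function name, filter counts, and use of TensorFlow/Keras layers are illustrative assumptions here, not the course's actual block implementation.

```python
import tensorflow as tf

def block_mapping(x, filters=64, kernel_size=3):
    """A hypothetical block B: applies a small stack of convolution
    layers to the input x and returns the block's output H(x)."""
    out = tf.keras.layers.Conv2D(filters, kernel_size,
                                 padding='same', activation='relu')(x)
    out = tf.keras.layers.Conv2D(filters, kernel_size,
                                 padding='same')(out)
    return out  # H(x)

# Example: one 32x32 RGB image passed through the block
x = tf.random.normal([1, 32, 32, 3])
h_x = block_mapping(x)
print(h_x.shape)  # (1, 32, 32, 64)
```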

B. Identity mapping

We've previously discussed the issue of degradation, where a model's performance decreases once layers are added beyond a certain depth. Despite how readily it shows up in practice, degradation is actually a pretty counter-intuitive problem.

Let's say our model performs well at 40 convolution layers, so we want to add 20 more convolution layers (60 layers total). There is a simple way for the larger model to achieve the same performance as the smaller model: use the same weights as the smaller model for the first 40 layers, then make the last 20 convolution layers an identity mapping.

An identity mapping just means the output for a layer (or set of layers) is equal to the input. In other words, the last 20 layers would compute $\mathcal{H}(\mathbf{x}) = \mathbf{x}$, so stacking them on top of the smaller model leaves its output unchanged.
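To make the idea concrete, here is a minimal sketch (the function name and tensor shapes are illustrative assumptions) of an identity mapping: the extra layers simply return their input unchanged, so the 60-layer model's output would match the 40-layer model's output exactly.

```python
import tensorflow as tf

def identity_mapping(x):
    """An identity mapping: the layer (or set of layers) returns its
    input unchanged, i.e. H(x) = x."""
    return x

# If the 20 added layers behaved as identity mappings, the larger model
# would reproduce the smaller model's output exactly.
x = tf.random.normal([1, 32, 32, 64])
out = identity_mapping(x)
print(bool(tf.reduce_all(tf.equal(out, x))))  # True: output equals input
```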
