Padding, Strides, and Dilation

Learn about padding, strides, and dilation in convolutional neural networks (CNNs), and the purpose of 1×1 convolutions in optimizing deep network architectures for effective feature extraction and dimensionality reduction.


Padding

In the previous lesson, convolution was illustrated using an arbitrary input and kernel of sizes 5×5 and 3×5, respectively. It was observed that the 5×5 input was reduced to a 3×1 output. Consider a network with several such layers: at every layer, the feature map size shrinks considerably.
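The shrinkage is easy to verify. Below is a minimal sketch (assuming a PyTorch environment and random single-channel tensors, which are illustrative and not values from the lesson) that convolves a 5×5 input with a 3×5 kernel and prints the resulting 3×1 output shape.

```python
import torch
import torch.nn.functional as F

# Random single-channel 5x5 input and 3x5 kernel (illustrative values only).
x = torch.randn(1, 1, 5, 5)   # (batch, channels, height, width)
k = torch.randn(1, 1, 3, 5)   # (out_channels, in_channels, kernel_h, kernel_w)

# Unpadded convolution with stride 1 shrinks the spatial dimensions:
# height: 5 - 3 + 1 = 3, width: 5 - 5 + 1 = 1.
out = F.conv2d(x, k)
print(out.shape)              # torch.Size([1, 1, 3, 1])
```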

The shrinking of the feature map is not a problem if the network is shallow or the input is high-dimensional. This feature reduction can, in fact, be beneficial: it acts as a form of regularisation by reducing the dimensionality.

However, a rapid reduction in the feature map size either prevents the addition of subsequent layers or renders the higher-level layers ineffective, because we quickly run out of features for the next convolutional layer to operate on. This easily becomes an impediment to constructing deep networks.

Note: Feature maps shrink considerably in traditional convolution. It becomes an issue in deep networks as we quickly run out of features.
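A small sketch illustrates the note above (the 7×7 input and single-channel layers below are illustrative assumptions, not values from the lesson): each unpadded 3×3 convolution trims two pixels from every spatial dimension, so only three such layers fit before the feature map collapses to 1×1 and a fourth layer would fail.

```python
import torch
import torch.nn as nn

# Each unpadded 3x3 convolution removes 2 pixels per spatial dimension.
x = torch.randn(1, 1, 7, 7)
layers = nn.Sequential(
    nn.Conv2d(1, 1, kernel_size=3),  # 7x7 -> 5x5
    nn.Conv2d(1, 1, kernel_size=3),  # 5x5 -> 3x3
    nn.Conv2d(1, 1, kernel_size=3),  # 3x3 -> 1x1
)
print(layers(x).shape)               # torch.Size([1, 1, 1, 1]); no room for another 3x3 layer
```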

The issue can be resolved with a padded convolutional layer. In padding, the size of the input is artificially increased so that the output feature map is larger.
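For a square input of size n, kernel size k, padding p, and stride 1, the output size is n − k + 2p + 1. With n = 5, k = 3, and p = 1, this gives 5 − 3 + 2 + 1 = 5, so the output matches the input. The sketch below (again assuming a PyTorch environment and a random single-channel input) shows this in code: a one-pixel border of zeros lets a 3×3 convolution return a 5×5 output from a 5×5 input.

```python
import torch
import torch.nn as nn

# padding=1 adds a one-pixel border of zeros around the 5x5 input,
# so the 3x3 convolution preserves the spatial size: 5 - 3 + 2*1 + 1 = 5.
x = torch.randn(1, 1, 5, 5)
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
print(conv(x).shape)   # torch.Size([1, 1, 5, 5])
```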

Padding was originally designed to make the feature map the same size as the input. It is done by appending 0's along the periphery of the input. The illustration below shows the padding of the 5×5 ...