Max Pooling

Understand how pooling layers can improve a CNN's performance.

Chapter Goals:

  • Learn about max pooling and its purpose in CNNs
  • Apply max pooling to the output of the convolution layer

A. Purpose of pooling

While the convolution layer extracts important hidden features, the number of features can still be quite large. We can use pooling to reduce the size of the data in the height and width dimensions. This allows the model to perform fewer computations and ultimately train faster. It also helps reduce overfitting, by keeping only the most salient features and ignoring potential distortions or uncommon features found in only a few examples.

B. How pooling works

Similar to a convolution, we use filter matrices in pooling. However, the pooling filter doesn't have any weights, nor does it perform matrix dot products. Instead, it applies a reduction operation to subsections of the input data.

The type of pooling most commonly used in CNNs is max pooling. A max pooling filter applies the max operation to each submatrix of the input data that it covers, keeping only the largest value. An example of max pooling is shown below:

Max pooling with a 2x2 filter and stride size of 1. Note that the output data has reduced dimensions compared to the input data.
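
To make the sliding-window idea concrete, here is a minimal pure-Python sketch of max pooling. The function name max_pool_2d, the 3x3 input values, and the choice of no padding are assumptions made for illustration; they are not taken from the figure or from any particular library.

```python
def max_pool_2d(data, filter_h, filter_w, stride):
    """Max pooling over a 2-D list of numbers.

    Assumption of this sketch: no padding, so only windows that fit
    entirely inside the input are used.
    """
    in_h, in_w = len(data), len(data[0])
    output = []
    # Move the window's top-left corner across the input in steps of `stride`.
    for top in range(0, in_h - filter_h + 1, stride):
        row = []
        for left in range(0, in_w - filter_w + 1, stride):
            # Gather the values under the window and keep only the largest.
            window = [data[r][c]
                      for r in range(top, top + filter_h)
                      for c in range(left, left + filter_w)]
            row.append(max(window))
        output.append(row)
    return output


# Hypothetical 3x3 input, chosen only for illustration.
example = [
    [1, 3, 2],
    [4, 6, 5],
    [7, 9, 8],
]
pooled = max_pool_2d(example, filter_h=2, filter_w=2, stride=1)
print(pooled)  # [[6, 6], [9, 9]]
```

With a 2x2 filter and a stride of 1, the 3x3 input in this sketch shrinks to a 2x2 output. Note that the filter has no weights: the only thing it does is pick the maximum of each window.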

Other types of pooling include min pooling and average pooling, which use the min and average operations, respectively, over subsections of the input data.
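
Since the pooling types differ only in the reduction they apply, one way to sketch them is with a single function that takes the reduction as an argument. The names pool_2d, average, and grid below are hypothetical, and the sketch again assumes no padding.

```python
def pool_2d(data, filter_h, filter_w, stride, reduce_fn):
    """Apply `reduce_fn` to every window of a 2-D list (no padding assumed)."""
    in_h, in_w = len(data), len(data[0])
    return [
        [
            # Reduce the values under the window at (top, left) to one number.
            reduce_fn([data[r][c]
                       for r in range(top, top + filter_h)
                       for c in range(left, left + filter_w)])
            for left in range(0, in_w - filter_w + 1, stride)
        ]
        for top in range(0, in_h - filter_h + 1, stride)
    ]


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical 3x3 input, chosen only for illustration.
grid = [
    [1, 3, 2],
    [4, 6, 5],
    [7, 9, 8],
]
max_pooled = pool_2d(grid, 2, 2, 1, max)
min_pooled = pool_2d(grid, 2, 2, 1, min)
avg_pooled = pool_2d(grid, 2, 2, 1, average)
print(max_pooled)  # [[6, 6], [9, 9]]
print(min_pooled)  # [[1, 2], [4, 5]]
print(avg_pooled)  # [[3.5, 4.0], [6.5, 7.0]]
```

Swapping the reduction changes which value represents each window, but the output dimensions stay the same for a given filter size and stride.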

For input data with dimensions $H_{in} \times W_{in}$, the output of pooling with filter dimensions $H_F \times W_F$ ...