Understanding CNNs: Pooling Operations
Learn about CNN pooling operations.
Pooling operation
The pooling operation, sometimes called the subsampling operation, was introduced to CNNs mainly to reduce the size of the intermediate outputs and to make the CNN invariant to small translations in the input. This is preferred over the natural dimensionality reduction caused by convolution without padding because a pooling layer lets us decide where to reduce the output size, rather than forcing a reduction at every layer. If every unpadded convolution shrank its output, the number of layers we could stack in a CNN model would be strictly limited.
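To make the depth limit concrete, here is a minimal sketch that counts how many unpadded, stride-1 convolutions fit before the output becomes smaller than the kernel. The function name `max_unpadded_layers` and the 28-pixel input are assumptions chosen for illustration, and the sketch assumes a square input with the same kernel size at every layer.

```python
def max_unpadded_layers(size, kernel):
    """Count stride-1 convolutions without padding that fit on a square input.

    Each such convolution maps a side length n to n - kernel + 1.
    Returns (number_of_layers, final_side_length).
    """
    if kernel < 2:
        # A 1x1 kernel never shrinks the output, so there is no limit to count.
        raise ValueError("kernel must be at least 2")
    layers = 0
    while size >= kernel:
        size = size - kernel + 1  # output side length after one more layer
        layers += 1
    return layers, size


# Hypothetical example: a 28x28 input with 3x3 kernels.
print(max_unpadded_layers(28, 3))  # (13, 2)
```

Under these assumptions, a 28x28 input supports only 13 such layers, which is why it helps to keep the size fixed with padding and shrink it deliberately with pooling.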
We define the pooling operation mathematically in the following sections. More precisely, we’ll discuss two types of pooling: max pooling and average pooling. First, however, we’ll define the notation. For an input of size
Max pooling
The max pooling operation picks the maximum element within the defined kernel of an input to produce the output. It shifts a window over the input (the middle squares in the figure below) and takes the maximum each time. Mathematically, we define the pooling equation as follows:
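As a rough illustration of the sliding-window maximum described above, here is a minimal pure-Python sketch. The function name `max_pool_2d` and its `kernel` and `stride` parameters are assumptions for illustration, not notation from this lesson, and the sketch assumes a single-channel 2D input with no padding.

```python
def max_pool_2d(x, kernel=2, stride=2):
    """Max pooling over a 2D list of numbers, with no padding."""
    height, width = len(x), len(x[0])
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Collect every value inside the current kernel-sized window.
            window = [
                x[r][c]
                for r in range(top, top + kernel)
                for c in range(left, left + kernel)
            ]
            row.append(max(window))
        out.append(row)
    return out


# Hypothetical 4x4 input, pooled with a 2x2 window and a stride of 2.
image = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(max_pool_2d(image))  # [[6, 5], [7, 9]]
```

Each output value is the largest entry in its window, so the 4x4 input shrinks to 2x2. Deep learning libraries provide optimized versions of this operation; the loop form here is only meant to show the mechanics.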