Pooling operation

The pooling operation, sometimes known as the subsampling operation, was introduced to CNNs mainly to reduce the size of the intermediate outputs and to make the CNN invariant to small translations in the input. Pooling is preferred over the natural dimensionality reduction caused by convolution without padding because a pooling layer lets us decide where to reduce the size of the output, rather than forcing a reduction at every layer. Because every unpadded convolution shrinks its output, relying on that shrinkage alone would strictly limit the number of layers we can have in our CNN models.
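To make the size argument concrete, here is a minimal sketch in plain Python. It assumes the standard output-width formula for a window sliding without padding, $\lfloor (n - m) / s \rfloor + 1$, where the stride $s$ and the 28-wide input are assumptions made for this example. The function names `window_output_size` and `unpadded_conv_depth` are hypothetical helpers, not part of any library.

```python
def window_output_size(n, m, stride=1):
    """Output width when an m x m window slides over an n x n input without padding."""
    return (n - m) // stride + 1


def unpadded_conv_depth(n, m):
    """Count how many unpadded m x m convolutions (stride 1) fit before
    the output becomes narrower than the kernel itself."""
    if m < 2:
        raise ValueError("a 1 x 1 kernel never shrinks the output")
    layers = 0
    while n >= m:
        n = window_output_size(n, m)
        layers += 1
    return layers


# A 28-wide input loses 2 columns per unpadded 3 x 3 convolution,
# so the depth of the network is capped by the input size.
max_layers = unpadded_conv_depth(28, 3)

# A single 2 x 2 pooling layer with stride 2 halves the width instead,
# and we choose where in the network to place it.
pooled_width = window_output_size(28, 2, stride=2)

print(max_layers, pooled_width)
```

With padded convolutions the width stays constant between pooling layers, so the reduction happens only where a pooling layer is placed.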

We define the pooling operation mathematically in the following sections. More precisely, we’ll discuss two types of pooling: max pooling and average pooling. First, however, we’ll define the notation. For an input of size $n \times n$ and a kernel (analogous to the filter of a convolution layer) of size $m \times m$, where $n \geq m$, the convolution operation slides the patch of weights over the input. Let’s denote the input by $X$, the patch of weights by ...
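Before the formal definitions, the two pooling types named above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the book's implementation: the function names `pool2d` and `mean`, the `stride` parameter, and the sample input are assumptions introduced for this example. Only $n$ (the input size) and $m$ (the kernel size) come from the notation above.

```python
def pool2d(x, m, stride, reduce_fn):
    """Slide an m x m window over the square input x (a list of lists)
    and reduce each window to a single value with reduce_fn."""
    n = len(x)
    out_size = (n - m) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Collect the m * m values covered by the window at (i, j).
            window = [
                x[i * stride + a][j * stride + b]
                for a in range(m)
                for b in range(m)
            ]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


x = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 8, 3],
    [0, 9, 4, 5],
]

max_pooled = pool2d(x, 2, 2, max)   # [[6, 5], [9, 8]]
avg_pooled = pool2d(x, 2, 2, mean)  # [[3.5, 2.0], [4.5, 5.0]]
```

Max pooling keeps the largest value in each window, while average pooling keeps the mean. Both reduce the $4 \times 4$ input to $2 \times 2$ here because the window size and the stride are both 2.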
