Convolutional Neural Networks involve downsampling via convolution, which reduces the dimensions of the input image in order to extract meaningful feature details from it.
In contrast, upsampling increases the dimensions of a given input to match the required output dimensions.
Transposed convolutions are one of the upsampling techniques employed in convolutional neural networks.
The major steps involved in the process are described below.
It's necessary to ensure that the output dimensions are larger than the input dimensions.
For instance, an input space of 2x2 could be upsampled to an output space of 3x3.
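As a quick sanity check, here is a minimal sketch, assuming PyTorch's `nn.ConvTranspose2d` (the tensor names and values are illustrative only), that upsamples a 2x2 input to a 3x3 output with a 2x2 kernel and a unit stride:

```python
import torch
import torch.nn as nn

# Illustrative 2x2 input: (batch, channels, height, width)
x = torch.randn(1, 1, 2, 2)

# Transposed convolution with a 2x2 kernel and unit stride, no padding
upsample = nn.ConvTranspose2d(in_channels=1, out_channels=1,
                              kernel_size=2, stride=1)

y = upsample(x)
print(x.shape, "->", y.shape)  # torch.Size([1, 1, 2, 2]) -> torch.Size([1, 1, 3, 3])
```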
Kernel size can be varied according to the level of detail we want to preserve in the upsampling process.
Note: Large kernel dimensions would prevent overfitting but would also smooth out the feature details. On the other hand, kernel sizes that are too small would focus closely on the details, but the model would be inclined towards overfitting (a state in which a trained model fits the trends of the training dataset closely but is not general enough to work efficiently on unseen data instances).
Stride refers to the step size that we wish to take during convolution. A unit stride would mean that we progress the kernel with unit-sized increments over the input space.
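Assuming no padding, the output dimension of a transposed convolution can be worked out as output = (input − 1) × stride + kernel. For a 2x2 input, a 2x2 kernel, and a unit stride, this gives (2 − 1) × 1 + 2 = 3, i.e., a 3x3 output.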
Let's look at an example that takes an input space of dimensions 2x2, an output space of dimensions 3x3, kernel dimensions of 2x2, and a unit stride.
It's worth noting that overlapping spaces are added up as a part of the convolution process.
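The following is a minimal sketch of that example, assuming illustrative input and kernel values (the names `x` and `k` are placeholders, not from a trained model). It scales the kernel by each input element, places the scaled copies at stride-spaced offsets in the output, and sums the overlapping positions:

```python
import numpy as np

# Illustrative 2x2 input and 2x2 kernel (assumed values)
x = np.array([[1, 2],
              [3, 4]])
k = np.array([[1, 0],
              [0, 1]])

stride = 1
out_size = (x.shape[0] - 1) * stride + k.shape[0]   # (2 - 1) * 1 + 2 = 3
y = np.zeros((out_size, out_size))

# Each input element scales the kernel; the scaled copies are placed at
# stride-spaced offsets, and overlapping positions are added up.
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        y[i*stride:i*stride + k.shape[0],
          j*stride:j*stride + k.shape[1]] += x[i, j] * k

print(y)
```

In the printed 3x3 output, the center element is where all four scaled kernel copies overlap, so its value is the sum of their contributions; this is exactly the overlap-and-add behavior noted above.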