What is padding and its types in CNN?

Convolutional neural networks (CNNs) transformed computer vision by allowing machines to recognize visual patterns. An essential element in CNNs is padding, which refers to adding extra pixels/values around the input images (data) before applying operations. This Answer delves into padding, its significance, and its types in CNNs.

Significance of padding

Padding in CNN has two essential advantages that are described below:

  1. Preserving spatial information: Padding prevents the spatial dimensions from shrinking as the input passes through successive layers. By preserving the original spatial size, padding retains essential information at the edges.

  2. Mitigating border effects: Without padding, a filter cannot be centered on edge pixels, so the edges contribute to fewer output values than interior pixels. This causes unwanted border effects and under-representation of edge information. Padding addresses this by introducing extra pixels so the filter aligns properly at the borders.
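The effect of padding on output size can be made concrete with standard convolution arithmetic: for an n×n input, an f×f filter, padding p, and stride s, the output side length is ⌊(n − f + 2p)/s⌋ + 1. The following is a minimal sketch of this formula (the helper name conv_output_size is illustrative, not a library function):

```python
def conv_output_size(n, f, p=0, s=1):
    """Output spatial size of a convolution over a square input.

    n: input side length, f: filter side length,
    p: padding on each side, s: stride.
    """
    return (n - f + 2 * p) // s + 1

# Without padding, a 28x28 input shrinks after a 3x3 convolution.
print(conv_output_size(28, 3, p=0))  # 26
# With padding of 1 ("same" padding for a 3x3 filter), size is preserved.
print(conv_output_size(28, 3, p=1))  # 28
```

Stacking several unpadded layers compounds this shrinkage, which is why padding matters in deep networks.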

The two primary advantages of padding

Types of padding

Primarily, there are four types of padding, as discussed below:

  • Valid padding (or no padding): This type adds no extra pixels, so each convolution reduces the spatial dimensions. While computationally efficient, it can discard information at the edges.

  • Same padding: This type adds zeros around the input data so that the output's spatial dimensions match the input's. This preserves spatial information at the edges.

  • Reflective padding: This type mirrors the values at the input edges, creating a reflection of the border. It addresses border effects while keeping the padded values consistent with the nearby content.

  • Replicate padding: This type duplicates the values at the input edges, reducing border effects by extending the input with copies of its border values.
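These behaviors can be sketched with NumPy's np.pad function, whose 'constant', 'reflect', and 'edge' modes correspond to zero, reflective, and replicate padding, respectively, on a tiny 2×2 example:

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])

# Zero (same-style) padding: surround the input with zeros.
print(np.pad(x, 1, mode='constant', constant_values=0))
# [[0 0 0 0]
#  [0 1 2 0]
#  [0 3 4 0]
#  [0 0 0 0]]

# Reflective padding: mirror values at the edges.
print(np.pad(x, 1, mode='reflect'))
# [[4 3 4 3]
#  [2 1 2 1]
#  [4 3 4 3]
#  [2 1 2 1]]

# Replicate padding: repeat the edge values outward.
print(np.pad(x, 1, mode='edge'))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```

Each call expands the 2×2 input to 4×4; only the values used to fill the new border differ.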

The illustration below displays the types of padding:

The basic types of padding

Implementation of padding

Let’s implement zero padding and valid padding for demonstration purposes to see how they work.

Zero padding

Zero padding, also known as same padding, adds zeros around the border of an input image. The illustration below depicts how padding is applied to an image’s pixels so that the output retains the input’s spatial size:

An example of how Zero padding is applied to an input image

The following code demonstrates how zero padding is applied through the TensorFlow library:

import tensorflow as tf

model = tf.keras.models.Sequential([
    # Convolutional layer with "same" padding: output keeps the 28x28 size
    tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(28, 28, 1)),
    # Pooling layer
    tf.keras.layers.MaxPooling2D((2, 2)),
    # Flatten the output to feed into a dense layer
    tf.keras.layers.Flatten(),
    # Dense layer
    tf.keras.layers.Dense(128, activation='relu'),
    # Output layer (assuming 7 classes for classification)
    tf.keras.layers.Dense(7, activation='softmax')
])

# Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

The code above demonstrates a CNN classifier whose Conv2D layer uses same padding, meaning that the layer's output has the same spatial dimensions as its input.

Valid padding

Valid padding means that no padding is added to the input. The following code demonstrates how to implement valid padding through the TensorFlow library:

tf.keras.layers.Conv2D(32, (3, 3), padding='valid', activation='relu', input_shape=(28, 28, 1))
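As a sanity check on the shrinkage caused by valid padding, here is a minimal NumPy sketch of an unpadded convolution (the helper conv2d_valid is illustrative, not part of TensorFlow). A 3×3 filter slid over a 28×28 input yields a 26×26 output:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2-D convolution with valid (no) padding."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output value covers a full kh x kw window of the input.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(28, 28)  # a dummy 28x28 "image"
kernel = np.ones((3, 3))        # a 3x3 filter
print(conv2d_valid(image, kernel).shape)  # (26, 26)
```

Note that edge pixels appear in fewer windows than interior pixels, which is exactly the border effect padding is meant to mitigate.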

So far, we have seen how padding works on images (imagery data). Let's look at how padding is performed on text (textual data).

Padding in text processing

Processing sentences can be difficult because they come in varying lengths. Hence, padding can be applied to the start or end of the text so that all input sequences have the same length. The illustration below depicts this concept:

Applying padding at the start of a textual input

Note: In text, padding is applied after tokenization and encoding, in which sentences are split into smaller parts (tokens) and converted to numerical values, respectively.

The pad_sequences function from TensorFlow is used for this purpose:

from tensorflow.keras.preprocessing.sequence import pad_sequences

# A list of tokenized and encoded sentences of varying lengths
sequences = [
    [1, 2, 3, 4],
    [1, 2],
    [1, 2, 3, 4, 5, 6]
]

# Pad each sequence to a length of 10 by appending zeros at the end
padded_sequences = pad_sequences(sequences, padding='post', maxlen=10, value=0)
print(padded_sequences)
# [[1 2 3 4 0 0 0 0 0 0]
#  [1 2 0 0 0 0 0 0 0 0]
#  [1 2 3 4 5 6 0 0 0 0]]

Knowledge test

Solve the following quiz to evaluate your understanding of padding:

1. What is the primary role of padding in CNNs?

   A) Reducing spatial dimensions
   B) Enhancing computational efficiency
   C) Preserving spatial information
   D) Mitigating filter misalignment


Copyright ©2024 Educative, Inc. All rights reserved