What is convolution?

Convolution is a fundamental operation in various fields of science and engineering, particularly in signal processing, image processing, and machine learning. It is a mathematical operation that combines two functions to produce a third function, which represents how one function modifies the shape of another. Convolution plays a crucial role in filtering, feature extraction, and pattern recognition tasks.

Intuition

Imagine we’re running a cozy restaurant on a beautiful hill station. We’ve designed a special 3-day meal package for our guests. Here are the package details:

  • Day 1: We welcome our guests on the first day and provide them with one delicious meal—dinner.

  • Day 2: On the second day, we step up our game and offer them three meals—breakfast, lunch, and dinner.

  • Day 3: On the last day of their stay, we bid them farewell with two meals—breakfast and lunch.

Now, let’s say we have a list of guests arriving each day:

  • On Monday, we have two guests.

  • On Tuesday, we have three guests.

  • On Wednesday, we have two guests.

  • On Thursday, we have one guest.

  • On Friday, we have five guests.

The challenge is to determine how many meals we serve each day. It’s not straightforward because guests stay for multiple days, causing their meals to overlap.

To tackle this, let’s flip the guest list so that the first day’s guests are on the right. Then, think of three dining areas, one for each day of the meal package.

  • On the first day, all guests dine in the first area and enjoy one dinner each.

  • On the second day, guests of the first day move to the second dining area for the second day’s three meals. The three new guests arrive at the first dining area, and we’ll serve them the first day’s meals.

  • Finally, on the third day, the guests of the first day move to the third dining area for final meals, guests of the second day shift to the second dining area for their second day’s meals, and two new guests will be served at the first dining area for their first day’s meal.

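By visualizing this process, we can calculate how many meals we serve each day while accounting for overlapping stays. For any given day, we flip the guest list, slide it so that the newest arrivals line up with the first dining area, multiply each group of guests by the meals served in their dining area, and add up the results. Since Friday’s guests stay through Sunday, the totals cover seven days:

  • Monday: 2 × 1 = 2 meals

  • Tuesday: 3 × 1 + 2 × 3 = 9 meals

  • Wednesday: 2 × 1 + 3 × 3 + 2 × 2 = 15 meals

  • Thursday: 1 × 1 + 2 × 3 + 3 × 2 = 13 meals

  • Friday: 5 × 1 + 1 × 3 + 2 × 2 = 12 meals

  • Saturday: 5 × 3 + 1 × 2 = 17 meals

  • Sunday: 5 × 2 = 10 meals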

This calculation represents the convolution of the meal package and the guest list, helping us efficiently manage our restaurant’s operations.
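The same bookkeeping can be sketched in Python by letting each day's arrivals add their meals to the days on which they stay. This is only an illustration of the restaurant example; the names `meals_per_stay_day`, `arrivals`, and `meals_served` are made up for this sketch.

```python
# Meals per guest on day 1, 2, and 3 of the package (from the package details).
meals_per_stay_day = [1, 3, 2]
# Guests arriving Monday through Friday.
arrivals = [2, 3, 2, 1, 5]

# Friday's guests stay two more days, so the schedule spans 5 + 3 - 1 = 7 days.
meals_served = [0] * (len(arrivals) + len(meals_per_stay_day) - 1)

for arrival_day, guests in enumerate(arrivals):
    for stay_day, meals in enumerate(meals_per_stay_day):
        # Guests who arrived on arrival_day eat `meals` meals stay_day days later.
        meals_served[arrival_day + stay_day] += guests * meals

print(meals_served)  # [2, 9, 15, 13, 12, 17, 10]
```

Each entry is the total for one day, Monday through Sunday.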

What is convolution?

At its core, convolution is a mathematical operation that blends two functions into a third function. For discrete functions, it is represented symbolically as:

$$(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n - m]$$

In the equation above:

  • f and g are the functions being convolved.

  • * denotes the convolution operation.

  • \sum adds up the products of the overlapping values of the two functions as one function is shifted over the other.

  • m iterates over all integers from negative to positive infinity, covering every possible overlap of the two functions.

  • n is the position (shift) at which the output of the convolution is evaluated.
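As a worked instance, we can apply the definition to the restaurant example. The labeling here is an assumption made for illustration: f is the guest list, f = [2, 3, 2, 1, 5], g is the meal package, g = [1, 3, 2], indices start at 0, and both functions are zero elsewhere. Evaluating the sum at n = 2 (Wednesday) leaves only three nonzero terms:

```latex
\begin{aligned}
(f * g)[2] &= \sum_{m} f[m]\, g[2 - m] \\
           &= f[0]\, g[2] + f[1]\, g[1] + f[2]\, g[0] \\
           &= 2 \cdot 2 + 3 \cdot 3 + 2 \cdot 1 = 15
\end{aligned}
```

The index n - m is what "flips" g: as m increases along f, the matching index into g decreases.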

Convolution process

The convolution process involves sliding one function (referred to as the “kernel” or “filter”) over the other function, multiplying their values at each point, and summing the results. This process is illustrated below:

  1. Overlay: Place the kernel function over the target function.

  2. Multiply: Multiply the corresponding values of the two functions.

  3. Sum: Sum up the products obtained from multiplication.

This process continues as the kernel slides or shifts across the target function to produce a new function that captures the kernel’s influence on the target at each position.
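The three steps can be sketched for one-dimensional lists as follows. This is a minimal illustration rather than an optimized implementation: the function name `slide_and_sum` is hypothetical, the kernel is reversed before sliding (as in the restaurant example), and the target is zero-padded so that partial overlaps at the edges are included.

```python
def slide_and_sum(target, kernel):
    """Full 1D convolution written as explicit overlay, multiply, and sum steps."""
    flipped = kernel[::-1]  # convolution slides the reversed kernel
    pad = len(kernel) - 1
    # Zero-pad both ends so the kernel can hang partly off either edge.
    padded = [0] * pad + list(target) + [0] * pad
    output = []
    for start in range(len(padded) - len(kernel) + 1):
        window = padded[start:start + len(kernel)]           # 1. overlay
        products = [w * k for w, k in zip(window, flipped)]  # 2. multiply
        output.append(sum(products))                         # 3. sum
    return output

print(slide_and_sum([2, 3, 2, 1, 5], [1, 3, 2]))  # [2, 9, 15, 13, 12, 17, 10]
```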


Stride and padding

Two parameters control how convolution is applied to two-dimensional data: stride and padding. Let’s understand these terms.

  • Stride: Stride is the number of pixels the filter/kernel moves at each step as it slides across the input image. A stride of 1 means the filter moves one pixel at a time, while a larger stride skips positions, resulting in a smaller output size.

Stride in the convolution operation
  • Padding: Padding adds extra pixels (typically zeros) around the border of the input image so that the kernel can also be applied at the edges. With enough padding, the output keeps the same spatial dimensions as the input, which preserves information near the borders and prevents the feature maps from shrinking.

Padding in the convolution operation
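To see how stride and padding change the result, here is a minimal NumPy sketch. The function name `conv2d_sketch` is hypothetical, and the sketch follows the convention common in deep learning of sliding the kernel without flipping it; it is meant to show the mechanics, not to be fast.

```python
import numpy as np

def conv2d_sketch(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image` with the given stride and zero padding.

    Follows the deep-learning convention: the kernel is not flipped.
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    # Padding: add `padding` rows/columns of zeros on every side.
    padded = np.pad(image, padding, mode="constant", constant_values=0)
    k_h, k_w = kernel.shape
    out_h = (padded.shape[0] - k_h) // stride + 1
    out_w = (padded.shape[1] - k_w) // stride + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            # Stride: the window's top-left corner jumps `stride` pixels per step.
            top, left = row * stride, col * stride
            window = padded[top:top + k_h, left:left + k_w]
            output[row, col] = np.sum(window * kernel)
    return output

image = np.arange(25).reshape(5, 5)  # made-up 5x5 input
kernel = np.ones((3, 3))             # made-up 3x3 kernel

print(conv2d_sketch(image, kernel).shape)                       # (3, 3)
print(conv2d_sketch(image, kernel, padding=1).shape)            # (5, 5)
print(conv2d_sketch(image, kernel, stride=2, padding=1).shape)  # (3, 3)
```

With no padding the 5×5 input shrinks to 3×3, a padding of 1 keeps it at 5×5, and a stride of 2 shrinks the padded result again.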

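Stride and padding both determine the size of the convolution output. We can use the following formulas to find the dimensions of the output:

$$output_{w} = \left\lfloor \frac{input_{w} - kernel_{w} + 2p}{s} \right\rfloor + 1$$

$$output_{h} = \left\lfloor \frac{input_{h} - kernel_{h} + 2p}{s} \right\rfloor + 1$$

In the equations above: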

  • input_{w} is the width of the input image.

  • kernel_{w} is the width of the kernel/filter.

  • p is the padding size.

  • s is the size of the stride.

  • input_{h} is the height of the input image.

  • kernel_{h} is the height of the kernel/filter.
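The output size can also be computed with a small helper. This sketch assumes the commonly used formula, floor((input - kernel + 2p) / s) + 1, applied to one dimension at a time, with p pixels of padding on each side; the name `conv_output_size` is made up for this example.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output length along one dimension (width or height)."""
    # Floor division drops positions where the kernel would not fully fit.
    return (input_size - kernel_size + 2 * padding) // stride + 1

print(conv_output_size(5, 2))                       # 4
print(conv_output_size(5, 3, padding=1))            # 5
print(conv_output_size(5, 3, padding=1, stride=2))  # 3
```

For a non-square input or kernel, call the helper once with the widths and once with the heights.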

Code examples

Let’s implement the basic convolution operation in Python from scratch:

def convolution(signal, kernel):
    output_length = len(signal) + len(kernel) - 1
    output = []

    for i in range(output_length):
        sum_i = 0  # accumulates output term i
        for j in range(len(kernel)):
            if i - j >= 0 and i - j < len(signal):  # skip indices outside the signal
                sum_i += signal[i - j] * kernel[j]
        output.append(sum_i)
    return output

signal = [2, 3, 2, 1, 5]
kernel = [1, 3, 2]

result = convolution(signal, kernel)
print("Convolution results:", result)

In the code above:

  • Lines 1–11: We define a convolution() function that implements the convolution operation on plain Python lists from scratch. This function receives two parameters: signal and kernel. Inside this function:

    • Lines 2–3: We calculate the output length, output_length, as the sum of the lengths of signal and kernel minus 1. We also create an empty list, output, to store the convolution output.

    • Lines 5–10: We loop over output_length to calculate each output term, accumulating it in the sum_i variable. A nested for loop iterates over the kernel. Inside it, the if statement keeps only the indices where kernel[j] overlaps a valid element of signal, namely signal[i - j], and adds their product to sum_i. After the inner loop finishes, we append sum_i to the output list.

    • Line 11: We return the output list.

  • Lines 13–14: We create the two lists to convolve: signal and kernel.

  • Lines 16–17: We call the convolution() function, store the output in the result variable, and print it.

Note: Read our Answer on implementing the 2D convolution function in Python.

Convolution using libraries

Python libraries such as NumPy and SciPy provide built-in convolution functions. Let’s use them to convolve the same signal and kernel lists:

import numpy as np
import scipy.signal  # import the submodule explicitly so scipy.signal is available

signal = [2, 3, 2, 1, 5]
kernel = [1, 3, 2]

print("Convolution result using NumPy:", np.convolve(signal, kernel))
print("Convolution result using SciPy:", scipy.signal.convolve(signal, kernel))

In the code above:

  • Lines 1–2: We import the numpy and scipy libraries.

  • Lines 4–5: We create the two lists to convolve: signal and kernel.

  • Line 7: We call the np.convolve() function to perform the convolution operation and print the output.

  • Line 8: We call the scipy.signal.convolve() function to perform the convolution operation and print the output.
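Both functions also accept a mode argument that controls how much of the result is returned, which plays a role similar to padding. The following sketch compares the three modes of np.convolve() on the same lists; the variable names `full`, `same`, and `valid` are chosen here for illustration.

```python
import numpy as np

signal = [2, 3, 2, 1, 5]
kernel = [1, 3, 2]

# 'full' (the default) includes every partial overlap: length 5 + 3 - 1 = 7.
full = np.convolve(signal, kernel, mode="full")
# 'same' returns the central part, with the same length as the longer input.
same = np.convolve(signal, kernel, mode="same")
# 'valid' keeps only positions where the kernel fully overlaps the signal.
valid = np.convolve(signal, kernel, mode="valid")

print(full)   # [ 2  9 15 13 12 17 10]
print(same)   # [ 9 15 13 12 17]
print(valid)  # [15 13 12]
```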

NumPy’s np.convolve() function only handles one-dimensional inputs, so for two-dimensional data, such as images, let’s perform the convolution using the SciPy library:

import numpy as np
import scipy.signal

signal = np.array([[3, 2, 5, 3, 1],
                   [7, 5, 6, 0, 2],
                   [4, 1, 6, 3, 4],
                   [2, 4, 2, 3, 1],
                   [1, 3, 5, 4, 1]])

kernel = np.array([[ 1, -1],
                   [-1,  1]])

# mode='valid' keeps only positions where the kernel fully overlaps the input
print("2D convolution result using SciPy:")
print(scipy.signal.convolve2d(signal, kernel, mode='valid'))

In the code above:

  • Lines 4–8: We create a two-dimensional NumPy array and name it signal.

  • Lines 10–11: We create another two-dimensional NumPy array and name it kernel.

  • Lines 14–15: We call the scipy.signal.convolve2d() function with mode='valid', which computes the output only where the kernel fully overlaps the input, and print the resulting 4×4 array.

Applications of convolution

Convolution has many use cases in various fields:

  • Signal processing: Convolution is extensively used in signal processing for tasks such as filtering, noise reduction, and modulation. For example, in audio processing, we can use convolution to simulate echo effects by convolving an audio signal with an impulse response representing a room’s acoustic properties.

  • Image processing: In image processing, we can apply convolution in various tasks, including edge detection, blurring, and sharpening. Related operations include transposed convolution (sometimes called deconvolution). Convolutional filters, such as Sobel and Gaussian filters, are commonly used for feature extraction and image enhancement.

  • Neural networks: Convolutional Neural Networks (CNNs) leverage convolutional layers to extract features from input data. CNNs are widely used in computer vision tasks such as object recognition, image classification, and segmentation due to their ability to learn hierarchical representations directly from pixel values.

  • Mathematics: Convolution arises in mathematical areas like polynomial multiplication, probability theory, and Fourier analysis. Multiplying two polynomials convolves their coefficient sequences. In probability theory, the convolution of probability distributions describes the distribution of the sum of independent random variables. In Fourier analysis, convolution plays a central role in understanding the relationship between functions in time and frequency domains.

  • Natural language processing (NLP): We can apply CNNs to process textual data in NLP tasks, such as sentiment analysis and text classification. CNNs can learn hierarchical representations of text by applying convolutional filters over sequences of words or characters. These learned features capture local patterns and dependencies within the text, enabling CNNs to effectively classify text inputs into different categories or sentiments.

  • Time series analysis: Convolution plays a crucial role in analyzing time series data, such as stock prices, weather patterns, or sensor readings. We can perform trend estimation and anomaly detection by utilizing convolution operations.

  • Medical imaging: In the field of medical imaging, convolution is used for various tasks, including image enhancement, segmentation, and feature extraction. We can apply convolutional filters to medical images (such as X-rays, MRIs, and CT scans) to enhance image quality, detect abnormalities, and delineate anatomical structures. Convolutional neural networks have also revolutionized medical image analysis by enabling automated diagnosis and disease detection from imaging data.
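Two of these applications fit in a few lines of NumPy. The sketch below uses made-up data: `readings` stands in for a short series of sensor values, and `die` is the probability distribution of one fair six-sided die.

```python
import numpy as np

# Time series: a 3-point moving average is a convolution with equal weights.
readings = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # made-up sensor values
window = np.ones(3) / 3
trend = np.convolve(readings, window, mode="valid")
print(trend)  # approximately [4. 6. 8.]

# Probability: the distribution of the sum of two independent fair dice is
# the convolution of the two single-die distributions.
die = np.ones(6) / 6
two_dice = np.convolve(die, die)  # entries correspond to sums 2 through 12
print(two_dice[5])                # probability of rolling a sum of 7, about 1/6
```

The moving average estimates the trend by smoothing neighboring readings, and the dice example shows why summing independent random variables leads to a convolution.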

Conclusion

In conclusion, convolution is a fundamental operation used in many different fields, from processing signals to training neural networks. It lets us combine and transform data to bring out patterns and features.

The example of running a restaurant on a hill station helps us grasp the concept in a simple way. Just like serving meals to guests, convolution combines different functions to create new ones that show how one thing affects another. In image processing, convolution helps us make images clearer or find edges. It’s also essential in neural networks, like those used in recognizing objects in pictures or understanding text. Convolution isn’t just limited to these areas. It’s also used in math for things like probability and analyzing trends over time. And in medical imaging, it’s vital for enhancing images and diagnosing diseases.

Its versatility makes convolution a key building block for working with data in modern technology and science.



Copyright ©2024 Educative, Inc. All rights reserved