Convolution is a mathematical operation that combines two matrices to produce a third. In convolutional neural networks, the first matrix is called the input matrix, the second is the kernel (or filter), and the output matrix is called the feature map. Convolution is used in image processing and computer vision to extract features from an image. In this Answer, we will explore how to perform convolution as matrix multiplication.
Note: Convolution is the primary operation involved in convolutional neural networks (CNNs).
The input matrix holds the pixel values of the image. Because a color image has three channels (RGB), the input is usually a 3-D array, with one 2-D matrix per channel. To show how convolution is applied, let us consider a single 4x4 matrix (input matrix). The matrix is shown below:
To apply the convolution operation, let's take the second matrix (kernel/filter) as shown in the illustration below:
The convolution operation is denoted by the asterisk (*) sign, as shown below:
The operation works by overlaying the kernel on the input matrix, multiplying each pair of overlapping elements, and summing the products; that sum becomes one value of the feature map. After that, we shift the kernel to the right by one column (
By following the above illustration, we can apply convolution to any input matrix and kernel, as long as the kernel is no larger than the input.
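The sliding-window steps described above, and the same computation written as a single matrix multiplication, can be sketched in plain Python. This is a minimal sketch under stated assumptions: the kernel moves one position at a time with no padding, the kernel is not flipped (the convention used in CNNs), and the 4x4 input and 2x2 kernel values are hypothetical, not the matrices from the illustrations. The function names `conv2d_sliding` and `conv2d_as_matmul` are made up for this example.

```python
def conv2d_sliding(image, kernel):
    """Slide the kernel over the image one step at a time (no padding)
    and sum the element-wise products at each position."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1  # feature map size
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(
                image[i + m][j + n] * kernel[m][n]
                for m in range(kh)
                for n in range(kw)
            )
    return out


def conv2d_as_matmul(image, kernel):
    """Compute the same feature map as one matrix-vector product."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    # Each row of `patches` is one kernel-sized window of the image, flattened.
    patches = [
        [image[i + m][j + n] for m in range(kh) for n in range(kw)]
        for i in range(oh)
        for j in range(ow)
    ]
    # Flatten the kernel in the same row-major order as the windows.
    flat_kernel = [kernel[m][n] for m in range(kh) for n in range(kw)]
    # (oh*ow x kh*kw) matrix times (kh*kw) vector.
    flat_out = [sum(p * k for p, k in zip(row, flat_kernel)) for row in patches]
    # Reshape the flat result back into an oh x ow feature map.
    return [flat_out[r * ow:(r + 1) * ow] for r in range(oh)]


# Hypothetical example values, chosen only for illustration.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d_sliding(image, kernel)
print(feature_map)  # [[-5, -5, -5], [-5, -5, -5], [-5, -5, -5]]
print(conv2d_as_matmul(image, kernel) == feature_map)  # True
```

With a 4x4 input and a 2x2 kernel, the kernel fits in 3 positions along each axis, so the feature map is 3x3. The matrix form does the same arithmetic: each window becomes a row, the kernel becomes a column vector, and one product yields every feature map value at once.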
Convolution is an important operation in image processing for tasks such as edge detection, blurring, and sharpening. The choice of kernel determines which feature is extracted.
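To illustrate how the kernel determines the extracted feature, the sketch below applies two well-known 3x3 kernels to the same image: a horizontal-gradient Sobel kernel, which responds to vertical edges, and a box (mean) kernel, which blurs. The 4x4 image values and the helper name `conv2d` are assumptions made for this example, and the kernel is applied without flipping, as in CNNs.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image one step at a time (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + m][j + n] * kernel[m][n]
                for m in range(kh)
                for n in range(kw)
            )
            for j in range(ow)
        ]
        for i in range(oh)
    ]


# Sobel kernel for the horizontal gradient: large output at vertical edges.
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Box kernel: summing a 3x3 window, then dividing by 9, averages it (blur).
box = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

# Hypothetical image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

edges = conv2d(image, sobel_x)
blurred = [[value / 9 for value in row] for row in conv2d(image, box)]

print(edges)    # [[36, 36], [36, 36]]
print(blurred)  # [[3.0, 6.0], [3.0, 6.0]]
```

The Sobel output is large because every window straddles the dark-to-bright boundary, while the box kernel replaces the sharp step from 0 to 9 with the intermediate values 3 and 6.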