How to perform convolution in matrix multiplication

Convolution is a mathematical operation that combines two matrices to produce a third matrix. In convolutional neural networks, the first matrix is called the input matrix, the second is the kernel (or filter), and the output matrix is called the feature map. Convolution is used in image processing and computer vision to extract features from an image. In this Answer, we will explore how to perform convolution as matrix multiplication.

Note: Convolution is the primary operation involved in convolutional neural networks (CNNs).

Input matrix

The input matrix contains the pixel values of the image. Because a color image has 3 color channels (RGB), the input is usually a 3-D matrix. To show how convolution is applied to matrices, let us consider a 4x4 matrix (input matrix). The matrix is shown below:

Visualization

The operation works by placing the kernel over a region of the input matrix, multiplying the overlapping elements element-wise, and summing the products; this sum becomes one value of the feature map. After that, we shift the kernel to the right by one column (stride = 1, where the stride is the number of rows/columns the kernel moves on each iteration) and repeat the multiply-and-sum step until the kernel cannot be shifted any further. Once we have iterated through the columns, we start again from the left-hand side of the matrix, shift the kernel down by one row (stride = 1), and continue performing the multiplication and sum operations. The step-by-step illustration is shown below:
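In code, the sliding-window procedure described above can be sketched as follows. This is a minimal illustration using NumPy, not a definitive implementation: the function names `conv2d_sliding` and `conv2d_matmul`, the 4x4 input values, and the 2x2 kernel are all assumptions chosen for the example and are not taken from the original figures. The second function shows one common way to express the same computation as a single matrix multiplication, by flattening every kernel-sized patch of the input into a row.

```python
import numpy as np

def conv2d_sliding(x, k, stride=1):
    """Slide kernel k over x; each output is the sum of element-wise products.
    As in the description above, the kernel is applied without flipping."""
    kh, kw = k.shape
    out_h = (x.shape[0] - kh) // stride + 1
    out_w = (x.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):            # move down one stride per row of output
        for j in range(out_w):        # move right one stride per column of output
            r, c = i * stride, j * stride
            patch = x[r:r + kh, c:c + kw]
            out[i, j] = np.sum(patch * k)
    return out

def conv2d_matmul(x, k, stride=1):
    """Same result as conv2d_sliding, computed as one matrix-vector product."""
    kh, kw = k.shape
    out_h = (x.shape[0] - kh) // stride + 1
    out_w = (x.shape[1] - kw) // stride + 1
    # Each row holds one flattened patch: shape (out_h * out_w, kh * kw).
    patches = np.array([
        x[i * stride:i * stride + kh, j * stride:j * stride + kw].ravel()
        for i in range(out_h)
        for j in range(out_w)
    ])
    return (patches @ k.ravel()).reshape(out_h, out_w)

# Hypothetical example values (not the matrices from the original figures).
x = np.arange(1, 17).reshape(4, 4)   # 4x4 input holding 1..16
k = np.array([[1, 0],
              [0, 1]])               # 2x2 kernel

feature_map = conv2d_sliding(x, k)
print(feature_map)
# [[ 7  9 11]
#  [15 17 19]
#  [23 25 27]]
print(np.array_equal(feature_map, conv2d_matmul(x, k)))  # True
```

With a 4x4 input, a 2x2 kernel, and stride = 1, the kernel fits in three positions along each axis, so the feature map is 3x3. The matrix-multiplication version trades extra memory (the patch matrix repeats overlapping input values) for a single product that linear-algebra libraries can typically run efficiently.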
