Sliding Window Applications

Learn about the concept of search via sliding windows and its applications.

In this lesson, we will explore some applications of sliding window traversal.

Sliding window applications

Traversing an array in subwindows to calculate either the maximum, minimum, or average of each window has many applications, including maximum pooling, minimum pooling, and average pooling.

Maximum pooling

Maximum pooling is one application of the sliding window. With maximum pooling, we can find the maximum element within a specific region (as shown in the animation below). This only gives the maximum values and, this way, the number of output elements is reduced too.

So given an image, if we wanted to extract only the most important features, we can use maximum pooling to extract only the important features and downscale (reduce in size) the image.

Here, using maximum pooling, we create a new array of a smaller size that contains just the maximum values of all the subwindows.

Look at the animation below to see what we mean.

Look at the code for maximum pooling below.

Press + to interact
main.cpp
SW.txt
int winMax(int M[][MaxCols], int wsr, int wsc, int wr, int wc)
{
int Max = M[wsr][wsc];
for (int r = 0; r < wr; r++)
{
for (int c = 0; c < wc; c++)
if (Max < M[wsr + r][wsc + c])
Max = M[wsr + r][wsc + c];
}
return Max;
}
void maxPooling(int M[][MaxCols], int Rows, int Cols, int MMax[][MaxCols], int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
MRows = Rows - wr + 1, MCols = Cols - wc + 1;
for (int wsr = 0; wsr <= Rows - wr; wsr += strideR)
{
for (int wsc = 0; wsc + wc <= Cols; wsc += strideC)
{
MMax[wsr][wsc] = winMax(M, wsr, wsc, wc, wr);
}
}
}
int main()
{
int M[MaxRows][MaxCols], Rows, Cols, wr, wc, strideR, strideC;
Load("SW.txt", M, Rows, Cols, wr, wc, strideR, strideC); // {'S','W', '\0'}
displayGrid("\nThe Matrix is: ", M, Rows, Cols);
int MMax[MaxRows][MaxCols], mRows, mCols;
maxPooling(M, Rows, Cols, MMax, mRows, mCols, wr, wc, strideR, strideC);
displayGrid("\nMax Pooling Matrix: ", MMax, mRows, mCols);
return 0;
}

The code above is an implementation of max pooling, which is a technique commonly used in deep learning and image processing. The code defines two functions, winMax() and maxPooling().

The winMax() function takes a 2D array, M, representing a window of size wr x wc, located at the wsr row and the wsc column, and returns the maximum value within that window. This function is used by the maxPooling() function to calculate the maximum value within each window of the input matrix.

The maxPooling() function takes a 2D array, M, representing the input matrix, its size Rows x Cols, a window size wr x wc, and stride values—strideR and strideC. It applies the max pooling technique to the input matrix, producing a new 2D array, MMax of size mRows x mCols. The maxPooling() function achieves this by iterating over each window in the input matrix, using the winMax() function to calculate the maximum value within that window, and storing the result in the corresponding position in the output matrix.

The main function loads a 2D array, M, from the SW.txt file, and displays it using the displayGrid() function. It then calls the maxPooling() function to apply max pooling to the input matrix and displays the resulting output matrix using the displayGrid() function.

Example 2: Average pooling

Average pooling is just like maximum pooling, except here, we take the average of all the values in the subwindows.

Look at the code below.

Press + to interact
main.cpp
SW.txt
float winAvg(int M[][MaxCols], int wsr, int wsc, int wr, int wc)
{
float Sum = 0; // Summing anchored at[wsr, wsc]
for (int r = 0; r < wr; r++) // A window with wr-rows
{
for (int c = 0; c < wc; c++) // and wc column
Sum += M[wsr + r][wsc + c]; // anchored at[wsr, wsc]
}
return Sum / (wr *wc);
}
void avgPooling(int M[][MaxCols], int Rows, int Cols, float MAvg[][MaxCols],
int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
MRows = Rows - wr + 1, MCols = Cols - wc + 1; // the size of the resultant matrix
for (int wsr = 0; wsr <= Rows - wr; wsr += strideR) // 0 to Rows-wr indices
{
for (int wsc = 0; wsc + wc <= Cols; wsc += strideC) // 0 to Cols-wc indices
{
// which can be rewritten as wsc+wc<=Cols
MAvg[wsr][wsc] = winAvg(M, wsr, wsc, wc, wr);
}
}
}
int main()
{
int M[MaxRows][MaxCols], Rows, Cols, wr, wc, strideR, strideC;
Load("SW.txt", M, Rows, Cols, wr, wc, strideR, strideC); // {'S','W', '\0'}
displayGrid("\n\nThe Matrix", M, Rows, Cols);
float avgM[MaxRows][MaxCols];
int avgRows, avgCols;
avgPooling(M, Rows, Cols, avgM, avgRows, avgCols, wr, wc, strideR, strideC);
displayGrid("\nAvg Pooling Matrix", avgM, avgRows, avgCols);
return 0;
}

Image processing

You must be wondering what’s the real-life purpose of the pooling techniques you learned above. Well then, let’s get right into it.

We know that an image is an array of pixels with values ranging from 0 to 255. What would happen if we applied maximum and average pooling to an array representing image pixels?

We set up a nice playground for you below to experiment with this. It contains the algorithms for maximum and minimum pooling (same as those above), with just one difference; rather than an int array, our algorithms will process unsigned char arrays. Why is this so?

Earlier, we mentioned that an image pixel ranges between 0 and 255, which can adequately be represented by a single byte. Since a char is of one byte, we use that instead of an int of four bytes (to conserve memory).

Feel free to experiment with the code below. Just make sure not to change the loadImage() and saveImage() functions.

Note: The widget below only supports images of dimensions less than or equal to 1024 x 512. You can use this sample image to test the code below.

#include <iostream>
using namespace std;

int winAvg(unsigned char M[][maxCols], int wsr, int wsc, int wr, int wc) {
    float Sum = 0;	// Summing anchored at[wsr, wsc]
	for (int r = 0; r < wr; r++)	// A window with wr-rows
	{
		for (int c = 0; c < wc; c++)	// and wc column
			Sum += M[wsr + r][wsc + c];	// anchored at[wsr, wsc]
	}

	return Sum / (wr *wc);
}

void avgPooling(unsigned char M[][maxCols], int Rows, int Cols, unsigned char MMax[][maxCols], int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
	MRows = Rows - wr + 1, MCols = Cols - wc + 1;
	for (int wsr = 0; wsr <= Rows - wr; wsr += strideR)
	{
		for (int wsc = 0; wsc + wc <= Cols; wsc += strideC)
		{
			MMax[wsr][wsc] = winAvg(M, wsr, wsc, wc, wr);
		}
	}
}

int winMax(unsigned char M[][maxCols], int wsr, int wsc, int wr, int wc)
{
	unsigned char Max = M[wsr][wsc];
	for (int r = 0; r < wr; r++)
	{
		for (int c = 0; c < wc; c++)
			if (Max < M[wsr + r][wsc + c])
				Max = M[wsr + r][wsc + c];
	}

	return Max;
}

void maxPooling(unsigned char M[][maxCols], int Rows, int Cols, unsigned char MMax[][maxCols], int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
	MRows = Rows - wr + 1, MCols = Cols - wc + 1;
	for (int wsr = 0; wsr <= Rows - wr; wsr += strideR)
	{
		for (int wsc = 0; wsc + wc <= Cols; wsc += strideC)
		{
			MMax[wsr][wsc] = winMax(M, wsr, wsc, wc, wr);
		}
	}
}

int main() {

    unsigned char Arr[maxRows][maxCols], ArrMax[maxRows][maxCols], ArrAvg[maxRows][maxCols];
    int rows, cols;
    int wr = 20, wc = 20, strideR = 2, strideC = 2;
    int mRows, mCols;
    int aRows, aCols;
    
    loadImage("__ed_input.png", Arr, rows, cols);
    cout << "Image Dimensions: " << rows << " x " << cols << "\n";

    maxPooling(Arr, rows, cols, ArrMax, mRows, mCols, wr, wc, strideR, strideC);
    cout << "Image Dimensions After Max Pooling: " << mRows << " x " << mCols << "\n";

    avgPooling(Arr, rows, cols, ArrAvg, aRows, aCols, wr, wc, strideR, strideC);
    cout << "Image Dimensions After Avg Pooling: " << aRows << " x " << aCols << "\n";

    saveImage("__ed_input_org.png", Arr, rows, cols);
    saveImage("__ed_input_max.png", ArrMax, mRows, mCols);
    saveImage("__ed_input_avg.png", ArrAvg, aRows, aCols);

    return 0;
}
Processing images

We hope that you had fun playing with the code above. What effects did you observe of maximum and average pooling?

The maximum pooling caused the image to be sharpened, whereas the average pooling blurred the image. Moreover, tweaking wr, wc, strideR, and strideC varied the intensity of these changes.