Sliding Window Applications

Learn about the concept of search via sliding windows and its applications.

We'll cover the following

Sliding window applications
- Maximum pooling
- Example 2: Average pooling
Image processing

Sliding window applications

Traversing an array in subwindows to calculate either the maximum, minimum, or average of each window has many applications, including maximum pooling, minimum pooling, and average pooling.

Maximum pooling

Maximum pooling is one application of the sliding window. With maximum pooling, we can find the maximum element within a specific region (as shown in the animation below). This only gives the maximum values and, this way, the number of output elements is reduced too.

So given an image, if we wanted to extract only the most important features, we can use maximum pooling to extract only the important features and downscale (reduce in size) the image.

Here, using maximum pooling, we create a new array of a smaller size that contains just the maximum values of all the subwindows.

Look at the animation below to see what we mean.

Press + to interact

main.cpp

SW.txt

int winMax(int M[][MaxCols], int wsr, int wsc, int wr, int wc)
{
	int Max = M[wsr][wsc];
	for (int r = 0; r < wr; r++)
	{
		for (int c = 0; c < wc; c++)
			if (Max < M[wsr + r][wsc + c])
				Max = M[wsr + r][wsc + c];
	}
	return Max;
}
void maxPooling(int M[][MaxCols], int Rows, int Cols, int MMax[][MaxCols], int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
	MRows = Rows - wr + 1, MCols = Cols - wc + 1;
	for (int wsr = 0; wsr <= Rows - wr; wsr += strideR)
	{
		for (int wsc = 0; wsc + wc <= Cols; wsc += strideC)
		{
			MMax[wsr][wsc] = winMax(M, wsr, wsc, wc, wr);
		}
	}
}
int main()
{
	int M[MaxRows][MaxCols], Rows, Cols, wr, wc, strideR, strideC;
	Load("SW.txt", M, Rows, Cols, wr, wc, strideR, strideC);	// {'S','W', '\0'}
	displayGrid("\nThe Matrix is: ", M, Rows, Cols);
	int MMax[MaxRows][MaxCols], mRows, mCols;
	maxPooling(M, Rows, Cols, MMax, mRows, mCols, wr, wc, strideR, strideC);
	displayGrid("\nMax Pooling Matrix: ", MMax, mRows, mCols);
	return 0;
}

The code above is an implementation of max pooling, which is a technique commonly used in deep learning and image processing. The code defines two functions, winMax() and maxPooling().

The winMax() function takes a 2D array, M, representing a window of size wr x wc, located at the wsr row and the wsc column, and returns the maximum value within that window. This function is used by the maxPooling() function to calculate the maximum value within each window of the input matrix.

The maxPooling() function takes a 2D array, M, representing the input matrix, its size Rows x Cols, a window size wr x wc, and stride values—strideR and strideC. It applies the max pooling technique to the input matrix, producing a new 2D array, MMax of size mRows x mCols. The maxPooling() function achieves this by iterating over each window in the input matrix, using the winMax() function to calculate the maximum value within that window, and storing the result in the corresponding position in the output matrix.

The main function loads a 2D array, M, from the SW.txt file, and displays it using the displayGrid() function. It then calls the maxPooling() function to apply max pooling to the input matrix and displays the resulting output matrix using the displayGrid() function.

Press + to interact

main.cpp

SW.txt

float winAvg(int M[][MaxCols], int wsr, int wsc, int wr, int wc)
{
	float Sum = 0;	// Summing anchored at[wsr, wsc]
	for (int r = 0; r < wr; r++)	// A window with wr-rows
	{
		for (int c = 0; c < wc; c++)	// and wc column
			Sum += M[wsr + r][wsc + c];	// anchored at[wsr, wsc]
	}
	return Sum / (wr *wc);
}
void avgPooling(int M[][MaxCols], int Rows, int Cols, float MAvg[][MaxCols],
	int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
	MRows = Rows - wr + 1, MCols = Cols - wc + 1;	// the size of the resultant matrix
	for (int wsr = 0; wsr <= Rows - wr; wsr += strideR)	// 0 to Rows-wr indices
	{
		for (int wsc = 0; wsc + wc <= Cols; wsc += strideC)	// 0 to Cols-wc indices 
		{ 
			// which can be rewritten as wsc+wc<=Cols
			MAvg[wsr][wsc] = winAvg(M, wsr, wsc, wc, wr);
		}
	}
}
int main()
{
	int M[MaxRows][MaxCols], Rows, Cols, wr, wc, strideR, strideC;
	Load("SW.txt", M, Rows, Cols, wr, wc, strideR, strideC);	// {'S','W', '\0'}
	displayGrid("\n\nThe Matrix", M, Rows, Cols);
	float avgM[MaxRows][MaxCols];
	int avgRows, avgCols;
	avgPooling(M, Rows, Cols, avgM, avgRows, avgCols, wr, wc, strideR, strideC);
	displayGrid("\nAvg Pooling Matrix", avgM, avgRows, avgCols);
	return 0;
}

Image processing

You must be wondering what’s the real-life purpose of the pooling techniques you learned above. Well then, let’s get right into it.

We know that an image is an array of pixels with values ranging from 0 to 255. What would happen if we applied maximum and average pooling to an array representing image pixels?

We set up a nice playground for you below to experiment with this. It contains the algorithms for maximum and minimum pooling (same as those above), with just one difference; rather than an int array, our algorithms will process unsigned char arrays. Why is this so?

Earlier, we mentioned that an image pixel ranges between 0 and 255, which can adequately be represented by a single byte. Since a char is of one byte, we use that instead of an int of four bytes (to conserve memory).

Feel free to experiment with the code below. Just make sure not to change the loadImage() and saveImage() functions.

Note: The widget below only supports images of dimensions less than or equal to 1024 x 512. You can use this sample image to test the code below.

#include <iostream>
using namespace std;

int winAvg(unsigned char M[][maxCols], int wsr, int wsc, int wr, int wc) {
    float Sum = 0;	// Summing anchored at[wsr, wsc]
	for (int r = 0; r < wr; r++)	// A window with wr-rows
	{
		for (int c = 0; c < wc; c++)	// and wc column
			Sum += M[wsr + r][wsc + c];	// anchored at[wsr, wsc]
	}

	return Sum / (wr *wc);
}

void avgPooling(unsigned char M[][maxCols], int Rows, int Cols, unsigned char MMax[][maxCols], int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
	MRows = Rows - wr + 1, MCols = Cols - wc + 1;
	for (int wsr = 0; wsr <= Rows - wr; wsr += strideR)
	{
		for (int wsc = 0; wsc + wc <= Cols; wsc += strideC)
		{
			MMax[wsr][wsc] = winAvg(M, wsr, wsc, wc, wr);
		}
	}
}

int winMax(unsigned char M[][maxCols], int wsr, int wsc, int wr, int wc)
{
	unsigned char Max = M[wsr][wsc];
	for (int r = 0; r < wr; r++)
	{
		for (int c = 0; c < wc; c++)
			if (Max < M[wsr + r][wsc + c])
				Max = M[wsr + r][wsc + c];
	}

	return Max;
}

void maxPooling(unsigned char M[][maxCols], int Rows, int Cols, unsigned char MMax[][maxCols], int &MRows, int &MCols, int wr, int wc, int strideR = 1, int strideC = 1)
{
	MRows = Rows - wr + 1, MCols = Cols - wc + 1;
	for (int wsr = 0; wsr <= Rows - wr; wsr += strideR)
	{
		for (int wsc = 0; wsc + wc <= Cols; wsc += strideC)
		{
			MMax[wsr][wsc] = winMax(M, wsr, wsc, wc, wr);
		}
	}
}

int main() {

    unsigned char Arr[maxRows][maxCols], ArrMax[maxRows][maxCols], ArrAvg[maxRows][maxCols];
    int rows, cols;
    int wr = 20, wc = 20, strideR = 2, strideC = 2;
    int mRows, mCols;
    int aRows, aCols;
    
    loadImage("__ed_input.png", Arr, rows, cols);
    cout << "Image Dimensions: " << rows << " x " << cols << "\n";

    maxPooling(Arr, rows, cols, ArrMax, mRows, mCols, wr, wc, strideR, strideC);
    cout << "Image Dimensions After Max Pooling: " << mRows << " x " << mCols << "\n";

    avgPooling(Arr, rows, cols, ArrAvg, aRows, aCols, wr, wc, strideR, strideC);
    cout << "Image Dimensions After Avg Pooling: " << aRows << " x " << aCols << "\n";

    saveImage("__ed_input_org.png", Arr, rows, cols);
    saveImage("__ed_input_max.png", ArrMax, mRows, mCols);
    saveImage("__ed_input_avg.png", ArrAvg, aRows, aCols);

    return 0;
}

Processing images

Getting Started

Computing in the Real World

Introduction to C++ (Table Printing)

GNU Debugger Tutorial

Memory, Operations and Control Structure

Divide, Conquer, and Rule

Iteration, Reusability and Extensibility

Introduction to Arrays

Problem Solving With Arrays

Multi-dimensional Arrays

Sliding Window

Conclusion

Sliding Window Applications

Sliding window applications

Maximum pooling

Example 2: Average pooling

Image processing