Convolution layer

A neural network wouldn’t be called convolutional if it didn’t have at least one convolution layer. A two-dimensional convolution layer slides a small tensor (the convolution kernel) over the two spatial dimensions of the input tensor; the kernel has the same number of channels as the input tensor. At each pixel location, the layer computes the scalar product between the kernel and the neighborhood centered on that pixel, then adds a bias. The result is a tensor with a high activation wherever the input tensor locally resembles the convolution kernel.
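The sliding scalar product described above can be sketched in plain Python. This is only an illustration, not how PyTorch implements it: the function name `conv2d_same`, the nested-list layout `[channel][row][column]`, the zero padding at the borders, and the odd kernel size are all assumptions made for the example. Like PyTorch's convolution layer, the sketch does not flip the kernel.

```python
def conv2d_same(image, kernel, bias=0.0):
    """Slide `kernel` over `image` with zero padding at the borders.

    Both arguments are nested lists indexed as [channel][row][column].
    Returns a single-channel activation map indexed as [row][column].
    Assumes an odd kernel height and width (hypothetical helper).
    """
    n_channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    k_h, k_w = len(kernel[0]), len(kernel[0][0])
    pad_h, pad_w = k_h // 2, k_w // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = bias
            # Scalar product between the kernel and the neighborhood of (y, x)
            for c in range(n_channels):
                for dy in range(k_h):
                    for dx in range(k_w):
                        yy, xx = y + dy - pad_h, x + dx - pad_w
                        # Pixels outside the image count as zeros
                        if 0 <= yy < height and 0 <= xx < width:
                            total += image[c][yy][xx] * kernel[c][dy][dx]
            output[y][x] = total
    return output


# One-channel image with a dark-to-bright vertical edge
image = [[[0, 0, 1, 1, 1],
          [0, 0, 1, 1, 1],
          [0, 0, 1, 1, 1]]]
# One-channel kernel that looks like a dark-to-bright vertical edge
kernel = [[[-1, 0, 1],
           [-1, 0, 1],
           [-1, 0, 1]]]
response = conv2d_same(image, kernel)
for row in response:
    print(row)
```

In the middle row, the activation peaks at the two columns next to the edge, where the neighborhood looks like the kernel. It is zero in the flat region and negative at the right border, where the zero padding creates a bright-to-dark transition.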

To illustrate the functioning of a convolution layer, let’s design one by hand.

Consider the following image:

import torch
import cv2
weight = torch.zeros(1, 3, 5, 5)  # (N_kernels, N_channels, H, W); channels follow OpenCV's BGR order
bias = torch.zeros(1)
weight[0, 0, :, 0: 2] = 90   # The green band on the left
weight[0, 1, :, 0: 2] = 120
weight[0, 2, :, 0: 2] = 35
weight[0, 0, :, 2:] = 75  # The yellow band on the right
weight[0, 1, :, 2:] = 160
weight[0, 2, :, 2:] = 160
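# Save an enlarged picture of the kernel; OpenCV expects (H, W, C) and 8-bit values
kernel_img = torch.moveaxis(weight.squeeze(0), 0, 2).to(torch.uint8).numpy()
cv2.imwrite("./output/0_kernel_weight.png", cv2.resize(kernel_img, dsize=(300, 300), interpolation=cv2.INTER_NEAREST))
# Standardize the weight to zero mean and unit standard deviation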
weight = (weight - torch.mean(weight))/torch.std(weight)
bias[0] = 0.

# ... continued
# Create a 2D convolution layer with the designed weight and bias
# The kernel size must match the 5x5 weight so that padding='same' preserves the image size
conv = torch.nn.Conv2d(3, 1, kernel_size=(5, 5), padding='same')
# The tensors must be wrapped in torch.nn.Parameter objects
conv.weight = torch.nn.Parameter(weight)
conv.bias = torch.nn.Parameter(bias)
# Load an image
original_img = cv2.imread("./images/electronics/rpi_back.jpg")
cv2.imwrite("./output/1_original.png", original_img)
# Convert the image to a tensor
input_tsr = torch.from_numpy(original_img).float()/255.0  # (H, W, C)
input_tsr = torch.moveaxis(input_tsr, 2, 0)  # (C, H, W)
# Compute the convolution on the input tensor
convolution_tsr = conv(input_tsr)
# Save the convolution image
convolution_img = convolution_tsr.squeeze(0).detach().numpy()  # (H, W)
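# Map the activations to gray levels around 127 and clip to the valid 8-bit range
cv2.imwrite("./output/2_convolution.png", (127 + 10 * convolution_img).clip(0, 255).astype("uint8"))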
