CNN Building Blocks
Get to know the most commonly used CNN building blocks.
Convolution layer
A neural network wouldn’t be called convolutional if it didn’t have at least one convolution layer. A two-dimensional convolution layer slides a small kernel (for example, 5×5 pixels) over the two spatial dimensions of the input tensor; the kernel has the same number of channels as the input. For each pixel of the input, the layer computes the scalar product between the kernel and the neighborhood centered on that pixel, then adds a bias. The result is a tensor with a high activation wherever the input locally resembles the kernel.
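The scalar-product idea can be sketched in plain Python on a toy example. This is a minimal illustration, not how a deep learning library implements it: it assumes a single channel, no padding, and no bias, and the names `correlate2d`, `image`, `kernel`, and `response` are hypothetical. As in torch.nn.Conv2d, the kernel is not flipped before the products are taken.

```python
def correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the map of scalar products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Scalar product between the kernel and the patch whose top-left corner is (r, c)
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy single-channel image: dark on the left, bright on the right
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Zero-mean kernel shaped like a dark-to-bright vertical boundary
kernel = [
    [-1, 1],
    [-1, 1],
]

response = correlate2d(image, kernel)
print(response)  # [[0, 0, 2, 0, 0], [0, 0, 2, 0, 0]]
```

The response is nonzero only in the column where the patch straddles the boundary, which is the "high activation where the input looks like the kernel" behavior described above.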
To illustrate the functioning of a convolution layer, let’s design one by hand.
Consider the following image:
Let’s assume we need to highlight the areas where a green surface touches a yellow surface through a vertical boundary. To do that, we’ll convolve the image with the following kernel:
We can manually design the convolution kernel by accessing the weight and bias fields of a torch.nn.Conv2d object:
import torch
import cv2

weight = torch.zeros(1, 3, 5, 5)  # (N_kernels, N_channels, H, W)
bias = torch.zeros(1)
weight[0, 0, :, 0:2] = 90  # The green band on the left
weight[0, 1, :, 0:2] = 120
weight[0, 2, :, 0:2] = 35
weight[0, 0, :, 2:] = 75  # The yellow band on the right
weight[0, 1, :, 2:] = 160
weight[0, 2, :, 2:] = 160
cv2.imwrite("./output/0_kernel_weight.png", cv2.resize(torch.moveaxis(weight.squeeze(0), 0, 2).int().numpy(), dsize=(300, 300), interpolation=cv2.INTER_NEAREST))

# Standardize the weight
weight = (weight - torch.mean(weight))/torch.std(weight)
bias[0] = 0.
In lines 6–11, we manually set the BGR values of the weight tensor. In line 16, the bias value is set to 0.
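One way the hand-designed values might be copied into a torch.nn.Conv2d object is sketched below. The layer name `conv` is hypothetical, and `padding=2` is an assumption made here so that a 5×5 kernel keeps the input's spatial size. The kernel is rebuilt inside the block only to keep the sketch self-contained.

```python
import torch

# Rebuild the hand-designed kernel (BGR channel order), then standardize it
weight = torch.zeros(1, 3, 5, 5)  # (N_kernels, N_channels, H, W)
weight[0, :, :, 0:2] = torch.tensor([90.0, 120.0, 35.0]).view(3, 1, 1)  # green band, left
weight[0, :, :, 2:] = torch.tensor([75.0, 160.0, 160.0]).view(3, 1, 1)  # yellow band, right
weight = (weight - torch.mean(weight)) / torch.std(weight)
bias = torch.zeros(1)

# Assumption: 3 input channels (B, G, R), 1 output channel, padding=2
conv = torch.nn.Conv2d(in_channels=3, out_channels=1, kernel_size=5, padding=2)

# Overwrite the randomly initialized parameters without tracking gradients
with torch.no_grad():
    conv.weight.copy_(weight)
    conv.bias.copy_(bias)

print(conv.weight.shape)  # torch.Size([1, 3, 5, 5])
```

Copying inside `torch.no_grad()` keeps `conv.weight` and `conv.bias` as registered parameters of the layer, which would not be the case if the attributes were replaced with plain tensors.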
We can now perform a convolution ...