Convolution and ReLU

Convolution and ReLU are two core building blocks of convolutional neural networks, a family of deep learning models. Let's first take a bird's-eye view of neural networks and deep learning and then dive deeper into the two concepts.

Neural networks

Neural networks are a class of machine learning models inspired by the human brain. They consist of layers of interconnected nodes, loosely analogous to neurons in the brain. Each node takes an input, performs a computation, and returns an output. The output of one layer is the input of the next, and therefore, layer by layer, we get closer to the final result.
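As a rough illustration of this layer-by-layer flow, the sketch below passes an input through two tiny layers in plain Python. The weights, biases, and the dense_layer helper are made-up values and names chosen for illustration, and activation functions are left out for brevity.

```python
def dense_layer(inputs, weights, biases):
    """Each node computes a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(node_weights, inputs)) + b
        for node_weights, b in zip(weights, biases)
    ]

# Hypothetical weights and biases, chosen only for illustration.
hidden_weights = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # 3 nodes, 2 inputs each
hidden_biases = [0.0, 1.0, -2.0]
output_weights = [[1.0, 1.0, 1.0]]                      # 1 node, 3 inputs
output_biases = [0.5]

inputs = [1.0, 2.0]
hidden = dense_layer(inputs, hidden_weights, hidden_biases)  # output of layer 1
output = dense_layer(hidden, output_weights, output_biases)  # layer 2 consumes `hidden`

print(hidden)  # [5.0, -1.0, 3.0]
print(output)  # [7.5]
```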

The brain and a neural network

Deep learning

We get deep learning when neural networks with many layers are trained to progressively extract higher-level features from one layer to the next.

Convolution and ReLU are two key operations of feature extraction, which, as the name suggests, separates meaningful features from complex input data.

A neural network with different layers

Convolution

A convolution is a mathematical operation that applies a filter (also called a kernel) to an image or other input. The filter slides over the input, and at each location, it multiplies its values element-wise with the patch of input underneath and sums the products into a single number. This process is repeated until the entire input is covered, resulting in a feature map.

Each value in the feature map shows how strongly a particular pattern is present in that specific region. Simply put, we map the actual input to different features.
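The sliding-and-summing procedure can be sketched in a few lines of plain Python. This is a minimal version that assumes no padding and a stride of 1; like the convolutional layers in deep learning libraries, it does not flip the kernel. The convolve2d helper and the sample values are hypothetical.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Element-wise multiply the kernel with the patch under it, then sum.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map

image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```

A 4x4 input and a 2x2 filter give a 3x3 feature map, because the filter fits in three positions along each axis.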

Different filters extract different features from an input. Edges, textures, shapes, and objects are some of the features a convolutional layer can help the network detect.
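To see how two filters respond differently to the same input, the sketch below applies a vertical-edge filter and a horizontal-edge filter to a small synthetic image. The filter values and the feature_map helper are hand-written for illustration; in a trained network, the filter values are learned rather than chosen by hand.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def feature_map(image, kernel):
    """Cross-correlate `image` with `kernel` (no padding, stride 1)."""
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw)
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

# A 5x5 image: dark (0) on the left, bright (1) on the right -> one vertical edge.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Simple hand-written edge filters (illustrative values).
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=float)
horizontal_edge = vertical_edge.T

vertical_response = feature_map(image, vertical_edge)
horizontal_response = feature_map(image, horizontal_edge)

print(vertical_response)    # nonzero where the window covers the dark-to-bright edge
print(horizontal_response)  # all zeros: the image has no horizontal edge
```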

Image processing in convolutional neural networks

Convolutional layer demonstration

import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go

We begin by importing the modules our code needs.

  1. tensorflow for deep learning

  2. numpy for numerical operations

  3. plotly for data visualization

model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')
imgPath = 'images.jpeg'

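Next, we load the VGG16 model with weights pre-trained on the ImageNet dataset. This model is widely used for image classification tasks; passing include_top=False drops its fully connected classification layers and keeps only the convolutional part, which is all we need for feature maps.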

We also define the path of the input image we want to process. This image will be used to visualize feature maps.

def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)

To preprocess our input image, we define preprocessImage(). It loads the image at 224x224 pixels using load_img, converts it to an array with img_to_array, and uses expand_dims to add a leading batch dimension so that its shape matches what the model expects.

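After these adjustments, the image is passed through the VGG16-specific preprocessing function, which converts the channels from RGB to BGR and zero-centers each channel with respect to the ImageNet dataset.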
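The effect of the expand_dims step on the array's shape can be checked without loading a real image. The sketch below uses a random array as a stand-in for the loaded picture; the variable names are illustrative.

```python
import numpy as np

# Stand-in for a loaded 224x224 RGB image (random values, for illustration only).
img_array = np.random.rand(224, 224, 3).astype("float32")
print(img_array.shape)  # (224, 224, 3)

# Add a leading batch axis so the array looks like a batch containing one image.
batch = np.expand_dims(img_array, axis=0)
print(batch.shape)      # (1, 224, 224, 3)
```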

def visualizeFeatureMaps(intermediateOutput, layerName):
    numChannels = intermediateOutput.shape[-1]
    gridSize = int(np.ceil(np.sqrt(numChannels)))
    fig = sp.make_subplots(rows=gridSize, cols=gridSize)
    for i in range(numChannels):
        row = i // gridSize + 1
        col = i % gridSize + 1
        fig.add_trace(
            go.Heatmap(
                z=intermediateOutput[0, :, :, i],
                colorscale='gray',
                showscale=False
            ),
            row=row,
            col=col
        )
    fig.update_layout(
        title=layerName,
        title_font_size=16,
        width=800,
        height=800
    )
    fig.write_html("output.html")

To visualize the feature maps, we define visualizeFeatureMaps(). This function takes the intermediate output of a layer and the layer name as parameters (intermediateOutput and layerName).

It then determines the number of channels in the output (numChannels) and calculates the side length of a square subplot grid large enough to hold them all (gridSize).
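The grid arithmetic can be tried on its own. The sketch below assumes 64 channels, which is the number of feature maps block1_conv1 produces in VGG16; the grid_position helper is a hypothetical name for the row and column computation used in the loop.

```python
import math

num_channels = 64  # block1_conv1 in VGG16 produces 64 feature maps
grid_size = math.ceil(math.sqrt(num_channels))

def grid_position(i, grid_size):
    """Map a zero-based channel index to a one-based (row, col) subplot position."""
    return i // grid_size + 1, i % grid_size + 1

print(grid_size)                     # 8
print(grid_position(0, grid_size))   # (1, 1)
print(grid_position(7, grid_size))   # (1, 8)
print(grid_position(8, grid_size))   # (2, 1)
print(grid_position(63, grid_size))  # (8, 8)
```

The +1 is needed because Plotly numbers subplot rows and columns starting from 1.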

It then iterates over the channels and adds a heatmap trace for each one to the subplot figure using Plotly, a data visualization library for Python. For convenience, we save the figure to an HTML file called output.html, which can be opened in a browser.

Note: A heat map is a graphical representation of data where values are represented as colors on a two-dimensional grid.
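To make the note concrete, one simple way to turn values into gray levels is min-max scaling to the range 0 to 255. This is only a sketch of the idea: Plotly applies its own color mapping internally, and the to_gray_levels helper is hypothetical.

```python
def to_gray_levels(grid):
    """Min-max scale a 2D grid of numbers to integer gray levels in 0..255."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant grid
    return [[round(255 * (v - lo) / span) for v in row] for row in grid]

grid = [[0.0, 4.0],
        [10.0, 2.0]]
gray = to_gray_levels(grid)
print(gray)  # [[0, 102], [255, 51]]
```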

imgArray = preprocessImage(imgPath)

Moving on to our main code, we preprocess the input image using the preprocessImage() function and store it in imgArray.

layerName = 'block1_conv1'

We also have to specify the name of the layer whose feature maps we want to visualize. In this example, we use block1_conv1, which is the first convolutional layer of the VGG16 model.

intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)

This code retrieves the output tensor of the layer using get_layer() and creates a new model, intermediateLayerModel, that takes the same input as the VGG16 model and outputs the chosen layer's activations.

We use the intermediateLayerModel to predict the feature maps for the input image stored in imgArray. intermediateOutput now contains our desired result!

visualizeFeatureMaps(intermediateOutput, layerName)

Yay, our goal is now achieved, and we can simply visualize and save our feature maps as an HTML file by calling visualizeFeatureMaps.

Convolution complete code

import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go

model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')

imgPath = 'images.jpeg'

def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)

def visualizeFeatureMaps(intermediateOutput, layerName):
    numChannels = intermediateOutput.shape[-1]
    gridSize = int(np.ceil(np.sqrt(numChannels)))

    fig = sp.make_subplots(rows=gridSize, cols=gridSize)

    for i in range(numChannels):
        row = i // gridSize + 1
        col = i % gridSize + 1
        fig.add_trace(
            go.Heatmap(
                z=intermediateOutput[0, :, :, i],
                colorscale='gray',
                showscale=False
            ),
            row=row,
            col=col
        )

    fig.update_layout(
        title=layerName,
        title_font_size=16,
        width=800,
        height=800
    )

    fig.write_html("output.html")

imgArray = preprocessImage(imgPath)

layerName = 'block1_conv1'
intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)

visualizeFeatureMaps(intermediateOutput, layerName)

Visual depiction of feature maps

We see that a grid of different feature maps is produced, each of which focuses on a different feature in the image. Visualizing feature maps can give insights into how the image is understood by the model and the different levels of abstraction it captures as the information flows through these layers.

Plotly's interactivity

An advantage of using Plotly is that the output is interactive: we can zoom into individual feature maps and hover over them to read the coordinates and values.

The catch

These raw feature maps can contain strong patterns alongside weak or noisy ones, so another function is needed to filter them.

ReLU

ReLU is one of the most widely used activation functions in neural networks, including convolutional neural networks (CNNs). ReLU stands for Rectified Linear Unit and is a function that introduces non-linearity to our feature maps.

Mathematics and ReLU

ReLU returns zero for any negative input and the same value for any positive input, and it's as simple as that.

It acts as the identity for positive values, i.e., it passes them through unchanged, and maps every negative value to 0. Each of the two pieces is linear, but the kink at zero makes the function as a whole non-linear.

ReLU is mathematically represented as $f(x) = \max(0, x)$.
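The definition translates directly into code. The sketch below is a plain-Python version for single numbers; the relu name and the sample values are chosen for illustration.

```python
def relu(x):
    """Rectified Linear Unit: f(x) = max(0, x)."""
    return max(0.0, x)

values = [-3.0, -0.5, 0.0, 0.5, 3.0]
activated = [relu(v) for v in values]
print(activated)  # [0.0, 0.0, 0.0, 0.5, 3.0]
```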

Significance of ReLU

ReLU keeps only the positive values of a feature map, i.e., the locations where the filter's pattern is present, and sets the negative responses to zero.

By applying ReLU, we create a sparser, more focused representation of the image's features. The function is also cheap to compute, which helps the neural network train efficiently.
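To illustrate the effect on a whole feature map, the sketch below applies ReLU element-wise to a small made-up array and counts how many entries end up at zero. The values are invented for illustration and do not come from a real model.

```python
import numpy as np

# A made-up 3x3 feature map with positive and negative responses.
feature_map = np.array([[ 4.0, -1.5,  0.2],
                        [-3.0,  2.5, -0.1],
                        [ 0.0, -2.0,  6.0]])

activated = np.maximum(feature_map, 0.0)  # element-wise ReLU

print(activated)
# Number of entries that are zero after ReLU (a rough measure of sparsity).
num_zeros = int((activated == 0.0).sum())
print(num_zeros, "of", activated.size)  # 5 of 9
```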

ReLU addition to our code

We can add the line below to our code to apply the activation function. Inside the loop, tf.nn.relu is applied to the i-th feature map before it is plotted.

featureMap = tf.nn.relu(intermediateOutput[0, :, :, i])

import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go

model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')

imgPath = 'noise_img.jpeg'

def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)

def visualizeFeatureMaps(intermediateOutput, layerName):
    numChannels = intermediateOutput.shape[-1]
    gridSize = int(np.ceil(np.sqrt(numChannels)))

    fig = sp.make_subplots(rows=gridSize, cols=gridSize)

    for i in range(numChannels):
        row = i // gridSize + 1
        col = i % gridSize + 1
    
        featureMap = tf.nn.relu(intermediateOutput[0, :, :, i])

        fig.add_trace(
            go.Heatmap(
                z=featureMap.numpy(),
                colorscale='gray',
                showscale=False
            ),
            row=row,
            col=col
        )

    fig.update_layout(
        title=layerName,
        title_font_size=16,
        width=800,
        height=800
    )

    fig.write_html("output.html")

imgArray = preprocessImage(imgPath)

layerName = 'block1_conv1'
intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)

visualizeFeatureMaps(intermediateOutput, layerName)

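The ReLU activation ensures that only the positive activations are retained, and any negative activations are set to zero. Note that the convolutional layers in Keras's VGG16 are defined with a built-in ReLU activation, so the output of block1_conv1 is already non-negative and this extra call leaves the values unchanged; it is shown here to illustrate where the activation is applied.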

Feature maps of a noisy image

This is a picture of a desk on top of which a laptop and monitor are placed, with some notes and badges in the background.

Single feature map depiction

In the following example, after experimenting with the feature maps, we show only the feature map at index 5 (the sixth one, since indices start at zero).

Congratulations, you now know how to show a single feature map too!

import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go

model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')

imgPath = 'noise_img.jpeg'

def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)

def visualizeBestFeatureMap(intermediateOutput, layerName, bestIndex):
    bestFeatureMap = tf.nn.relu(intermediateOutput[0, :, :, bestIndex])

    fig = go.Figure(data=go.Heatmap(z=bestFeatureMap.numpy(), colorscale='gray'))
    fig.update_layout(title=layerName + ' - Feature Map ' + str(bestIndex),
                      title_font_size = 13,
                      width = 400,
                      height = 400)
    fig.write_html("output.html")

imgArray = preprocessImage(imgPath)

layerName = 'block1_conv1'
intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)

bestIndex = 5
visualizeBestFeatureMap(intermediateOutput, layerName, bestIndex)
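The index above was picked by hand. As one possible way to narrow down candidates, the sketch below ranks channels by their mean activation. This heuristic is an assumption made for illustration, not the method used in this answer, and the rank_channels helper and the random stand-in data are hypothetical.

```python
import numpy as np

def rank_channels(activations):
    """Return channel indices sorted from highest to lowest mean activation.

    `activations` is expected to have shape (1, height, width, channels).
    """
    mean_per_channel = activations[0].mean(axis=(0, 1))
    return np.argsort(mean_per_channel)[::-1]

# Random stand-in for a layer's output; channel 2 is shifted up so it ranks first.
rng = np.random.default_rng(0)
fake_output = rng.random((1, 8, 8, 4))
fake_output[..., 2] += 5.0

ranking = rank_channels(fake_output)
print(ranking[0])  # 2
```

With the real model, passing intermediateOutput to such a function would give a list of indices to try as bestIndex.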
 

Single feature map output

In a nutshell

The two takeaways from this answer are summarized below.

Convolution and ReLU

Convolution: Sliding a filter across the image to create feature maps.

ReLU: Applied to the feature maps to keep only the positive values, aiding the neural network in recognizing important patterns in the data.

Copyright ©2024 Educative, Inc. All rights reserved