Convolution and ReLU are two integral building blocks of deep learning models, especially convolutional neural networks (CNNs). Let's first establish a bird's-eye view of neural networks and deep learning and then dive deeper into the two concepts.
Neural networks are a subdomain of machine learning inspired by the human brain and consist of layers containing interconnected nodes, just like neurons in the brain. Each node takes an input, performs a computation, and returns an output. The output of one layer is the input of another, and therefore, layer by layer, we get closer to the actual result.
We get deep learning when neural networks are built and trained with many layers, with the aim of progressively extracting higher-level features from one layer to the next.
Convolution and ReLU are two key operations of feature extraction, which, as the name clarifies, separate meaningful features from the specified complex data.
A convolution is a mathematical operation that applies a filter (also called a kernel) to the image or input. The filter slides over the input, and at each location it multiplies its weights element-wise with the values it covers and sums the products into a single number. This process is repeated until the entire input is covered, resulting in a feature map.
Each value in the feature map basically shows how strongly or weakly a particular pattern might be present in that specific region. Simply put, we map the actual input to different features.
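The sliding-window computation described above can be sketched in plain NumPy. This is a minimal illustration, not how TensorFlow implements it: the helper name conv2d_valid, the 4x4 input, and the 2x2 kernel are made up for the example, and it assumes no padding and a stride of 1.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Element-wise multiply the covered patch with the kernel, then sum.
            patch = image[r:r + kh, c:c + kw]
            feature_map[r, c] = np.sum(patch * kernel)
    return feature_map

# Hypothetical toy input and filter, chosen only for illustration.
image = np.array([
    [1, 2, 0, 1],
    [0, 1, 3, 2],
    [2, 0, 1, 0],
    [1, 3, 2, 1],
], dtype=float)
kernel = np.array([
    [1, 0],
    [0, -1],
], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)
```

A 4x4 input with a 2x2 filter yields a 3x3 feature map, because the filter fits in three positions along each axis. Like most deep learning libraries, this sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.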
Different filters extract different features in an input. Edges, textures, shapes, and objects are some of the features a convolutional layer can help the network detect.
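To make the idea of "different filters, different features" concrete, here is a small sketch that applies two hand-written filters to the same toy image. The image and both kernels are assumptions chosen for illustration (a trained network learns its filters instead), and sliding_window_view requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def apply_filter(image, kernel):
    """Return the feature map of `kernel` over `image` (no padding, stride 1)."""
    windows = sliding_window_view(image, kernel.shape)  # shape: (out_h, out_w, kh, kw)
    return (windows * kernel).sum(axis=(2, 3))

# Toy image: dark left half, bright right half, i.e. a single vertical edge.
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)

# Hypothetical hand-made filters.
vertical_edge = np.array([[-1, 1], [-1, 1]], dtype=float)
horizontal_edge = np.array([[-1, -1], [1, 1]], dtype=float)

vertical_response = apply_filter(image, vertical_edge)
horizontal_response = apply_filter(image, horizontal_edge)

print(vertical_response)    # non-zero only in the column where the edge sits
print(horizontal_response)  # all zeros: the image has no horizontal edge
```

The same input produces a strong response for one filter and none for the other, which is why a convolutional layer uses many filters side by side.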
import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go
We begin by importing the necessary modules for our code: tensorflow for deep learning, numpy for numerical operations, and plotly for data visualization.
model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')
imgPath = 'images.jpeg'
Next, we load the VGG16 model with pre-trained weights from the ImageNet dataset. This model is widely used for image classification tasks.
We also define the path of the input image we want to process. This image will be used to visualize feature maps.
def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)
To preprocess our input image, we define preprocessImage(). It loads the image at 224x224 pixels using load_img, converts it to an array with img_to_array, and adds a leading batch dimension with expand_dims so that the array matches the input shape the model expects.
After these adjustments, the image is preprocessed using the VGG16-specific preprocessing function.
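The shape change and the VGG16-specific step can be mimicked in NumPy alone. The sketch below uses a random array as a stand-in for the loaded image and assumes that the VGG16 preprocessing amounts to reordering the channels from RGB to BGR and subtracting the per-channel ImageNet means without any scaling; treat it as an approximation of preprocess_input, not a replacement for it.

```python
import numpy as np

# Stand-in for the img_to_array output: a random 224x224 RGB image in [0, 255].
rng = np.random.default_rng(0)
img_array = rng.uniform(0, 255, size=(224, 224, 3)).astype("float32")

# Add the batch dimension: (height, width, channels) -> (1, height, width, channels).
batch = np.expand_dims(img_array, axis=0)

# Assumed behavior of the VGG16 preprocessing: RGB -> BGR, then subtract the
# per-channel ImageNet means (listed here in BGR order).
imagenet_mean_bgr = np.array([103.939, 116.779, 123.68], dtype="float32")
preprocessed = batch[..., ::-1] - imagenet_mean_bgr

print(img_array.shape)     # (224, 224, 3)
print(batch.shape)         # (1, 224, 224, 3)
print(preprocessed.shape)  # (1, 224, 224, 3)
```

The batch dimension is needed because Keras models process a batch of images at a time, even when that batch contains a single image.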
def visualizeFeatureMaps(intermediateOutput, layerName):
    numChannels = intermediateOutput.shape[-1]
    gridSize = int(np.ceil(np.sqrt(numChannels)))
    fig = sp.make_subplots(rows=gridSize, cols=gridSize)
    for i in range(numChannels):
        row = i // gridSize + 1
        col = i % gridSize + 1
        fig.add_trace(
            go.Heatmap(
                z=intermediateOutput[0, :, :, i],
                colorscale='gray',
                showscale=False
            ),
            row=row,
            col=col
        )
    fig.update_layout(
        title=layerName,
        title_font_size=16,
        width=800,
        height=800
    )
    fig.write_html("output.html")
To visualize the feature maps, we define visualizeFeatureMaps(). This function takes the intermediate output of a layer and the layer name as parameters (intermediateOutput, layerName).
It determines the number of channels in the output (numChannels) and calculates the grid size for the subplots (gridSize).
It then iterates over the channels and adds a heatmap trace for each one to the subplot figure. Finally, the figure is written to output.html so that it can be rendered.
Note: A heat map is a graphical representation of data where values are represented as colors on a two-dimensional grid.
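The grid arithmetic used in visualizeFeatureMaps() can be checked in isolation. The sketch below assumes 64 channels, which is the number of filters in block1_conv1 of VGG16; the helper name grid_position is made up for this example.

```python
import math

num_channels = 64  # assumed: block1_conv1 in VGG16 has 64 filters
grid_size = math.ceil(math.sqrt(num_channels))

def grid_position(i, grid_size):
    """Map channel index i to a 1-based (row, col) cell, as Plotly subplots expect."""
    return i // grid_size + 1, i % grid_size + 1

print(grid_size)                     # 8
print(grid_position(0, grid_size))   # (1, 1)
print(grid_position(9, grid_size))   # (2, 2)
print(grid_position(63, grid_size))  # (8, 8)
```

Taking the ceiling of the square root guarantees that the grid has at least as many cells as there are channels, and the +1 offsets are there because Plotly numbers subplot rows and columns from 1.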
imgArray = preprocessImage(imgPath)
Moving on to our main code, we preprocess the input image using the preprocessImage() function and store the result in imgArray.
layerName = 'block1_conv1'
We also have to specify the name of the layer whose feature maps we want to visualize. In this example, we use block1_conv1, which is the first convolutional layer of the VGG16 model.
intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)
This code retrieves the output tensor of the chosen layer using get_layer() and creates a new model, intermediateLayerModel, which takes the same input as the VGG16 model and outputs the intermediate layer's activations.
We then call predict() on intermediateLayerModel to compute the feature maps for the input image stored in imgArray. intermediateOutput now contains our desired result!
visualizeFeatureMaps(intermediateOutput, layerName)
Our goal is now achieved, and we can visualize and save our feature maps as an HTML file by calling visualizeFeatureMaps().
import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go

model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')
imgPath = 'images.jpeg'

def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)

def visualizeFeatureMaps(intermediateOutput, layerName):
    numChannels = intermediateOutput.shape[-1]
    gridSize = int(np.ceil(np.sqrt(numChannels)))
    fig = sp.make_subplots(rows=gridSize, cols=gridSize)
    for i in range(numChannels):
        row = i // gridSize + 1
        col = i % gridSize + 1
        fig.add_trace(
            go.Heatmap(
                z=intermediateOutput[0, :, :, i],
                colorscale='gray',
                showscale=False
            ),
            row=row,
            col=col
        )
    fig.update_layout(
        title=layerName,
        title_font_size=16,
        width=800,
        height=800
    )
    fig.write_html("output.html")

imgArray = preprocessImage(imgPath)
layerName = 'block1_conv1'
intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)
visualizeFeatureMaps(intermediateOutput, layerName)
We see that a grid of different feature maps is produced, each of which focuses on a different feature in the image. Visualizing feature maps can give insights into how the image is understood by the model and the different levels of abstraction it captures as the information flows through these layers.
A plus point of using Plotly is that it allows us to zoom into the feature maps and trace the coordinates.
Raw feature maps can contain a mix of strong, weak, and noisy responses, so another function is needed to keep only the meaningful ones.
ReLU is a prime example of activation functions in neural networks, including CNNs. ReLU stands for Rectified Linear Unit and is a function that applies non-linearity to our feature maps.
ReLU returns zero for any negative input and the same value for any positive input, and it's as simple as that.
For positive values it acts as the identity, i.e., it passes them through unchanged, whereas it maps every negative value to 0. Each piece is linear on its own, but the kink at zero makes the function non-linear overall.
ReLU is mathematically represented as $\mathrm{ReLU}(x) = \max(0, x)$.
ReLU keeps only the parts of the feature map with a positive response, i.e., the regions where the filter's pattern is present, and zeroes out the rest.
By applying ReLU, we create a more focused representation of the image's features, and it becomes more efficient for the neural network to train itself.
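The effect of ReLU on a feature map is easy to see on a tiny example. The sketch below is a NumPy stand-in for the operation (the article's code uses tf.nn.relu instead), and the 2x2 values are made up for illustration.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative values become 0, positive values pass through."""
    return np.maximum(0, x)

# Hypothetical 2x2 feature map with both positive and negative responses.
feature_map = np.array([
    [-2.0, 0.5],
    [3.0, -0.1],
])

activated = relu(feature_map)
print(activated)
```

The two negative entries are replaced by 0, while 0.5 and 3.0 are left untouched.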
We can add the line below to our code to apply the activation function. Inside the loop, tf.nn.relu from the tensorflow module is applied to the i-th feature map.
featureMap = tf.nn.relu(intermediateOutput[0, :, :, i])
import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go

model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')
imgPath = 'noise_img.jpeg'

def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)

def visualizeFeatureMaps(intermediateOutput, layerName):
    numChannels = intermediateOutput.shape[-1]
    gridSize = int(np.ceil(np.sqrt(numChannels)))
    fig = sp.make_subplots(rows=gridSize, cols=gridSize)
    for i in range(numChannels):
        row = i // gridSize + 1
        col = i % gridSize + 1
        featureMap = tf.nn.relu(intermediateOutput[0, :, :, i])
        fig.add_trace(
            go.Heatmap(
                z=featureMap.numpy(),
                colorscale='gray',
                showscale=False
            ),
            row=row,
            col=col
        )
    fig.update_layout(
        title=layerName,
        title_font_size=16,
        width=800,
        height=800
    )
    fig.write_html("output.html")

imgArray = preprocessImage(imgPath)
layerName = 'block1_conv1'
intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)
visualizeFeatureMaps(intermediateOutput, layerName)
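The ReLU activation ensures that only the positive activations are retained, and any negative activations are set to zero. Note that the convolutional layers of the Keras VGG16 model are built with ReLU as their activation, so the output of block1_conv1 is already non-negative; the explicit tf.nn.relu call here mainly serves to illustrate the operation.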
This is a picture of a desk on top of which a laptop and monitor are placed, with some notes and badges in the background.
In the following example, after experimenting with the feature maps, we show only a single one: the feature map at index 5 (the sixth channel, since indices start at 0).
Congratulations, you now know how to show a single feature map too!
import tensorflow as tf
import numpy as np
import plotly.subplots as sp
import plotly.graph_objects as go

model = tf.keras.applications.VGG16(include_top=False, weights='imagenet')
imgPath = 'noise_img.jpeg'

def preprocessImage(imgPath):
    img = tf.keras.preprocessing.image.load_img(imgPath, target_size=(224, 224))
    imgArray = tf.keras.preprocessing.image.img_to_array(img)
    imgArray = np.expand_dims(imgArray, axis=0)
    return tf.keras.applications.vgg16.preprocess_input(imgArray)

def visualizeBestFeatureMap(intermediateOutput, layerName, bestIndex):
    bestFeatureMap = tf.nn.relu(intermediateOutput[0, :, :, bestIndex])
    fig = go.Figure(data=go.Heatmap(z=bestFeatureMap.numpy(), colorscale='gray'))
    fig.update_layout(
        title=layerName + ' - Feature Map ' + str(bestIndex),
        title_font_size=13,
        width=400,
        height=400
    )
    fig.write_html("output.html")

imgArray = preprocessImage(imgPath)
layerName = 'block1_conv1'
intermediateOutput = model.get_layer(layerName).output
intermediateLayerModel = tf.keras.Model(inputs=model.input, outputs=intermediateOutput)
intermediateOutput = intermediateLayerModel.predict(imgArray)
bestIndex = 5
visualizeBestFeatureMap(intermediateOutput, layerName, bestIndex)
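The index above was picked by hand after looking at the grid. One possible way to shortlist candidates programmatically is to rank the channels by their mean activation after ReLU. This is only a heuristic assumed for illustration, not the method used in the code above, and the random array below is a stand-in for intermediateOutput.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for intermediateOutput: a batch of one image, 8x8 feature maps, 4 channels.
activations = rng.normal(size=(1, 8, 8, 4)).astype("float32")
activations[0, :, :, 2] += 5.0  # make channel 2 clearly the strongest

# Apply ReLU, then average each channel over its spatial dimensions.
rectified = np.maximum(0, activations)
mean_per_channel = rectified[0].mean(axis=(0, 1))

# Sort channel indices from strongest to weakest mean response.
ranking = np.argsort(mean_per_channel)[::-1]
best_index = int(ranking[0])

print(mean_per_channel)
print(best_index)
```

A high mean activation only says that a filter responds strongly across the image; whether that map is the most informative one to show still calls for a visual check.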
Two takeaways from our answer are mentioned below.
Convolution | ReLU |
Convolution is sliding a filter across the image to create feature maps. | ReLU is then applied to the feature maps to keep only the positive values, aiding the neural network in recognizing important patterns in the data. |