Keras image classification

Computer vision is a field of artificial intelligence that enables machines to process and extract information from visual data such as images and videos. Image classification is one application of this field.

Domains and subdomains

Image classification

Image classification is a technique to help group images and label them according to the pixels or objects detected within the image. It is a branch of computer vision and uses predefined classes to categorize images.

In simple words, we assign a label to a previously unlabelled image, that is, an unseen image given to the model so that it can identify its contents.

Computer vision image classification scenario

Are you interested in the concepts of computer vision but still confused regarding the implementation and real-life applications? This Answer is going to highlight all such concerns in detail with a highly interesting scenario.

Suppose we have been given an unlabelled image, and we aim to create a model that classifies it correctly by picking the closest class from the list of classes it has already been taught.

We teach our model some labels and their image examples
We expect our model to be able to label unseen images
It has been labelled a pie chart

We can easily accomplish the task using Keras! So, let's get straight towards it!

Keras

To accomplish image classification through Python coding, we can employ a powerful library named Keras. Keras is a high-level API that is mainly utilized in the deep learning domain. The capabilities of the models it provides can be leveraged in solving image classification tasks.

Scenario walkthrough using Keras

A classification program mainly needs to carry out the following steps.

  • Defining the training and validation datasets

  • Defining the model with layers such as convolutional and pooling layers

  • Compiling the model

  • Training (fitting) the model

  • Using the model for predictions

import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
import base64

We import the required modules for our code, including:

  1. tensorflow and keras for model-oriented and image-processing tasks

  2. numpy for numerical operations

  3. matplotlib for visual representations

  4. base64 for encoding images

imageSize = (250, 250)
batchSize = 20
trainDirectory = 'archive/seg_train/seg_train'
testDirectory = 'archive/seg_test/seg_test'

We specify the image and batch size to be used in the training process and save the paths to our training and testing data.

Note: It's preferred to use compressed and resized images if the model has to be trained using a lot of data.
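As a rough, back-of-the-envelope illustration of why image size matters, the sketch below estimates how much memory one batch of input images occupies. The helper batch_megabytes is hypothetical and assumes 3 color channels stored as 32-bit floats; real memory use during training is higher because of intermediate activations and gradients.

```python
def batch_megabytes(image_size, batch_size, channels=3, bytes_per_value=4):
    # Assumption: each pixel value is stored as a float32 (4 bytes)
    height, width = image_size
    return batch_size * height * width * channels * bytes_per_value / 1_000_000

full = batch_megabytes((250, 250), 20)   # the sizes used in this walkthrough
half = batch_megabytes((125, 125), 20)   # hypothetical smaller images
print(full, half)
```

Halving both sides of the image cuts the input size of each batch to a quarter, which is why resizing helps when there is a lot of data.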

generateTrainingData = ImageDataGenerator(
    rescale=1./255,
    rotation_range=25,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

Since we're using a limited number of images to train our model, it's good practice to generate augmented data. Augmented data is created artificially from the original data by applying operations such as rotations, shifts, or flips.

For this purpose, we set augmentation options like rescale, rotation_range, width_shift_range, height_shift_range, shear_range, zoom_range, horizontal_flip, and fill_mode.
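To make the idea concrete, here is a minimal NumPy sketch of two of these operations, rescaling and a horizontal flip, applied to a tiny made-up image. It only illustrates the concept and is not how ImageDataGenerator is implemented internally; the pixel values are hypothetical.

```python
import numpy as np

# A hypothetical 2x3 grayscale "image" with pixel values in the 0-255 range
pixels = np.array([[0, 51, 102],
                   [153, 204, 255]], dtype=np.float32)

# rescale=1./255 multiplies every pixel by 1/255, mapping values into [0, 1]
rescaled = pixels * (1.0 / 255)

# A horizontal flip mirrors the image left to right (reverses the column order)
flipped = rescaled[:, ::-1]

print(rescaled.min(), rescaled.max())
print(flipped)
```

The flipped copy shows the same content from a different viewpoint, so the model sees more variety without any new photos being collected.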

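trainDataset = generateTrainingData.flow_from_directory(
    trainDirectory,
    seed=594,
    target_size=imageSize,
    batch_size=batchSize,
    class_mode='sparse'
)
validationDataset = tf.keras.utils.image_dataset_from_directory(
    testDirectory,
    seed=594,
    image_size=imageSize,
    batch_size=batchSize
)
# Scale validation pixels to [0, 1] so they match the rescaled training images
validationDataset = validationDataset.map(lambda images, labels: (images / 255.0, labels))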

Our training images are read from the directory, resized, and augmented on the fly. The validation set is read from the test directory and is used to measure the model's accuracy after each epoch.

classNames = list(trainDataset.class_indices.keys())
classCount = len(classNames)

As classNames and their count, classCount, are used in the steps ahead, we define them first.
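When classes are inferred from a directory, Keras takes them from the subfolder names in alphanumeric order. The sketch below mimics that mapping with the standard library only, using a temporary folder and the two hypothetical class names "sea" and "buildings"; it illustrates the idea and is not the Keras implementation.

```python
import os
import tempfile

# Build a throwaway folder layout with one subfolder per class (hypothetical names)
with tempfile.TemporaryDirectory() as root:
    for name in ["sea", "buildings"]:
        os.makedirs(os.path.join(root, name))

    # Sort the subfolder names alphanumerically and number them from 0
    folders = sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )
    class_indices = {name: index for index, name in enumerate(folders)}

class_names = list(class_indices.keys())
class_count = len(class_names)
print(class_indices, class_count)
```

The position of a name in class_names is the integer label the model learns to predict, which is why the same list is used later to turn a predicted index back into a label.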

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(20, 3, activation='relu', input_shape=(imageSize[0], imageSize[1], 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(40, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(80, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(80, activation='relu'),
    tf.keras.layers.Dense(classCount)
])

This is one of the most crucial steps in our process. We define the architecture of our convolutional neural network, which consists of multiple convolutional layers (Conv2D), pooling layers (MaxPooling2D), a flatten layer (Flatten), and dense layers (Dense). The final Dense layer has one output per class.
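To see how the image shrinks as it passes through these layers, the sketch below works out the output sizes by hand. It assumes the Keras defaults used in the model above: 3x3 convolutions with 'valid' padding and stride 1, and 2x2 max pooling. The helper names conv_output and pool_output are hypothetical.

```python
def conv_output(size, kernel=3):
    # A 'valid' convolution with stride 1 shrinks each side by kernel - 1
    return size - kernel + 1

def pool_output(size, pool=2):
    # 2x2 max pooling with stride 2 halves each side, rounding down
    return size // pool

side = 250
filters_per_block = [20, 40, 80]
for filters in filters_per_block:
    side = pool_output(conv_output(side))
    print(f"after Conv2D({filters}) + MaxPooling2D: {side} x {side} x {filters}")

flattened = side * side * filters_per_block[-1]
dense_params = flattened * 80 + 80  # weights plus biases of Dense(80)
print(flattened, dense_params)
```

Under these assumptions, most of the model's parameters sit in the first Dense layer, because Flatten hands it tens of thousands of values.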

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)
history = model.fit(
    trainDataset,
    validation_data=validationDataset,
    epochs=15
)

Next, we compile the model by specifying the optimizer, loss function, and evaluation metrics. The model is then trained and validated using the trainDataset and validationDataset we defined earlier. It runs for a specified number of epochs, where one epoch is a single pass over the training dataset.
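The loss used here compares the raw scores (logits) produced by the last layer with an integer class label. The following pure-Python sketch shows the calculation for a single example; the function name and the logit values are made up for illustration, and Keras computes the same quantity over whole batches.

```python
import math

def sparse_categorical_crossentropy(logits, label):
    # Softmax turns raw scores (logits) into probabilities that sum to 1;
    # subtracting the maximum first keeps exp() numerically stable
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probabilities = [value / total for value in exps]
    # The loss is the negative log-probability of the true class
    return -math.log(probabilities[label])

logits = [2.0, 0.5]  # hypothetical scores for two classes
loss_if_correct = sparse_categorical_crossentropy(logits, label=0)
loss_if_wrong = sparse_categorical_crossentropy(logits, label=1)
print(loss_if_correct, loss_if_wrong)
```

The loss is small when the highest score belongs to the true class and large otherwise, which is the signal the optimizer uses to adjust the weights.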

img = image.load_img('../test.png', target_size=imageSize)
imgArray = image.img_to_array(img)
imgArray = np.expand_dims(imgArray, axis=0)
imgArray = imgArray / 255.0

This code loads the image at the given path, converts it to an array, adds a batch dimension with np.expand_dims so that the array matches the model's expected input shape, and scales the pixel values to the range 0 to 1 by dividing imgArray by 255.0.
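The shape change is easier to see on a stand-in array. This small sketch uses a synthetic 250x250 RGB array instead of a real file, so nothing needs to be loaded from disk; the array contents are made up.

```python
import numpy as np

# Stand-in for image.img_to_array(img): a 250x250 RGB image filled with 255
img_array = np.full((250, 250, 3), 255.0, dtype=np.float32)

# Add a leading batch axis: (250, 250, 3) becomes (1, 250, 250, 3)
batch = np.expand_dims(img_array, axis=0)

# Scale pixel values from [0, 255] to [0, 1]
batch = batch / 255.0

print(img_array.shape, batch.shape, batch.max())
```

The model was trained on batches of images, so even a single image has to be wrapped in a batch of size one before it is passed to predict.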

predictions = model.predict(imgArray)
predictedClassIndex = np.argmax(predictions)
predictedClass = classNames[predictedClassIndex]

Now is the time for prediction! Our model outputs one score per class, np.argmax picks the index with the highest score, predictedClassIndex, and we look up the corresponding class label in predictedClass.
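Because the last Dense layer has no activation and the loss is configured with from_logits=True, the values returned by model.predict are raw scores rather than probabilities. If probabilities are wanted, a softmax can be applied afterwards, as in this sketch with made-up scores and a hypothetical class list.

```python
import numpy as np

class_names = ['buildings', 'sea']       # hypothetical class list
predictions = np.array([[1.2, -0.3]])    # made-up logits, shape (1, classCount)

# Softmax: exponentiate (after subtracting the max for stability) and normalize
shifted = predictions - predictions.max(axis=1, keepdims=True)
probabilities = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

predicted_index = int(np.argmax(probabilities))
print(class_names[predicted_index], probabilities[0, predicted_index])
```

Softmax preserves the ordering of the scores, so np.argmax returns the same index whether it is applied to the logits or to the probabilities.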

That's how we build an image classification model from scratch and use it to predict unseen images. Let's see it in action now.

Note: Since the training and validation data contain just a few pictures, the model will train fast but might not be very accurate. To improve accuracy or cover more categories, more images and classes can be added.

Complete code

Yay, you made it this far! The complete code is given below; feel free to experiment with it by changing values and running it again.

Our model is trained with a limited number of images of seas and buildings. Therefore, we'll provide it with an unseen image from one of the two categories to see how well it predicts that image.

Note: Our images have been taken from the "Intel Image Classification" dataset.

Image rendering

We save the prediction as a PNG file called output.png, which is then embedded in output.html and displayed to us.
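Embedding works by base64-encoding the image bytes and placing them in a data URI inside the img tag. The sketch below shows the idea with a few stand-in bytes instead of a real output.png, so it runs without any files; the variable names are hypothetical.

```python
import base64

# Stand-in bytes; in the walkthrough these would be the contents of output.png
image_bytes = b"\x89PNG\r\n\x1a\n"

# Base64 turns arbitrary bytes into ASCII text that is safe to place in HTML
encoded = base64.b64encode(image_bytes).decode('utf-8')
img_tag = f'<img src="data:image/png;base64,{encoded}" alt="Output">'

print(img_tag)
```

With the image stored inside the HTML itself, output.html can be opened on its own without output.png sitting next to it.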

Predicting buildings

We will use the image 19763.jpg as the input for our prediction and see which class it is assigned. This image shows a building.

import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
import base64

imageSize = (250, 250)
batchSize = 20

trainDirectory = 'archive/seg_train/seg_train'
testDirectory = 'archive/seg_test/seg_test'

generateTrainingData = ImageDataGenerator(
    rescale=1./255,
    rotation_range=25,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

trainDataset = generateTrainingData.flow_from_directory(
    trainDirectory,
    seed=594,
    target_size=imageSize,
    batch_size=batchSize,
    class_mode='sparse'
)

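validationDataset = tf.keras.utils.image_dataset_from_directory(
    testDirectory,
    seed=594,
    image_size=imageSize,
    batch_size=batchSize
)
# Scale validation pixels to [0, 1] so they match the rescaled training images
validationDataset = validationDataset.map(lambda images, labels: (images / 255.0, labels))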

classNames = list(trainDataset.class_indices.keys())
classCount = len(classNames)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(20, 3, activation='relu', input_shape=(imageSize[0], imageSize[1], 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(40, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(80, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(80, activation='relu'),
    tf.keras.layers.Dense(classCount)
])

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)

history = model.fit(
    trainDataset,
    validation_data=validationDataset,
    epochs=15
)

img = image.load_img('19763.jpg', target_size=imageSize)
imgArray = image.img_to_array(img)
imgArray = np.expand_dims(imgArray, axis=0)
imgArray = imgArray / 255.0

predictions = model.predict(imgArray)
predictedClassIndex = np.argmax(predictions)
predictedClass = classNames[predictedClassIndex]

plt.imshow(imgArray[0])
plt.title(predictedClass)
plt.savefig('output.png')

html = f'''
<html>
<body>
<h1>Predicted Class: {predictedClass}</h1>
<img src="data:image/png;base64,{base64.b64encode(open('output.png', 'rb').read()).decode('utf-8')}" alt="Output">
</body>
</html>
'''

with open('output.html', 'w') as file:
    file.write(html)

Prediction output

Our trained model uses the features it learned from the training data to decide which class the image resembles most. Since this is a building image, and it closely resembles the features of our building training data, the model predicts "buildings".

A correct building prediction made by our model

Predicting the sea

Now, we will use the image test.png as the input for our prediction and see which class it is assigned. This image shows the sea.

import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
import base64

imageSize = (250, 250)
batchSize = 20

trainDirectory = 'archive/seg_train/seg_train'
testDirectory = 'archive/seg_test/seg_test'

generateTrainingData = ImageDataGenerator(
    rescale=1./255,
    rotation_range=25,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

trainDataset = generateTrainingData.flow_from_directory(
    trainDirectory,
    seed=594,
    target_size=imageSize,
    batch_size=batchSize,
    class_mode='sparse'
)

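validationDataset = tf.keras.utils.image_dataset_from_directory(
    testDirectory,
    seed=594,
    image_size=imageSize,
    batch_size=batchSize
)
# Scale validation pixels to [0, 1] so they match the rescaled training images
validationDataset = validationDataset.map(lambda images, labels: (images / 255.0, labels))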

classNames = list(trainDataset.class_indices.keys())
classCount = len(classNames)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(20, 3, activation='relu', input_shape=(imageSize[0], imageSize[1], 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(40, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(80, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(80, activation='relu'),
    tf.keras.layers.Dense(classCount)
])

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)

history = model.fit(
    trainDataset,
    validation_data=validationDataset,
    epochs=15
)

img = image.load_img('../test.png', target_size=imageSize)
imgArray = image.img_to_array(img)
imgArray = np.expand_dims(imgArray, axis=0)
imgArray = imgArray / 255.0

predictions = model.predict(imgArray)
predictedClassIndex = np.argmax(predictions)
predictedClass = classNames[predictedClassIndex]

plt.imshow(imgArray[0])
plt.title(predictedClass)
plt.savefig('output.png')

html = f'''
<html>
<body>
<h1>Predicted Class: {predictedClass}</h1>
<img src="data:image/png;base64,{base64.b64encode(open('output.png', 'rb').read()).decode('utf-8')}" alt="Output">
</body>
</html>
'''

with open('output.html', 'w') as file:
    file.write(html)

Prediction output

As this is a sea image, and it closely resembles the features of our sea training data, the model predicts "sea".

A correct sea prediction made by our model


Copyright ©2024 Educative, Inc. All rights reserved