What is instance segmentation?

The first step in the practical use of computer vision (a field of AI in which we extract information from images, videos, and other visual inputs) is object classification. In object classification, the whole input is assigned a single label. Even if the image contains multiple objects, it still receives only one label.

After object classification comes object detection. In object detection, also called object localization, we mark every object in the image with a bounding box or a centroid. Object detection generates two outputs: the labels of the objects and their bounding boxes.

After object detection, we perform semantic segmentation. In semantic segmentation, we assign a class label to every pixel. In this way, semantic segmentation also lets us label regions that cannot be counted as individual objects, such as pavement and sky. Outputs for semantic segmentation are as follows:

  • Labels of the objects
  • Their bounding boxes
  • A prediction map
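
As a rough illustration of the prediction map, the sketch below assigns each pixel the class with the highest score. The class names, the `prediction_map` helper, and the score values are all made up for this example; a real model would produce the scores.

```python
# Hypothetical class names; the position in the list is the class ID.
CLASSES = ["background", "cat", "sky"]


def prediction_map(scores):
    """Turn per-pixel class scores into a map of class IDs.

    scores[r][c] is a list holding one score per class for pixel (r, c).
    Each pixel gets the ID of its highest-scoring class.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Made-up scores for a 2x3 image with three classes per pixel.
scores = [
    [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]],
    [[0.9, 0.05, 0.05], [0.3, 0.6, 0.1], [0.2, 0.1, 0.7]],
]

pred = prediction_map(scores)
labels_present = sorted({CLASSES[i] for row in pred for i in row})

print(pred)            # class ID for every pixel
print(labels_present)  # labels that occur somewhere in the image
```

Note that every "cat" pixel gets the same ID here, no matter which cat it belongs to.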

The following is an example in which, using semantic segmentation, every pixel is assigned a class. Notice how different cats are labeled in the same way.

The semantic segmentation

Instance segmentation

The idea behind instance segmentation is that one image may contain multiple instances of an object. We would like to label every instance differently.

This task is more difficult than semantic segmentation as we are no longer simply assigning a label to a pixel. Instead, we also need to differentiate between the different instances of the object.
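
To make the difference concrete, the sketch below takes a binary "cat" mask, as semantic segmentation might produce, and gives each connected blob of pixels its own instance ID. This is a deliberately simple stand-in, not how instance segmentation models work in practice: it assumes instances never touch or overlap, and the `label_instances` helper and the mask values are invented for the example.

```python
from collections import deque


def label_instances(class_mask):
    """Give each 4-connected blob of 1s its own instance ID (1, 2, ...).

    Returns the instance map and the number of instances found.
    """
    rows, cols = len(class_mask), len(class_mask[0])
    instance_map = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already have an ID.
            if class_mask[r][c] != 1 or instance_map[r][c] != 0:
                continue
            next_id += 1
            instance_map[r][c] = next_id
            queue = deque([(r, c)])
            # Breadth-first flood fill over the up/down/left/right neighbors.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and class_mask[ny][nx] == 1
                            and instance_map[ny][nx] == 0):
                        instance_map[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance_map, next_id


# Made-up semantic mask: 1 = "cat" pixel, 0 = anything else.
cat_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]

instance_map, count = label_instances(cat_mask)
for row in instance_map:
    print(row)
print("instances:", count)
```

When two cats touch in the image, this shortcut merges them into one blob, which is one reason the task needs more than a per-pixel class label.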

The instance segmentation

Approach

A simple way to perform instance segmentation is to use the bounding boxes generated by object detection and then apply semantic segmentation to just that portion of the image.

In this way, we generate a binary mask for every object in the image. Each binary mask has the same dimensions as the original image, with ones at the pixels that belong to the corresponding instance and zeros everywhere else. With this approach, we get a separate binary mask for each instance, even when several instances share the same class.
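
The approach can be sketched as follows, assuming the bounding boxes are already available from a detector. The `instance_mask` helper, the `(top, left, bottom, right)` box format, and the brightness threshold that stands in for a real per-pixel segmentation model are all assumptions made for this example.

```python
def instance_mask(image, box, is_object):
    """Build a full-size binary mask for one detected object.

    box is (top, left, bottom, right) with bottom and right exclusive.
    is_object is a stand-in for a segmentation model: it takes a pixel
    value and returns True if the pixel belongs to the object.
    Only pixels inside the box are examined; the rest stay 0.
    """
    height, width = len(image), len(image[0])
    top, left, bottom, right = box
    mask = [[0] * width for _ in range(height)]
    for r in range(top, bottom):
        for c in range(left, right):
            if is_object(image[r][c]):
                mask[r][c] = 1
    return mask


# Made-up 3x6 grayscale image containing two bright objects.
image = [
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 8, 8],
    [0, 0, 0, 0, 8, 8],
]

# Hypothetical detector output: one box per object, each slightly loose.
boxes = [(0, 0, 2, 4), (1, 3, 3, 6)]

# One binary mask per box, each the same size as the image.
masks = [instance_mask(image, box, lambda value: value > 5) for box in boxes]

for mask in masks:
    for row in mask:
        print(row)
    print()
```

Because each mask is built from a single box, the two objects end up in separate masks even though the same pixel test is applied to both.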

A binary mask of the second cat from the left is as follows:

The binary mask of the second cat

Applications

  • Medical imaging: Instance segmentation is used extensively in medical imaging. It can help detect anomalies that are hard for humans to spot, and automating parts of the analysis can make it less prone to errors. For example, we can use instance segmentation to detect tumors and cancerous cells.
  • Satellite image processing: We can use instance segmentation to count the cars in an area and estimate road congestion.
  • Self-driving cars: Instance segmentation is combined with semantic segmentation to perform panoptic segmentation, which is used in self-driving cars.
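
Counting tasks such as the car example above follow directly from having one mask per instance. The sketch below assumes the instances arrive as hypothetical `(label, mask)` pairs; the labels and mask values are invented for illustration.

```python
from collections import Counter


def count_instances(instances):
    """Count how many instances were found for each class label."""
    return Counter(label for label, _mask in instances)


def mask_area(mask):
    """Number of pixels covered by a binary mask."""
    return sum(sum(row) for row in mask)


# Made-up output of an instance segmentation step: (label, binary mask).
instances = [
    ("car", [[1, 1, 0], [0, 0, 0]]),
    ("car", [[0, 0, 0], [0, 1, 1]]),
    ("truck", [[0, 0, 1], [0, 0, 0]]),
]

counts = count_instances(instances)
car_pixels = sum(mask_area(m) for label, m in instances if label == "car")

print(counts["car"], "cars,", counts["truck"], "truck")
print("pixels covered by cars:", car_pixels)
```

With semantic segmentation alone, only the total area of "car" pixels would be available, not the number of cars.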

Copyright ©2025 Educative, Inc. All rights reserved