Modeling

This modeling discussion covers two key aspects:

1. SOTA Segmentation Models: What are the state-of-the-art segmentation models and their architectures?
2. Transfer Learning: How can you use these models to train a better segmenter for the self-driving car data?

SOTA segmentation models

Machine learning in general, and deep learning in particular, has progressed a lot in computer vision applications over the last decade. The models listed in this section are among the most commonly used deep neural networks that provide state-of-the-art (SOTA) results for object detection and segmentation tasks. These tasks form the basis for the self-driving car use case.

FCN

Fully convolutional networks (FCNs) are among the top-performing networks for semantic segmentation tasks.

📝 Segmentation is a dense prediction task of pixel-wise classification.
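
To make "pixel-wise classification" concrete, here is a minimal sketch in plain Python. It assumes the network has already produced one score per class for every pixel. The function name `segment`, the three class names, and the score values are hypothetical and chosen only for illustration.

```python
def segment(scores):
    """Return a label map from per-pixel class scores.

    scores[c][y][x] is the score of class c at pixel (y, x).
    The label of a pixel is the index of its highest-scoring class.
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 2x3 image with three classes: 0 = road, 1 = car, 2 = pedestrian.
scores = [
    [[0.9, 0.2, 0.1], [0.8, 0.1, 0.3]],    # road scores
    [[0.05, 0.7, 0.2], [0.1, 0.6, 0.1]],   # car scores
    [[0.05, 0.1, 0.7], [0.1, 0.3, 0.6]],   # pedestrian scores
]
label_map = segment(scores)
print(label_map)
```

Every pixel receives its own label, so the output has the same height and width as the input image.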

A typical FCN operates by fine-tuning an image classification CNN and applying pixel-wise training. It first compresses the information using multiple layers of convolutions and pooling. Then, it up-samples these feature maps to predict each pixel’s class from this compressed information.
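
The compress-then-upsample idea can be sketched on a single feature map. This is only an illustration under simplifying assumptions: it uses 2x2 max pooling for the compression and nearest-neighbor repetition as a stand-in for the upsampling step, whereas a real FCN typically learns its upsampling. The helper names `max_pool_2x2` and `upsample_nearest` are hypothetical.

```python
def max_pool_2x2(fmap):
    """Halve height and width by taking the max of each 2x2 block."""
    return [
        [max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
         for x in range(0, len(fmap[0]) - 1, 2)]
        for y in range(0, len(fmap) - 1, 2)
    ]


def upsample_nearest(fmap, factor):
    """Enlarge a map by repeating each value factor x factor times."""
    return [
        [value for value in row for _ in range(factor)]
        for row in fmap
        for _ in range(factor)
    ]


# A toy 4x4 feature map.
image = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 6],
    [0, 0, 7, 8],
]
pooled = max_pool_2x2(image)             # 2x2: compressed information
restored = upsample_nearest(pooled, 2)   # 4x4 again, but blocky
print(pooled)
print(restored)
```

The restored map matches the input resolution, but each 2x2 block now holds a single value. That loss of fine detail is why the raw upsampled prediction tends to be coarse.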

Figure: FCN architecture

Using convolutional layers at the end (instead of fully connected layers) also allows for a dynamic input size. The output pixel-wise classification tends to be coarse, so you use skip connections to recover sharp edges. The initial layers capture fine edge information that is lost as the network gets deeper. You reuse that information during upsampling to get a more refined segmentation.
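
The skip-connection idea described above can be sketched as an element-wise sum: the coarse score map from the deep layers is upsampled and added to a finer score map taken from an earlier layer. This is a simplified, single-class illustration, not the exact FCN procedure. The function names and score values are hypothetical, and nearest-neighbor repetition stands in for learned upsampling.

```python
def upsample_nearest(fmap, factor):
    """Enlarge a map by repeating each value factor x factor times."""
    return [
        [value for value in row for _ in range(factor)]
        for row in fmap
        for _ in range(factor)
    ]


def fuse_skip(coarse, fine):
    """Upsample the coarse scores by 2 and add the finer scores element-wise."""
    up = upsample_nearest(coarse, 2)
    return [
        [u + f for u, f in zip(up_row, fine_row)]
        for up_row, fine_row in zip(up, fine)
    ]


def to_mask(score_map):
    """Mark a pixel as foreground when its score is positive."""
    return [[score > 0 for score in row] for row in score_map]


# Hypothetical "car" scores: a 1x2 map from the deep layers and
# a 2x4 map from an earlier, higher-resolution layer.
coarse = [[2.0, -2.0]]
fine = [
    [0.5, 0.5, 1.0, -0.5],
    [0.5, -3.0, -0.5, -0.5],
]

coarse_mask = to_mask(upsample_nearest(coarse, 2))
fused_mask = to_mask(fuse_skip(coarse, fine))
print(coarse_mask)
print(fused_mask)
```

In this toy case, the coarse mask can only mark whole 2x2 blocks, while the fused mask removes one pixel from the block in the second row. The earlier layer's scores are what refine the object boundary.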

Figure: Accurate segmentation with skip connections

U-Net

U-Net is commonly used for semantic segmentation in vision applications. It builds on the FCN architecture with some modifications. These architectural changes allow the network to train with fewer examples. The overall ...