Modeling
Explore state-of-the-art image segmentation models such as FCN, U-Net, and Mask R-CNN, essential for enhancing self-driving car vision systems. Understand how transfer learning can be applied to improve model performance on limited driving datasets by retraining specific layers. Learn strategies to optimize segmentation accuracy through deep neural network design and training adjustments.
SOTA segmentation models
Machine learning in general, and deep learning in particular, have progressed substantially in computer-vision applications over the last decade. The models listed in this section are the most commonly used deep neural networks that deliver state-of-the-art (SOTA) results for object detection and segmentation tasks. These tasks form the basis of the self-driving car use case.
FCN
Fully convolutional networks (FCNs) are one of the top-performing networks for semantic segmentation tasks.
📝 Segmentation is a dense prediction task of pixel-wise classification.
A typical FCN is built by fine-tuning an image classification CNN and training it pixel-wise. It first compresses the spatial information through multiple layers of convolution and pooling, then up-samples the resulting feature maps to predict each pixel's class from this compressed representation.
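This compress-then-upsample pattern can be sketched without any deep learning framework. The sketch below is illustrative only, not a trained model: all sizes, the class count, and the random "scores" are made-up stand-ins for what learned convolutions would produce.

```python
import numpy as np

# Illustrative sketch: an FCN first shrinks the spatial resolution,
# then up-samples back to predict a class for every pixel.
rng = np.random.default_rng(0)

H, W, NUM_CLASSES = 8, 8, 3
image = rng.random((H, W))

def downsample(x, factor=4):
    """Compress spatial information with block-mean pooling."""
    h, w = x.shape[0] // factor, x.shape[1] // factor
    return x[: h * factor, : w * factor].reshape(h, factor, w, factor).mean(axis=(1, 3))

def upsample(x, factor=4):
    """Nearest-neighbour up-sampling back to the input resolution."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

# "Encoder": compressed feature map (stand-in for conv + pool layers).
features = downsample(image)  # shape (2, 2)

# "Decoder": per-class score maps up-sampled to full resolution.
# A real FCN predicts these with learned convolutions; random here.
scores = rng.random((NUM_CLASSES,) + features.shape)
full_scores = np.stack([upsample(s) for s in scores])  # shape (3, 8, 8)

# Dense prediction: one class label per pixel of the input.
prediction = full_scores.argmax(axis=0)
print(prediction.shape)  # (8, 8) — same spatial size as the input
```

The key property this illustrates is that the output is a label map with the same spatial dimensions as the input, which is exactly what "dense prediction" means in the note above.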
Replacing the fully connected layers at the end with convolutional layers also allows for dynamic input sizes. The resulting pixel-wise classification tends to be coarse, so skip connections are used to recover sharp edges: the early layers of the network capture fine edge information, and fusing it into the upsampling path produces a more refined segmentation.
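A minimal sketch of that fusion step, assuming random arrays in place of real feature maps: in the FCN family (e.g. FCN-16s, FCN-8s) the coarse, up-sampled scores are combined with an earlier high-resolution map by element-wise addition.

```python
import numpy as np

# Sketch of an FCN-style skip connection (illustrative, untrained).
rng = np.random.default_rng(1)

H, W = 8, 8
coarse = rng.random((2, 2))   # deep, low-resolution score map
fine = rng.random((H, W))     # early-layer, high-resolution score map

# Up-sample the coarse map to the early layer's resolution.
upsampled = coarse.repeat(4, axis=0).repeat(4, axis=1)  # shape (8, 8)

# The skip connection is an element-wise sum, letting edge detail
# from the early layer sharpen the coarse prediction.
fused = upsampled + fine
print(fused.shape)  # (8, 8)
```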
U-Net
U-Net is commonly used for semantic segmentation-based vision applications. It builds upon the FCN architecture with some modifications. These architectural changes give the network a powerful property: it requires fewer training examples. The overall ...
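One of U-Net's characteristic modifications can be sketched in the same style as above (again with made-up shapes and random arrays): instead of summing skip-connection features as a plain FCN does, U-Net concatenates encoder feature maps channel-wise with the up-sampled decoder features.

```python
import numpy as np

# Sketch of U-Net's skip connection (illustrative only): encoder
# features are *concatenated* with up-sampled decoder features,
# rather than summed as in a plain FCN.
rng = np.random.default_rng(2)

C, H, W = 4, 8, 8
encoder_feat = rng.random((C, H, W))   # saved from the contracting path
decoder_feat = rng.random((C, 2, 2))   # deep bottleneck features

# Up-sample the decoder features to the encoder layer's resolution.
up = decoder_feat.repeat(4, axis=1).repeat(4, axis=2)  # shape (4, 8, 8)

# Channel-wise concatenation doubles the channel count; the following
# convolutions (omitted here) then mix fine and coarse information.
merged = np.concatenate([encoder_feat, up], axis=0)
print(merged.shape)  # (8, 8, 8): 2*C channels at full resolution
```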