Training Data Generation

Let's generate training data for the semantic image segmentation model.

Generating training examples

Autonomous driving systems have to be extremely robust, with little margin for error. This requires training each component model on all the scenarios that can occur in real life. Let's see how to generate such training data for semantic segmentation.

Human-labeled data

First, you will have to gather driving images and hire annotators to label them in a pixel-wise manner. Many tools, such as Labelbox, can facilitate the pixel-wise annotation of images.

Manual labelling of driving images to generate training data for semantic segmentation

All the other parts of the image are labeled similarly.
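Pixel-wise labels are usually stored as a mask: a 2D array with the same height and width as the image, where each entry is a class ID. The class names and the tiny hand-built mask below are purely illustrative, not tied to any specific labeling tool:

```python
import numpy as np

# Illustrative class IDs; real datasets define their own label maps.
CLASSES = {0: "road", 1: "car", 2: "pedestrian", 3: "sky"}

# A tiny 4x4 "annotated" frame: sky on top, road below, one car.
mask = np.array([
    [3, 3, 3, 3],
    [3, 3, 3, 3],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
])

# Per-class pixel counts are a quick sanity check on an annotation.
counts = {CLASSES[c]: int((mask == c).sum()) for c in CLASSES}
print(counts)  # {'road': 6, 'car': 2, 'pedestrian': 0, 'sky': 8}
```

A real annotation tool exports a full-resolution mask like this for every frame, which then serves as the target output of the segmentation model.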

Open source datasets

There are quite a few open-source datasets available with segmented driving videos/images. One such example is "BDD100K: A Large-scale Diverse Driving Video Database". It contains segmented data for full frames, as shown in the example below.

BDD100K: full frame segmentation
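When working with such a dataset, a first practical step is pairing each frame with its segmentation mask. The sketch below matches files by shared name stem; the directory layout and file names are assumptions for illustration, not the actual BDD100K release structure:

```python
from pathlib import PurePosixPath

def pair_frames_with_masks(image_paths, mask_paths):
    """Pair each image with the mask that shares its file name stem."""
    masks_by_stem = {PurePosixPath(p).stem: p for p in mask_paths}
    pairs = []
    for img in image_paths:
        stem = PurePosixPath(img).stem
        if stem in masks_by_stem:
            pairs.append((img, masks_by_stem[stem]))
    return pairs

# Hypothetical file layout for illustration only.
images = ["images/0001.jpg", "images/0002.jpg"]
masks = ["labels/0001.png"]
print(pair_frames_with_masks(images, masks))
# [('images/0001.jpg', 'labels/0001.png')]
```

Frames without a matching mask are silently dropped here; in practice you would log them, since missing labels often indicate an incomplete download or export.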

These two methods are effective for generating manually labeled data. In most cases, your training data distribution will match what you observe in real-world scene images. However, you may not have enough data for all the conditions you would like your model to make predictions for, such as snow, rain, and night. For an autonomous vehicle to be reliable, your segmenter should work well in all possible weather conditions, as well as cover a variety of obstacles on the road.

A segmenter trained on a certain type of images won't be able to handle unseen conditions reliably

The sunny vs rainy condition is just one example; there can be many such situations.

An important question is how to provide your model with training data for all conditions. One option is to manually label a large number of examples for each scenario. The second option is to use powerful data augmentation techniques to generate new training examples from your human-labeled data, especially for conditions that are missing in your training set. Let's discuss the second approach, which uses generative adversarial networks (GANs).
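Even before reaching for GANs, simple photometric augmentations can approximate some missing conditions. The sketch below darkens a frame to mimic low light; note that the label mask is reused unchanged, because the scene layout does not change, only its appearance. The function name and gain value are illustrative choices:

```python
import numpy as np

def simulate_low_light(image, mask, gain=0.3):
    """Scale pixel intensities toward black to mimic a night scene.

    The pixel-wise labels are returned as-is: augmentation changes
    appearance, not which class each pixel belongs to.
    """
    dark = np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return dark, mask

# A uniform gray 2x2 image with a toy label mask.
img = np.full((2, 2, 3), 200, dtype=np.uint8)
lbl = np.array([[0, 1], [1, 0]])
dark, same_lbl = simulate_low_light(img, lbl)
print(dark.max())  # 60
```

Such hand-crafted transforms are cheap but limited; realistic weather changes (snow cover, wet roads) alter texture in ways a brightness scale cannot capture, which is where learned generative approaches come in.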

Training data enhancement through GANs

In the big picture, your self-driving system should match human intelligence when it comes to making decisions and planning movements. The segmenter plays its role in creating a reliable system by accurately segmenting the driving scene in any condition that the vehicle experiences.

Achieving this target requires a diverse set of training data that covers all the possible permutations of the driving scene. For instance, assume that your dataset contains only ten thousand driving images captured in snowy Montreal conditions and fifty thousand images in sunny California conditions. You need to devise a way to enhance your training data by converting the snowy conditions of the ten thousand Montreal images to sunny conditions, and vice versa.
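The key property of this GAN-based enhancement is that a domain-translation generator (e.g. snowy-to-sunny) changes only the appearance of a frame, so the original segmentation mask remains a valid label for the translated image. A minimal sketch of that enhancement loop, with a stand-in brightness shift in place of a trained generator (all names here are hypothetical):

```python
import numpy as np

def enhance_dataset(pairs, translate):
    """Augment (image, mask) pairs with domain-translated copies.

    `translate` stands in for a trained GAN generator. The mask is
    preserved: translation changes appearance, not scene layout.
    """
    augmented = list(pairs)
    for image, mask in pairs:
        augmented.append((translate(image), mask))
    return augmented

# Stand-in for a trained snowy->sunny generator: a brightness shift.
fake_generator = lambda img: np.clip(
    img.astype(np.int16) + 40, 0, 255
).astype(np.uint8)

img = np.zeros((2, 2), dtype=np.uint8)
msk = np.array([[0, 1], [1, 0]])
out = enhance_dataset([(img, msk)], fake_generator)
print(len(out))  # 2: the original pair plus its translated copy
```

In a real pipeline, `translate` would be an image-to-image translation model trained on unpaired snowy and sunny frames, applied offline to grow the labeled set without any new annotation effort.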

The target includes two parts:

  1. Generating new
...