Data Augmentation

Learn to increase the size of input data using data augmentation techniques.

DL models require a lot of data to learn meaningful patterns, and their performance usually improves with the quantity of relevant training data. When we only have access to a small amount of training data and cannot collect more, we can use a popular technique called data augmentation, which removes the need for additional manual data collection. Data augmentation increases the diversity of an existing training set by generating additional examples from the existing data points. Let’s explore TF’s data augmentation techniques for image data.

Common transformations for augmentation

Image augmentation applies random transformations to the images in the original dataset. We can apply a host of transformations to digital images, and they fall into two broad categories (a short code sketch illustrating a few of them follows the list below):

  • Position-based transformations: These include:

    • Resize: This changes the height and width of the image.

    • Translate: This moves all image pixels horizontally or vertically.

    • Rotate: This revolves an image, to a certain number of degrees, around the center (or some other point) in a clockwise or anticlockwise direction.

    • Zoom in/zoom out: This increases/decreases the size of objects.

    • Horizontal flip/vertical flip: This mirrors an image horizontally/vertically.

    • Crop: This removes part of an image.

    • Skew: This slants an image horizontally or vertically.

    • Affine: These are transformations that preserve parallel lines in an image.

  • Pixel value-based transformations: These include:

    • Brightness change: This makes the image darker or brighter.

    • Contrast change: This increases or decreases the difference between the dark and light regions of an image.

    • Applying filters: Filters blur or sharpen the objects present in an image.

    • Adding noise: Common examples include Gaussian and salt & pepper noise.
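
As a minimal sketch, the snippet below applies a handful of these transformations with TensorFlow’s tf.image module (assuming TensorFlow 2.x; the file name face.jpg and the parameter values are placeholders chosen only for illustration):

```python
import tensorflow as tf

# Load a single RGB image as a (height, width, 3) uint8 tensor.
# "face.jpg" is a placeholder file name used only for illustration.
image = tf.io.decode_jpeg(tf.io.read_file("face.jpg"), channels=3)

# Position-based transformations
resized = tf.image.resize(image, [224, 224])       # change height and width
rotated = tf.image.rot90(image, k=1)               # rotate 90 degrees anticlockwise
flipped = tf.image.flip_left_right(image)          # horizontal flip (mirror)
cropped = tf.image.central_crop(image, 0.8)        # keep the central 80% of the image

# Pixel value-based transformations
brighter = tf.image.adjust_brightness(image, delta=0.2)                 # brighten the image
higher_contrast = tf.image.adjust_contrast(image, contrast_factor=1.5)  # increase contrast
noisy = tf.cast(image, tf.float32) + tf.random.normal(tf.shape(image), stddev=10.0)  # add Gaussian noise
noisy = tf.clip_by_value(noisy, 0.0, 255.0)        # keep pixel values in the valid range
```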

Overfitting and data augmentation

Training a model on insufficient data can result in overfitting.

Note: Overfitting occurs when a model fits the training examples, including the noise in the training set, too closely. An insufficient amount of training data is one of the major causes of overfitting. An overfit model performs poorly on test data.

Datasets with only a few training examples lack diversity. For instance, if we train a model to recognize human faces and it sees only a few examples of faces, the model is unlikely to recognize faces in a variety of situations, such as zoomed-in, rotated, or translated face images. It may also fail to recognize faces under changing brightness conditions. The images given below depict this concept: the image on the left, taken from the Labeled Faces in the Wild (LFW) dataset, is shown zoomed in on the right.
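
One common way to add this missing diversity during training is to apply random augmentation layers in front of the model, so that it sees a slightly different version of each image in every epoch. The following is a minimal sketch, assuming TensorFlow 2.x with Keras preprocessing layers; the layer choices, factors, and the tiny model architecture are illustrative, not prescriptive:

```python
import tensorflow as tf

# An illustrative augmentation pipeline built from Keras preprocessing layers.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),     # random horizontal flip
    tf.keras.layers.RandomRotation(0.1),          # random rotation, up to +/- 10% of a full turn
    tf.keras.layers.RandomZoom(0.2),              # random zoom in/out by up to 20%
    tf.keras.layers.RandomTranslation(0.1, 0.1),  # random vertical/horizontal shift by up to 10%
    tf.keras.layers.RandomContrast(0.2),          # random contrast change by up to 20%
])

# The augmentation layers are active only during training; at inference
# time they pass images through unchanged.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    data_augmentation,
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```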
