What is data augmentation?

Data plays a crucial role in modern AI. Companies spend millions of dollars collecting proprietary datasets because open-source datasets are limited. One way to deal with a shortage of data is to use data augmentation.

Data augmentation is the process of artificially enlarging a dataset by applying transformations to the existing data. The transformations are applied to the training data, and the trained model is then evaluated on the actual, untransformed test data. It is the practitioner's responsibility to ensure that no transformation turns the data into something we would not expect to see in the test dataset.
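As a minimal sketch of this workflow, the snippet below splits a toy dataset first and then augments only the training portion, leaving the test set untouched. The `jitter` transformation, the helper names, and the toy numbers are assumptions made for illustration.

```python
import random

def jitter(x, rng, scale=0.05):
    """Hypothetical transformation: nudge a value by a small random amount."""
    return x + rng.uniform(-scale, scale)

def augment_training_set(samples, labels, copies, rng):
    """Return the original samples plus `copies` jittered versions of each."""
    aug_samples, aug_labels = list(samples), list(labels)
    for x, y in zip(samples, labels):
        for _ in range(copies):
            aug_samples.append(jitter(x, rng))
            aug_labels.append(y)  # the transformation must not change the label
    return aug_samples, aug_labels

rng = random.Random(0)

# Toy dataset: values below 0.5 are class 0, the rest are class 1.
data = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Split first, so augmented copies of a sample never leak into the test set.
train_x, test_x = data[:6], data[6:]
train_y, test_y = labels[:6], labels[6:]

train_x_aug, train_y_aug = augment_training_set(train_x, train_y, copies=2, rng=rng)

print(len(train_x), "->", len(train_x_aug))  # the training set grows
print(len(test_x))                           # the test set stays as collected
```

Splitting before augmenting matters: if we augmented first, near-duplicates of a training sample could end up in the test set and inflate the measured accuracy.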

Applications of data augmentation

Data augmentation is most commonly used in the following types of datasets.

Image-based datasets
Signal-based datasets
Textual datasets

Image-based datasets

Images are sometimes very hard to collect, so it is very common to apply transformations such as shifts and rotations when working with image-based datasets. These transformations help create additional images of the kind we expect to see in the testing stage. Some common transformations are:

Random cropping
Random zooms
Random rotations
Random flipping
Random noise
Random contrast
Random saturation
Random brightness
Sharpen or blur

Note: Augmenting images may not be helpful in some situations. In fact, some transformations can prove to be disastrous. For example, if we are building a handwritten text classifier, we would not apply flipping, because a mirrored character is no longer a valid example of its class.
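To make a few of the transformations listed above concrete, here is a rough NumPy sketch of random flipping, cropping, brightness changes, and noise. The function names, parameter values, and the random stand-in image are all assumptions for illustration; libraries such as Keras provide more complete versions.

```python
import numpy as np

def random_flip(image, rng):
    """Flip the image left-to-right with probability 0.5."""
    return image[:, ::-1] if rng.random() < 0.5 else image

def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window from a random position."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

def random_brightness(image, max_delta, rng):
    """Add one random offset to every pixel, then clip to the valid range."""
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(image.astype(np.float32) + delta, 0, 255).astype(np.uint8)

def random_noise(image, sigma, rng):
    """Add independent Gaussian noise to every pixel."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image.astype(np.float32) + noise, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
# Stand-in for a real photo: a random 64 x 64 RGB image.
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Chain the transformations to produce one augmented sample.
augmented = random_flip(image, rng)
augmented = random_crop(augmented, 48, 48, rng)
augmented = random_brightness(augmented, 30, rng)
augmented = random_noise(augmented, 5.0, rng)

print(augmented.shape, augmented.dtype)
```

Each call draws fresh random parameters, so running the chain repeatedly on the same image yields a different augmented sample every time.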

Signal-based datasets

It is a common practice to add noise to signals during training. For example, noise is added to audio signals so that a classification model trained on them becomes more robust to the noise it will encounter in real recordings. Other common transformations include shifting the signal in time and changing the speed of the signal.
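The three signal transformations mentioned above can be sketched in NumPy as follows. This is only an illustration under some assumptions: the sine wave stands in for real audio, the function names and parameter values are made up, and the speed change uses plain linear interpolation, which also changes the pitch.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def time_shift(signal, max_shift, rng):
    """Circularly shift the signal by a random number of samples."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(signal, shift)

def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens (speeds up) the signal."""
    n_out = int(round(len(signal) / rate))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_positions, old_positions, signal)

rng = np.random.default_rng(0)
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)  # stand-in for one second of audio

noisy = add_noise(signal, snr_db=20, rng=rng)
shifted = time_shift(signal, max_shift=800, rng=rng)
faster = change_speed(signal, rate=1.25)

print(len(noisy), len(shifted), len(faster))
```

Specifying the noise level as a signal-to-noise ratio rather than a fixed amplitude keeps the augmentation comparable across loud and quiet recordings.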

Generative Adversarial Networks (GANs) have also been used to augment signal-based datasets. They synthesize new data that is close to the original input data.

Textual datasets

One of the most important transformation techniques is back translation. In this technique, we take a sentence in one language and use a translation tool such as Google Translate to translate it into a different language. We then translate this text back into the first language, which usually gives a sentence that is worded differently from the original but keeps its meaning.

Other transformations include word replacement, in which we replace some random words with their synonyms.
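Both text techniques can be sketched in a few lines of Python. Note the assumptions: the `SYNONYMS` table is a tiny hypothetical stand-in for a real thesaurus, and `back_translate` only defines the round trip, leaving the two translation functions (`to_pivot` and `from_pivot`, both hypothetical names) to be supplied by whatever translation tool is available.

```python
import random

# Hypothetical synonym table; a real project would use a thesaurus or lexical database.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
    "big": ["large", "huge"],
}

def replace_with_synonyms(sentence, synonyms, prob, rng):
    """Replace each word that has a known synonym with probability `prob`."""
    words = []
    for word in sentence.split():
        options = synonyms.get(word.lower())
        if options and rng.random() < prob:
            words.append(rng.choice(options))
        else:
            words.append(word)
    return " ".join(words)

def back_translate(sentence, to_pivot, from_pivot):
    """Translate into a pivot language and back using caller-supplied functions."""
    return from_pivot(to_pivot(sentence))

rng = random.Random(0)
original = "the quick dog looks happy in the big garden"

# Each call may pick different words and different synonyms.
for _ in range(3):
    print(replace_with_synonyms(original, SYNONYMS, prob=0.5, rng=rng))
```

In practice, it is worth reading a sample of the augmented sentences, since a synonym that fits one context can change the meaning in another.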

Auto-augmentation

When training multiple times is not an issue, we can try out different data augmentation techniques, alone or in combination, and find out which works best on validation data. This process of automatically searching for the best augmentation strategy is called auto-augmentation.
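The search described above can be sketched as a loop over candidate combinations. This sketch uses a plain exhaustive search for simplicity, which is not how published auto-augmentation methods necessarily work. The `stub_evaluate` function and its scores are invented placeholders so that the search has something to rank; in a real setting, `evaluate` would train a model with the given policy and return its validation accuracy.

```python
import itertools

def search_policies(transform_names, max_len, evaluate):
    """Score every combination of up to `max_len` transforms and return the best."""
    best_policy, best_score = (), evaluate(())  # baseline: no augmentation
    for k in range(1, max_len + 1):
        for policy in itertools.combinations(transform_names, k):
            score = evaluate(policy)
            if score > best_score:
                best_policy, best_score = policy, score
    return best_policy, best_score

# Invented per-transform effects for demonstration only; NOT measured results.
STUB_EFFECT = {"flip": 0.02, "rotate": 0.03, "noise": -0.01}

def stub_evaluate(policy):
    """Placeholder for: train with `policy`, then return validation accuracy."""
    return 0.80 + sum(STUB_EFFECT[name] for name in policy)

best_policy, best_score = search_policies(["flip", "rotate", "noise"], 3, stub_evaluate)
print(best_policy)
```

Because every candidate requires a full training run, the cost grows quickly with the number of transformations, which is why this approach only suits cases where repeated training is affordable.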

Example of image data augmentation

Here, we use ImageDataGenerator to augment an image.

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.utils import load_img, img_to_array
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Load the image and convert it to a NumPy array of shape (height, width, channels).
img = load_img('Detective.png')
data = img_to_array(img)
# Add a leading batch dimension: (1, height, width, channels).
samples = np.expand_dims(data, 0)

# Define the random transformations to apply.
data_gen = ImageDataGenerator(
    rotation_range=20,
    horizontal_flip=True,
    vertical_flip=True
)

# Create an iterator that yields one augmented image per batch.
it = data_gen.flow(samples, batch_size=1)

# Plot four augmented versions of the image in a 2 x 2 grid.
fig, ax = plt.subplots(2, 2)
for i in range(2):
    for j in range(2):
        batch = next(it)  # fetch the next augmented batch
        image = batch[0].astype('uint8')
        ax[i, j].imshow(image)
        ax[i, j].axis('off')
plt.show()

Code explanation

Loading the image: We load an image and convert it to a NumPy array. Since ImageDataGenerator works on batches of images, it expects an array with a leading batch dimension. Hence, we use the np.expand_dims() function to add one.
Defining data_gen: We define what our ImageDataGenerator is supposed to do. We want it to randomly rotate images by up to 20 degrees and randomly flip them vertically and horizontally.
Calling data_gen.flow(): We convert the data generator to an iterator object, which gives us one augmented image each time we request the next batch from it.
Plotting: We plot four augmented images in a $2 \times 2$ figure.
