Mixup and Cutmix

Learn to train neural networks with Mixup and Cutmix augmentations.

The PyTorch Image Models (timm) framework provides built-in support for Mixup and Cutmix augmentations. We can use these techniques to enhance the performance of our model.

Mixup

Mixup is a domain-agnostic augmentation technique. It randomly generates weighted combinations of image pairs from the training data: it takes two images and their corresponding ground-truth labels and blends them into a new training example.

The Mixup formulation looks like this:

\tilde{x} = \lambda x_i + (1 - \lambda) x_j

\tilde{y} = \lambda y_i + (1 - \lambda) y_j

  • x_i: This is the raw input vector of the first randomly sampled image from the training data.
  • x_j: This is the raw input vector of the second randomly sampled image from the training data.
  • y_i: This is the one-hot label encoding of the first sample.
  • y_j: This is the one-hot label encoding of the second sample.
  • λ: This is the mixing coefficient, a random value drawn from a Beta distribution. The value is between 0 and 1.
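As a rough illustration of the formula above (not timm's internal implementation), the combination can be written directly in PyTorch. The mixup_pair function and the alpha value below are hypothetical names chosen for this sketch:

```python
import torch

def mixup_pair(x_i, x_j, y_i, y_j, alpha=0.2):
    # Sample the mixing coefficient lambda from a Beta(alpha, alpha) distribution.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    # Weighted combination of the two images and their one-hot labels.
    x_tilde = lam * x_i + (1 - lam) * x_j
    y_tilde = lam * y_i + (1 - lam) * y_j
    return x_tilde, y_tilde
```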

The timm.data.Mixup class provides all the functionality for Mixup and Cutmix augmentations.
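A minimal sketch of constructing this class is shown below; the argument values are illustrative rather than recommendations:

```python
from timm.data import Mixup

mixup_fn = Mixup(
    mixup_alpha=0.8,      # Beta distribution parameter for Mixup
    cutmix_alpha=1.0,     # Beta distribution parameter for Cutmix
    prob=1.0,             # probability of applying Mixup or Cutmix to a batch
    switch_prob=0.5,      # probability of switching to Cutmix when both are enabled
    label_smoothing=0.1,  # smoothing applied to the mixed soft labels
    num_classes=1000,     # number of classes for one-hot encoding
)
```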

Next, create a dataset with timm's ImageDataset class. Then, call the create_loader function to load our datasets.

Let’s look at the following code snippet as a reference:
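The sketch below shows one way ImageDataset, create_loader, and the mixup_fn defined above might be wired together; the data path, input size, batch size, and worker count are placeholders, not prescribed values:

```python
from timm.data import ImageDataset, create_loader

# Build the dataset; 'path/to/train' is a placeholder for your own data directory.
dataset = ImageDataset('path/to/train')

# Build the training loader; create_loader attaches the training transforms.
loader = create_loader(
    dataset,
    input_size=(3, 224, 224),
    batch_size=32,
    is_training=True,
    use_prefetcher=False,  # keep batches on the CPU so mixup_fn can be applied explicitly
    num_workers=4,
)

for inputs, targets in loader:
    # Apply Mixup/Cutmix to the batch; targets become soft (mixed) labels,
    # so a soft-target loss such as timm.loss.SoftTargetCrossEntropy is typically used.
    inputs, targets = mixup_fn(inputs, targets)
    # ... forward pass, loss computation, and backward pass go here
```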
