Dataset Format

Explore the three main Dataset classes and their use cases.

Introduction to the dataset

The PyTorch Image Models (timm) framework provides the following Dataset classes:

  • ImageDataset
  • IterableImageDataset
  • AugMixDataset

Our training data needs to be in the following structure:

<base_folder>
├── train
│   ├── class1
│   ├── class2
│   ├── class3
│   ├── ...
│   └── classN
└── val
    ├── class1
    ├── class2
    ├── class3
    ├── ...
    └── classN

Each subfolder is named after a class and contains the images that belong to that class.
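To make the layout concrete, here is a minimal sketch of how a folder-based reader could turn one split (for example, <base_folder>/train) into a list of labeled samples. It uses only the Python standard library and is not the framework's own parser. The function name scan_split, the IMG_EXTENSIONS set, and the demo file names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Assumption: the file extensions we choose to treat as images.
IMG_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def scan_split(split_dir):
    """Scan one split folder and return (samples, class_to_idx).

    samples is a list of (image_path, class_index) pairs, and
    class_to_idx maps each subfolder name to an integer label.
    """
    split_dir = Path(split_dir)
    # Every immediate subfolder name is taken as a class label.
    classes = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}

    samples = []
    for name in classes:
        # Collect image files under the class folder, skipping other files.
        for path in sorted((split_dir / name).rglob("*")):
            if path.is_file() and path.suffix.lower() in IMG_EXTENSIONS:
                samples.append((path, class_to_idx[name]))
    return samples, class_to_idx


# Demo: build a tiny throwaway tree in the expected structure and scan it.
with tempfile.TemporaryDirectory() as base_folder:
    for class_name in ("class1", "class2", "class3"):
        class_dir = Path(base_folder) / "train" / class_name
        class_dir.mkdir(parents=True)
        (class_dir / "example.jpg").write_bytes(b"")  # empty placeholder file

    samples, class_to_idx = scan_split(Path(base_folder) / "train")
    print(class_to_idx)
    print(len(samples), "samples found")
```

Sorting the class names before assigning indices keeps the label mapping stable across runs, which matters when the train and val folders are scanned separately.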

The ImageDataset class

We can use the ImageDataset class to create the training, validation, and test datasets for our image classification model.

It accepts the following arguments:

class ImageDataset(root, parser, class_map, load_bytes, transform) -> Tuple[Any, Any]:
  • root (str): This is the path to the root folder of our dataset.
  • parser (Union[ParserImageInTar, ParserImageFolder, str]): This is the parser used to read our dataset. It can read images from either a folder or a tar file.
  • class_map (Dict[str, str]): This is a dictionary containing the class mapping.
  • load_bytes (bool
...