Dataset Format
Explore the three main Dataset classes and their use cases.
Introduction to the dataset
The PyTorch Image Models (timm) framework comes with the following Dataset classes:
ImageDataset
IterableImageDataset
AugMixDataset
Our training data needs to be in the following structure:
<base_folder>
├── train
│ ├── class1
│ ├── class2
│ ├── class3
│ ├── ...
│ └── classN
└── val
  ├── class1
  ├── class2
  ├── class3
  ├── ...
  └── classN
Each subfolder represents the corresponding class and contains relevant images.
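Before training, it can help to confirm that a folder actually follows this layout. The following is a minimal sketch using only the Python standard library; the helper names list_classes and check_layout are hypothetical and not part of the framework, and the temporary folder stands in for a real <base_folder>.

```python
import tempfile
from pathlib import Path


def list_classes(base_folder, split):
    """Return the sorted class folder names found under base_folder/split."""
    split_dir = Path(base_folder) / split
    return sorted(p.name for p in split_dir.iterdir() if p.is_dir())


def check_layout(base_folder, splits=("train", "val")):
    """Check that every split exposes the same set of class folders."""
    classes_per_split = {s: list_classes(base_folder, s) for s in splits}
    reference = classes_per_split[splits[0]]
    for split, classes in classes_per_split.items():
        if classes != reference:
            raise ValueError(f"{split!r} has classes {classes}, expected {reference}")
    return reference


# Build a throwaway layout so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as base:
    for split in ("train", "val"):
        for name in ("class1", "class2", "class3"):
            (Path(base) / split / name).mkdir(parents=True)
    found_classes = check_layout(base)

print(found_classes)
```

Requiring identical class folders in train and val is a design choice of this sketch; it keeps the label indices consistent across both splits.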
The ImageDataset class
We can use the ImageDataset class to create the training, validation, and test datasets for our image classification model.
It accepts the following arguments:
class ImageDataset(root, parser, class_map, load_bytes, transform) -> Tuple[Any, Any]:
root (str): This is the path of our datasets.
parser (Union[ParserImageInTar, ParserImageFolder, str]): This is the parser for our datasets. It accepts either images in a folder or a tar file.
class_map (Dict[str, str]): This is a dictionary containing the class mapping.
load_bytes (bool): This specifies whether to load as bytes.
transform (List): This is a list of image transformations to apply when loading our datasets.
Parser
The ImageDataset contains a built-in parser object that’s built upon the create_parser factory method. The parser object will find all images in the train and val folders. The folder structure should be as follows:
train/class1/12345.png
train/class1/12346.png
train/class1/12347.png
...
train/class2/44122.png
train/class2/44123.png
train/class2/44124.png
The subfolder names serve as the labels for the images they contain. For example, if we train a 5-class image classification model, we should have the following folder structure:
train/apple/...
train/banana/...
train/grape/...
train/orange/...
train/pear/...
The parser object provides the class_to_idx attribute, which maps the classes to integers. For example:
{'apple': 0, 'banana': 1, 'grape': 2, 'orange': 3, 'pear': 4}
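To make the mapping concrete, here is a minimal sketch of how such a dictionary could be derived from the subfolder names. It assumes plain alphabetical ordering; build_class_to_idx is a hypothetical helper, and the library's own ordering rules may differ, for instance for names that contain digits.

```python
def build_class_to_idx(class_names):
    """Map each unique class name to an integer, in alphabetical order."""
    return {name: idx for idx, name in enumerate(sorted(set(class_names)))}


# Folder names in arbitrary order, as a directory listing might return them.
folders = ["pear", "apple", "orange", "banana", "grape"]
class_to_idx = build_class_to_idx(folders)
print(class_to_idx)
# {'apple': 0, 'banana': 1, 'grape': 2, 'orange': 3, 'pear': 4}
```

Sorting before enumerating makes the mapping deterministic, so the same folders always produce the same label indices.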
There’s also an attribute called samples, which holds a list of (image path, class index) tuples:
[('train/apple/12345.png', 0), ('train/banana/22241.png', 1), ..., ('train/pear/44321.png', 4), ('train/pear/4479.png', 4), ...]
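A list like this can be produced by walking the class subfolders and pairing each image path with its class index. The sketch below is an illustration only, not the library's implementation: find_samples and IMG_EXTENSIONS are hypothetical names, and the set of accepted file suffixes is an assumption.

```python
import tempfile
from pathlib import Path

IMG_EXTENSIONS = (".png", ".jpg", ".jpeg")  # assumed set of image suffixes


def find_samples(root):
    """Return (samples, class_to_idx) for images stored as root/<class>/<file>."""
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir(), key=lambda p: p.name):
            if path.suffix.lower() in IMG_EXTENSIONS:
                # Store the path relative to the base folder, e.g. train/apple/1.png
                rel = path.relative_to(root.parent).as_posix()
                samples.append((rel, class_to_idx[name]))
    return samples, class_to_idx


# Create a few empty placeholder files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as base:
    files = ["apple/12345.png", "banana/22241.png", "pear/44321.png", "pear/notes.txt"]
    for rel in files:
        target = Path(base) / "train" / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    samples, class_to_idx = find_samples(Path(base) / "train")

print(samples)
# [('train/apple/12345.png', 0), ('train/banana/22241.png', 1), ('train/pear/44321.png', 2)]
```

Note that pear maps to 2 here because this toy folder holds only three classes; the non-image file notes.txt is skipped.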
The parser object is subscriptable, allowing us to get items by index.
# syntax
parser[index]
# example
parser[0]
# returns ('train/apple/12345.png', 0)
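In Python, an object becomes subscriptable by defining __getitem__. The following toy class is a sketch of that idea, assuming the samples list shown earlier; ToyParser is a hypothetical stand-in, and the library's real parser may return something other than a plain path (for example, an opened file) depending on the version.

```python
class ToyParser:
    """A tiny stand-in that supports indexing and len(), like a sequence."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __getitem__(self, index):
        # Delegating to the list gives integer indexing and slicing for free.
        return self.samples[index]

    def __len__(self):
        return len(self.samples)


parser = ToyParser([
    ("train/apple/12345.png", 0),
    ("train/banana/22241.png", 1),
    ("train/pear/44321.png", 4),
])
first = parser[0]
print(first)        # ('train/apple/12345.png', 0)
print(len(parser))  # 3
```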
Example
Let’s look at another example using the Multi-class Weather Dataset for Image Classification. This dataset is licensed under Creative Commons Attribution 4.0 International. We can download the datasets by doing the following: