DataLoader
Learn about the DataLoader class and the changes it brings to our implementation.
Introduction to DataLoader
Until now, we have used the whole training set at every training step; it has been batch gradient descent all along. This is fine for our small dataset, but if we want our training to be more efficient and less computationally expensive, we must use mini-batch gradient descent. Thus, we need mini-batches, and we need to slice our dataset accordingly. Do you want to do it manually? Me neither!
So, we use PyTorch’s DataLoader class for this job. We have to tell it which dataset to use (in this case, the dataset from the previous lesson), the desired mini-batch size, and whether we would like to shuffle it or not. That’s it!
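A minimal sketch of that call, assuming hypothetical x_train_tensor and y_train_tensor tensors as stand-ins for the dataset built in the previous lesson, and an arbitrary mini-batch size of 16 chosen just for illustration:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical stand-ins for the training tensors from the previous lesson
x_train_tensor = torch.randn(100, 1)
y_train_tensor = torch.randn(100, 1)
train_data = TensorDataset(x_train_tensor, y_train_tensor)

# The loader needs the dataset, the mini-batch size, and whether to shuffle
train_loader = DataLoader(dataset=train_data, batch_size=16, shuffle=True)
```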

IMPORTANT: In the vast majority of cases, you should set shuffle=True for your training set to improve the performance of gradient descent. There are a few exceptions, though: in time series problems, for example, shuffling actually leads to data leakage. So, always ask yourself: “Do I have a reason not to shuffle the data?”
“What about the validation and test sets?” There is no need to shuffle them since we are not computing gradients with them.
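In code, that boils down to the shuffle argument alone; a minimal sketch, assuming a hypothetical val_data dataset built the same way as train_data above:

```python
# Hypothetical validation tensors, used only for illustration
x_val_tensor = torch.randn(20, 1)
y_val_tensor = torch.randn(20, 1)
val_data = TensorDataset(x_val_tensor, y_val_tensor)

# Shuffle the training set, but keep the validation (and test) order as-is
train_loader = DataLoader(dataset=train_data, batch_size=16, shuffle=True)
val_loader = DataLoader(dataset=val_data, batch_size=16, shuffle=False)
```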
There is more to a DataLoader than meets the eye. For instance, it is also possible to use it together with a sampler to ...
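As one possible illustration (the specific sampler here is an assumption, not necessarily the one this lesson has in mind), a SubsetRandomSampler can be handed to the loader so that mini-batches are drawn from a chosen subset of indices; when a sampler is given, shuffle stays at its default of False:

```python
from torch.utils.data import SubsetRandomSampler

# Hypothetical split: draw mini-batches only from the first 80 indices of train_data
sampler = SubsetRandomSampler(list(range(80)))

# shuffle is left at its default (False) because the sampler already randomizes the order
sampled_loader = DataLoader(dataset=train_data, batch_size=16, sampler=sampler)
```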
Our loader will behave like an ...
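A quick sketch of what that looks like in practice, using the train_loader defined above: each pass through the loader yields one mini-batch of features and labels.

```python
# Fetch a single mini-batch...
x_batch, y_batch = next(iter(train_loader))
print(x_batch.shape, y_batch.shape)  # e.g., torch.Size([16, 1]) torch.Size([16, 1])

# ...or loop over the whole training set, one mini-batch at a time
for x_batch, y_batch in train_loader:
    pass  # one training step per mini-batch would go here
```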