Rethinking the Training Loop

Learn how to reduce boilerplate in the training loop by using higher-order functions.

Training step

As already mentioned, the higher-order function that builds a training step function for us takes the key elements of our training loop: the model, the loss function, and the optimizer. The training step function it returns will take two arguments, namely features and labels, and will return the corresponding loss value.
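
If the pattern is new to you, a higher-order function is simply a function that takes other functions as arguments, returns a function, or both. Here is a minimal, generic sketch of the idea; the names make_adder and add are purely illustrative and not part of our pipeline:

def make_adder(n):
    # Builds a function that adds "n" to its argument;
    # "n" is captured in the closure of the inner function
    def add(x):
        return x + n
    return add

add_five = make_adder(5)
print(add_five(10))  # 15

Our make_train_step() follows exactly this shape: it captures the model, loss function, and optimizer in a closure, so the returned function only needs the data.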

Creating the higher-order function for training step

Apart from returning the loss value, the inner perform_train_step() function below is the same as the code inside the loop in model training V0. The code should look like this:

def make_train_step(model, loss_fn, optimizer):
    # Builds function that performs a step in the train loop
    def perform_train_step(x, y):
        # Sets model to TRAIN mode
        model.train()
        # Step 1 - computes model's predictions - forward pass
        yhat = model(x)
        # Step 2 - computes the loss
        loss = loss_fn(yhat, y)
        # Step 3 - computes gradients for "b" and "w" parameters
        loss.backward()
        # Step 4 - updates parameters using gradients and
        # the learning rate
        optimizer.step()
        optimizer.zero_grad()
        # Returns the loss
        return loss.item()
    # Returns the function that will be called inside the
    # train loop
    return perform_train_step
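
To get a feel for the closure, here is a quick, self-contained smoke test; the toy model and tensors below are hypothetical stand-ins, not the chapter's actual data:

import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical toy setup, just to exercise the closure
model = nn.Sequential(nn.Linear(1, 1))
loss_fn = nn.MSELoss(reduction='mean')
optimizer = optim.SGD(model.parameters(), lr=0.1)
train_step = make_train_step(model, loss_fn, optimizer)

x = torch.rand(10, 1)
y = 2 * x + 1
print(train_step(x, y))  # loss value for one training step

Notice that perform_train_step() never receives the model, loss function, or optimizer directly; it sees them through the enclosing scope of make_train_step().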

Updating model configuration code

Then, we need to update our model configuration code so that it calls this higher-order function to build a train_step function. But first, we need to run the data preparation script.

%run -i data_preparation/v0.py
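
In case you do not have that script at hand, a minimal sketch of what it might contain follows; the synthetic linear data (true_b, true_w) and the tensor names x_train_tensor and y_train_tensor are assumptions carried over from the earlier versions of this pipeline:

import numpy as np
import torch

# Assumed synthetic data: y = 1 + 2x + Gaussian noise
true_b, true_w = 1, 2
np.random.seed(42)
x = np.random.rand(100, 1)
y = true_b + true_w * x + .1 * np.random.randn(100, 1)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Turns Numpy arrays into PyTorch tensors on the chosen device
x_train_tensor = torch.as_tensor(x).float().to(device)
y_train_tensor = torch.as_tensor(y).float().to(device)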

The code for the configured model would look like the following:

%%writefile model_configuration/v1.py
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Sets learning rate - this is "eta" ~ the "n"-like Greek letter
lr = 0.1
torch.manual_seed(42)
# Now we can create a model and send it at once to the device
model = nn.Sequential(nn.Linear(1, 1)).to(device)
# Defines a SGD optimizer to update the parameters
optimizer = optim.SGD(model.parameters(), lr=lr)
# Defines a MSE loss function
loss_fn = nn.MSELoss(reduction='mean')
# Creates the train_step function for our model, loss function
# and optimizer
train_step = make_train_step(model, loss_fn, optimizer)
...
...
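
With train_step in place, the training loop itself collapses to just a few lines. Here is a sketch of what the updated loop would look like, assuming the tensors x_train_tensor and y_train_tensor come from the data preparation script:

n_epochs = 1000
losses = []

for epoch in range(n_epochs):
    # Performs one train step and records the corresponding loss
    loss = train_step(x_train_tensor, y_train_tensor)
    losses.append(loss)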