Optimization

Learn about loss functions and optimizing neural network weights.

Chapter Goals:

  • Know the relationship between training, weights, and loss
  • Understand the intuitive definition of loss
  • Obtain the model's loss from logits
  • Write a training operation to minimize the loss

A. What is training?

In Chapter 3, we discussed the weights associated with connections between neurons. These weights determine what a neural network outputs based on the input data. These weights are what we call trainable variables, meaning that we need to train our neural network to find the optimal value for each connection's weight.

For any neural network, training involves setting up a loss function. The loss function quantifies how far the neural network's outputs are from the actual labels.

Since a larger loss means a worse model, we want to train the model to output values that minimize the loss function. The model does this by learning the optimal weight settings. Remember, the weights are just real numbers, so the model is essentially just figuring out the best numbers to set the weights to.
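The idea of "figuring out the best numbers" can be made concrete with a toy sketch of gradient descent. This is an illustrative example with made-up data, not the chapter's code: a single weight w is repeatedly nudged in the direction that decreases a mean squared error loss.

```python
import numpy as np

# Toy regression: learn a single weight w so that w * x approximates y.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])  # true relationship is y = 2x

w = 0.0    # initial weight setting
lr = 0.05  # learning rate (step size)

for _ in range(200):
    pred = w * x
    # Gradient of the mean squared error loss with respect to w
    grad = np.mean(2 * (pred - y) * x)
    # Step against the gradient to reduce the loss
    w -= lr * grad

# w converges toward the optimal value of 2.0
```

Each update moves w slightly toward the value that minimizes the loss; training a full neural network applies the same principle to every weight at once.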

B. Loss as error

In regression problems, common loss functions are the L1 norm:

\sum_i |actual_i - predicted_i| ...
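As a quick illustration of the L1 norm, here is how the sum of absolute errors can be computed with NumPy. The labels and predictions are hypothetical numbers chosen for the example:

```python
import numpy as np

# Hypothetical labels and model predictions for a regression task
actual = np.array([3.0, -0.5, 2.0, 7.0])
predicted = np.array([2.5, 0.0, 2.0, 8.0])

# L1 loss: sum_i |actual_i - predicted_i|
l1_loss = np.sum(np.abs(actual - predicted))
print(l1_loss)  # 2.0
```

A perfect model would have every prediction equal to its label, giving a loss of 0; larger absolute errors increase the loss linearly.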