Tune Learning Rate and Batch Size
Learn what happens when we tweak the learning rate and batch size while training a neural network.
Tune the learning rate
We’ll use our old hyperparameter called lr. This hyperparameter has been with us since almost the beginning of this course. Chances are, we’ve already tuned it, maybe by trying a few random values. It’s time to be more precise about tuning lr.
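For instance, a minimal sketch of that trial-and-error approach, assuming a hypothetical train() helper that runs a full training cycle and returns a validation loss (mocked here with a made-up formula so the snippet runs on its own):

```python
import math
import random

def train(lr):
    # Hypothetical stand-in for a real training run: returns a mock
    # validation loss that happens to be lowest near lr = 0.01.
    return (math.log10(lr) + 2) ** 2 + 0.1

# Try a few random learning rates, sampled on a log scale.
random.seed(42)
candidates = [10 ** random.uniform(-5, -1) for _ in range(5)]
losses = {lr: train(lr) for lr in candidates}

best_lr = min(losses, key=losses.get)
print(f"Best lr found: {best_lr:.5f} (loss {losses[best_lr]:.3f})")
```

Sampling on a log scale matters because learning rates that differ by a constant factor, not a constant amount, tend to behave similarly.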
To understand the trade-off of different learning rates, let’s go back to the basics and visualize gradient descent. The following diagrams show a few steps of GD along a one-dimensional loss curve, with three different values of lr. The red cross marks the starting point, and the green cross marks the minimum:

[Diagram: a few steps of gradient descent along a one-dimensional loss curve, shown for three values of lr]
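A minimal sketch that reproduces this kind of picture, assuming a toy quadratic loss in place of a real network’s loss curve (the three lr values are illustrative, not from the course):

```python
import numpy as np
import matplotlib.pyplot as plt

def loss(w):
    return w ** 2          # toy one-dimensional loss curve

def gradient(w):
    return 2 * w           # its derivative

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, lr in zip(axes, [0.05, 0.4, 1.05]):
    w = 3.0                            # starting point (red cross)
    path = [w]
    for _ in range(10):                # a few steps of GD
        w -= lr * gradient(w)          # the gradient descent update
        path.append(w)
    xs = np.linspace(-4, 4, 200)
    ax.plot(xs, loss(xs))
    ax.plot(path, [loss(w) for w in path], "x-", color="red")
    ax.plot(0, 0, "x", color="green", markersize=10)  # the minimum
    ax.set_title(f"lr = {lr}")
plt.show()
```

With this toy loss, the smallest lr crawls toward the minimum, the middle one converges in a few steps, and the largest one overshoots the minimum on every step and diverges.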