Backpropagation through Time
Learn about backpropagation and how it works through time.
For training RNNs, a special form of backpropagation known as backpropagation through time (BPTT) is used. To understand BPTT, we first need to understand how standard backpropagation (BP) works. Then, we'll discuss why BP can't be applied directly to RNNs and how it can be adapted for them, resulting in BPTT. Finally, we'll discuss two major problems present in BPTT.
How backpropagation works
Backpropagation is the technique used to train feed-forward neural networks. In backpropagation, we do the following:
1. Calculate a prediction for a given input.
2. Calculate an error, $E$, of the prediction by comparing it to the actual label of the input (for example, using mean squared error or cross-entropy loss).
3. Update the weights of the feed-forward network to minimize the loss calculated in step 2 by taking a small step in the opposite direction of the gradient:

$$w_i \leftarrow w_i - \alpha \frac{\partial E}{\partial w_i}$$

for all $i$, where $w_i$ is the weight of the $i^{\text{th}}$ layer and $\alpha$ is the learning rate.
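The three steps above can be sketched for a tiny one-hidden-layer network. The shapes, tanh activation, and learning rate below are illustrative assumptions, not details from the lesson:

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.array([0.5, -0.2])      # a single input
y_true = np.array([1.0])       # its actual label

W1 = rng.normal(size=(2, 3))   # input -> hidden weights
W2 = rng.normal(size=(3, 1))   # hidden -> output weights
lr = 0.01                      # learning rate (step size)

def forward(x, W1, W2):
    h = np.tanh(x @ W1)        # hidden activation
    y_hat = h @ W2             # step 1: prediction
    return h, y_hat

h, y_hat = forward(x, W1, W2)
error = 0.5 * np.sum((y_hat - y_true) ** 2)   # step 2: squared error

# Step 3: gradients via the chain rule, then a small step
# in the opposite direction of each gradient.
d_yhat = y_hat - y_true                  # dE/dy_hat
dW2 = np.outer(h, d_yhat)                # dE/dW2
dh = d_yhat @ W2.T                       # dE/dh
dW1 = np.outer(x, dh * (1 - h ** 2))     # dE/dW1 (tanh derivative)

W1 -= lr * dW1
W2 -= lr * dW2

_, y_hat_new = forward(x, W1, W2)
error_new = 0.5 * np.sum((y_hat_new - y_true) ** 2)
```

After the update, `error_new` is smaller than `error`, which is exactly what a single gradient-descent step is meant to achieve.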
To understand the computations above more clearly, consider the feed-forward network depicted in the figure below, which has two layers with a single weight each, $w_1$ and $w_2$.
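Since the figure is not reproduced here, assume the simplest such chain: a linear unit $h = w_1 x$ followed by a linear output $\hat{y} = w_2 h$, with squared error. The chain rule then gives each weight's gradient directly (all values below are illustrative):

```python
# Two-weight chain: x --w1--> h --w2--> y_hat, squared-error loss.
x, y_true = 2.0, 1.0
w1, w2 = 0.5, -0.3

h = w1 * x                       # hidden value
y_hat = w2 * h                   # prediction
E = 0.5 * (y_hat - y_true) ** 2  # error

# Gradients flow backward from the error to each weight.
dE_dyhat = y_hat - y_true
dE_dw2 = dE_dyhat * h            # dE/dw2 = dE/dy_hat * dy_hat/dw2
dE_dw1 = dE_dyhat * w2 * x       # dE/dw1 = dE/dy_hat * dy_hat/dh * dh/dw1
```

Note how the gradient for $w_1$ reuses the upstream factor $\partial E / \partial \hat{y}$ multiplied by $w_2$; this reuse of intermediate products is what makes backpropagation efficient.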