Putting the Encoder and Decoder Together
Learn how the transformer works with the encoder and decoder and how to train it.
We feed the decoder's representation of the target sentence to the linear and softmax layers to obtain the predicted word. For clarity, the complete transformer architecture, with both the encoder and the decoder, is shown in the following figure:
[Figure: The complete transformer architecture with the encoder and decoder]
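To make the final step concrete, here is a minimal PyTorch-style sketch of projecting the decoder representation through a linear layer and a softmax to get the predicted word at each position. The sizes (d_model, vocab_size, seq_len) and the random decoder output are hypothetical placeholders for illustration, not values from this lesson.

```python
import torch
import torch.nn as nn

# Hypothetical sizes used only for illustration.
d_model = 512       # dimensionality of the decoder representation
vocab_size = 30000  # size of the target vocabulary
seq_len = 10        # length of the target sentence

# Stand-in for the decoder's output: (batch, seq_len, d_model).
decoder_output = torch.randn(1, seq_len, d_model)

# The linear layer projects each position's representation onto the vocabulary.
linear = nn.Linear(d_model, vocab_size)
logits = linear(decoder_output)        # (1, seq_len, vocab_size)

# Softmax turns the logits into a probability distribution over words.
probs = torch.softmax(logits, dim=-1)

# The predicted word at each position is the highest-probability entry.
predicted_ids = probs.argmax(dim=-1)   # (1, seq_len)
print(predicted_ids)
```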
In the preceding figure, Nx denotes that we can stack N encoders and N decoders on top of one another.
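The Nx stacking can likewise be sketched with PyTorch's built-in transformer layers. The hyperparameters below (d_model = 512, nhead = 8, N = 6) match the original transformer paper but are otherwise just illustrative choices.

```python
import torch
import torch.nn as nn

# Hypothetical hyperparameters; N = 6, d_model = 512, nhead = 8 follow the
# original transformer paper, but any values would do for this sketch.
d_model, nhead, N = 512, 8, 6

# A single encoder block and a single decoder block...
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead, batch_first=True)

# ...cloned and stacked N times, which is what Nx in the figure indicates.
encoder = nn.TransformerEncoder(encoder_layer, num_layers=N)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=N)

# Toy source and target embeddings: (batch, seq_len, d_model).
src = torch.randn(1, 12, d_model)
tgt = torch.randn(1, 10, d_model)

memory = encoder(src)          # encoder representation of the source sentence
output = decoder(tgt, memory)  # decoder representation of the target sentence
print(output.shape)            # torch.Size([1, 10, 512])
```

The decoder representation `output` is what the linear and softmax layers shown earlier would then map to predicted words.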