...


Understanding the Decoder of the Transformer

Learn about the structure and components of the decoder of the transformer.

Suppose we want to translate the English sentence (source sentence) 'I am good' to the French sentence (target sentence) 'Je vais bien'. To perform this translation, we feed the source sentence 'I am good' to the encoder. The encoder learns the representation of the source sentence; we've already seen exactly how it does so. Now, we take this encoder's representation and feed it to the decoder. The decoder takes the encoder representation as input and generates the target sentence 'Je vais bien', as shown in the following figure:

Figure: Encoder and decoder of the transformer
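To make this encoder-to-decoder handoff concrete, here is a minimal sketch using PyTorch's built-in transformer modules. The hyperparameters, tensor shapes, and random embeddings below are illustrative assumptions, not values from the lesson:

```python
import torch
import torch.nn as nn

d_model, nhead = 512, 8            # illustrative hyperparameters (assumed)
src_len, tgt_len, batch = 3, 3, 1  # 'I am good' -> 'Je vais bien'

# Encoder: learns the representation of the source sentence
encoder_layer = nn.TransformerEncoderLayer(d_model, nhead)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# Decoder: takes the encoder's output and generates the target sentence
decoder_layer = nn.TransformerDecoderLayer(d_model, nhead)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)

src = torch.rand(src_len, batch, d_model)  # stand-in for embedded source tokens
tgt = torch.rand(tgt_len, batch, d_model)  # stand-in for embedded target tokens

memory = encoder(src)       # encoder representation of 'I am good'
out = decoder(tgt, memory)  # decoder attends to the encoder's output
print(out.shape)            # torch.Size([3, 1, 512])
```

The key point is the second argument to the decoder: it receives the encoder's output (`memory`) in addition to the target-side input, exactly as the figure above shows.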

We learned earlier that instead of having one encoder, we can have a stack of N encoders. Similar to the encoder, we can also have a stack of N decoders. For simplicity, let's set N = 2. As shown in the following figure, the output of one decoder is sent as the input to the decoder above it. We can also observe that the encoder's representation of the input sentence (encoder's output) is sent to all the decoders. Thus, a decoder receives two inputs: one is from the previous decoder, and the other is the encoder's representation (encoder's output), as the sketch after the figure illustrates:

Figure: A stack of encoders and decoders
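The stacking can also be unrolled explicitly: each decoder layer takes the previous layer's output together with the same encoder representation. The following is a minimal sketch with N = 2, as in the figure; the shapes and random tensors are assumptions for demonstration only:

```python
import torch
import torch.nn as nn

d_model, nhead, N = 512, 8, 2  # N = 2 stacked decoders, as in the figure

# N decoder layers stacked on top of each other
decoder_stack = nn.ModuleList(
    [nn.TransformerDecoderLayer(d_model, nhead) for _ in range(N)]
)

memory = torch.rand(3, 1, d_model)  # encoder's output (shared by every decoder)
x = torch.rand(3, 1, d_model)       # embedded target tokens fed to decoder 1

# The output of one decoder becomes the input to the decoder above it,
# while every decoder also receives the encoder's representation.
for layer in decoder_stack:
    x = layer(x, memory)

print(x.shape)  # torch.Size([3, 1, 512])
```

Notice that `memory` is passed unchanged to every layer in the loop, while `x` is overwritten at each step: this mirrors the two inputs each decoder receives in the figure.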

How the decoder generates the target sentence

Okay, but how exactly does the decoder generate the target sentence? Let's explore that in more detail. At time step ...