Input Embedding

Learn about input embeddings and how they are used in transformers in this lesson.

This global view of the encoder shows the highly optimized architecture of the transformer. Now, let's zoom into the sublayers and mechanisms—we'll begin with the embedding sublayer.

The input embedding sublayer converts the input tokens to vectors of dimension $d_{model} = 512$ using learned embeddings, as in the original Transformer model. The structure of the input embedding is classical:
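The learned embedding is simply a lookup table with one $d_{model}$-dimensional row per vocabulary entry; each token ID selects its row. A minimal NumPy sketch (the vocabulary size and random initialization here are illustrative assumptions; in a real transformer the table is a trained parameter):

```python
import numpy as np

# Hypothetical embedding table of shape (vocab_size, d_model).
# In a real model these values are learned during training.
vocab_size, d_model = 10_000, 512
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, d_model))

# A toy input sequence of token IDs; each ID indexes one row of the table.
token_ids = np.array([42, 7, 42])
vectors = embedding_table[token_ids]

print(vectors.shape)  # (3, 512): one 512-dimensional vector per token
```

Note that the same token ID always maps to the same vector (here, positions 0 and 2 both hold token 42), which is why positional information must be added separately.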
