Transformer Architecture: Embedding Layers
Learn about the embedding layers in the transformer.
Word embeddings provide a semantics-preserving representation of words based on the contexts in which those words are used. In other words, if two words appear in similar contexts, they will have similar word vectors. For example, the words “cat” and “dog” will have similar representations, whereas “cat” and “volcano” will have vastly different representations.
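As a rough illustration of what an embedding layer does, the minimal PyTorch sketch below maps token IDs to dense vectors and compares two of them with cosine similarity. The toy vocabulary, embedding size, and token IDs are illustrative assumptions, not the lesson's code, and a freshly initialized embedding layer is random; only training on context makes the vectors for “cat” and “dog” end up close together.

```python
import torch
import torch.nn as nn

# Toy vocabulary and embedding layer (sizes chosen arbitrarily for illustration).
vocab = {"cat": 0, "dog": 1, "volcano": 2}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

# Look up the vectors for two words and measure how similar they are.
cat_vec = embedding(torch.tensor(vocab["cat"]))
dog_vec = embedding(torch.tensor(vocab["dog"]))
similarity = torch.cosine_similarity(cat_vec, dog_vec, dim=0)

# Before training, this value is arbitrary; after training on real text,
# words used in similar contexts would score higher than unrelated ones.
print(similarity.item())
```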
Word vectors were initially introduced in the paper titled