...


Applying Positional Encoding and Embedding with Transformers

Learn how to apply positional encoding and embeddings to text using transformers.

Introduction to positional encoding

Understanding the order of words is essential for interpreting their meaning in a sentence. Because transformer models process all tokens of the input in parallel rather than sequentially, they need an explicit way to incorporate positional information. This gives transformers an advantage over traditional natural language processing techniques for embedding a text or corpus: right after the text is tokenized and embedded, a positional encoding is applied to the vectors, so each vector carries information about where its token appears in the sequence.

Figure: Applying positional encoding using transformers
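To make this flow concrete, here is a minimal sketch in PyTorch showing a learned token embedding combined with a learned position embedding before the result is fed to the transformer layers. The vocabulary size, model dimension, and token ids below are placeholder assumptions chosen for illustration, not values from this lesson.

```python
import torch
import torch.nn as nn

# Hypothetical sizes chosen for illustration only.
vocab_size, d_model, max_len = 10_000, 64, 128

token_embedding = nn.Embedding(vocab_size, d_model)       # maps token ids to vectors
position_embedding = nn.Embedding(max_len, d_model)        # one learned vector per position

# Pretend these are the token ids produced by a tokenizer for one sentence.
token_ids = torch.tensor([[12, 845, 99, 3, 671, 28, 7]])   # shape: (batch=1, seq_len=7)
positions = torch.arange(token_ids.size(1)).unsqueeze(0)   # [[0, 1, 2, 3, 4, 5, 6]]

# The input to the transformer layers is the sum of both embeddings,
# so each vector encodes *what* the token is and *where* it occurs.
x = token_embedding(token_ids) + position_embedding(positions)
print(x.shape)  # torch.Size([1, 7, 64])
```

Summing the two embeddings keeps the model dimension unchanged while letting every token vector carry both the token's identity and its position.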

Understanding positional encoding

Simply put, positional encoding records the order in which words occur in a sentence. Unlike embeddings, which map words to vectors that capture their semantic meaning, positional encodings capture where each word appears. This distinction is essential because the meaning of a sentence can change drastically based on word order, even if the same words are used.
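One widely used scheme is the fixed sinusoidal encoding from the original Transformer paper, where each position is mapped to a deterministic vector of sines and cosines at different frequencies. The NumPy sketch below is a minimal illustration of how that encoding matrix can be built; it is not tied to any particular library's API.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of sinusoidal positional encodings,
    following PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and
    PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))."""
    positions = np.arange(seq_len)[:, np.newaxis]            # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]           # (1, d_model/2)
    angle_rates = 1.0 / np.power(10000, dims / d_model)      # one frequency per dimension pair
    angles = positions * angle_rates                          # (seq_len, d_model/2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions use cosine
    return pe

# Each row is the positional vector added to the embedding of the token at that position.
pe = sinusoidal_positional_encoding(seq_len=8, d_model=16)
print(pe.shape)  # (8, 16)
```

Because the encoding is deterministic, it can be computed once and added to the token embeddings of any input up to the chosen sequence length.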

Let’s use an example to illustrate the importance of word position in a sentence. Sentence 1 is: “Machine learning algorithms find patterns in data.” Notice how the words are spread across the Cartesian plane below and what their corresponding x-axis and y-axis coordinates are.

Figure: Positional embedding of sentence 1

Now, let’s look at sentence 2, which is: “Data patterns find algorithms in machine learning.”

Figure: Positional embedding of sentence 2

Sentences 1 and 2 contain exactly the same words, and the same number of them, but in a different order. Notice how the x-axis and y-axis coordinates now differ, even though the words are the same. A word’s representation in high-dimensional space is affected by its position, so its context, and therefore its meaning, changes depending on word order in the sentence.
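You can observe this effect directly with a pretrained transformer: feed both sentences to the model and compare the contextual vector it produces for the same word in each one. The sketch below uses the Hugging Face transformers library with bert-base-uncased (an arbitrary model choice for illustration) and compares the vector for the word “data” in sentence 1 and sentence 2.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Any pretrained encoder works; bert-base-uncased is just an assumption for this sketch.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of the first occurrence of `word` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]       # (seq_len, hidden_size)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    idx = tokens.index(word)                                  # position of the word's token
    return hidden[idx]

v1 = word_vector("Machine learning algorithms find patterns in data.", "data")
v2 = word_vector("Data patterns find algorithms in machine learning.", "data")

# The vectors differ because the word sits at a different position and in a different context.
similarity = torch.cosine_similarity(v1, v2, dim=0)
print(f"Cosine similarity: {similarity.item():.3f}")
```

The similarity is typically high because it is still the same word, but it is not exactly 1.0: the word’s position and its neighbors shift the vector, which is exactly what the plots above illustrate.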

Moreover, the same word can have two different meanings depending on the context and ordering of the sentence. In general, a word’s context is determined by the words close to it, such as the subject or the verb that precedes or follows it.

Let’s look at another example that demonstrates how the same word can carry two different meanings. Consider the sentences “The AI classified the apple as a fruit with 98% confidence” and “Apple invested heavily in AI to improve their virtual assistant’s performance.” The word “apple” appears in both. In the first sentence, “apple” is a fruit, and thus the positional encoding algorithm pushes the word toward the fruit context. ...