Self-Attention Mechanism
Learn about the components of the self-attention mechanism in this lesson.
Let's explore a unique mechanism that's especially important when dealing with images—the self-attention mechanism.
Attention is a fundamental concept in deep learning, often described using query and value vectors. Now, we'll introduce another vector known as the key vector.
Understanding the self-attention mechanism
As we learn about the self-attention mechanism, a few pieces of terminology are fundamental to grasping this powerful concept:
Query (Q): This is what we're looking for or trying to match with.
Key (K): This is what we use to identify or locate the specific thing we're interested in.
Value (V): This is the actual content or information we obtain when we successfully match the query with the key.
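To make these roles concrete, here is a minimal NumPy sketch of how an input sequence might be projected into queries, keys, and values. The matrices W_q, W_k, and W_v, the sequence length, and the embedding size are illustrative assumptions, not values from this lesson:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a sequence of 4 tokens, each an 8-dimensional embedding.
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))  # input token embeddings

# In a trained model these projection matrices are learned; here they are
# random placeholders used only to show the shapes and the data flow.
W_q = rng.normal(size=(d_model, d_model))  # query projection
W_k = rng.normal(size=(d_model, d_model))  # key projection
W_v = rng.normal(size=(d_model, d_model))  # value projection

Q = X @ W_q  # queries: what each token is looking for
K = X @ W_k  # keys: what each token offers for matching
V = X @ W_v  # values: the content each token contributes

print(Q.shape, K.shape, V.shape)  # (4, 8) (4, 8) (4, 8)
```

Note that every token produces its own query, key, and value vector; the "self" in self-attention refers to the fact that the sequence is matched against itself.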
The query represents the vector we want to associate with all input values. In our earlier example, we used the decoder token as the query and the encoder tokens as the values. In our current example, we're focusing on how a specific word, like "it," connects to the other tokens in the sequence.
Ideally, the strongest connection should be with a word like "robot," because "it" refers to "robot." Queries and keys are two different ways of looking at the same thing: keys are vector representations of the input tokens, used to measure the similarity between the query and each key.
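The following toy sketch illustrates this idea: a query for "it" is compared against the keys of every token with a dot product, and a softmax turns the similarity scores into attention weights. The sentence, the random embeddings, and the softmax helper are all hypothetical; a trained model would learn vectors with this behavior rather than having it hard-coded:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Illustrative sentence in which "it" refers to "robot".
tokens = ["the", "robot", "moved", "because", "it", "saw", "us"]
rng = np.random.default_rng(1)
d = 8

# Hypothetical key and value vectors, one per token (random stand-ins).
keys = {t: rng.normal(size=d) for t in tokens}
values = {t: rng.normal(size=d) for t in tokens}

# For illustration, give "it" a query deliberately close to the key of
# "robot", mimicking an association a trained model would learn.
query_it = keys["robot"] + 0.1 * rng.normal(size=d)

# Dot-product similarity between the query and every key, scaled by
# sqrt(d) and normalized with a softmax (scaled dot-product attention).
scores = np.array([query_it @ keys[t] for t in tokens])
weights = softmax(scores / np.sqrt(d))

for t, w in zip(tokens, weights):
    print(f"{t:>7}: {w:.3f}")  # "robot" should receive the largest weight

# The output for "it" is the attention-weighted sum of the value vectors.
output = sum(w * values[t] for t, w in zip(tokens, weights))
print(output.shape)  # (8,)
```

Because the query for "it" lines up most closely with the key for "robot," the softmax concentrates the attention weight there, so "robot"'s value vector dominates the output for "it." In a real model, these associations emerge from training rather than being constructed by hand.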