Self-Attention Mechanism

Let's explore a unique mechanism that's especially important when dealing with images—the self-attention mechanism.

Attention is a fundamental concept in deep learning, often described using query and value vectors. Now, we'll introduce another vector known as the key vector.

Understanding the self-attention mechanism

As we learn about the self-attention mechanism, the following terminology is fundamental for grasping how this powerful concept works:

  • Query (Q): This is what we're looking for or trying to match with.

  • Key (K): This is what we use to identify or locate the specific thing we're interested in.

  • Value (V): This is the actual content or information we obtain when we successfully match the query with the key.
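
To see how these three roles fit together, here is a minimal NumPy sketch of scaled dot-product attention. The matrix sizes and random inputs are toy values chosen purely for illustration, not taken from any particular model.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Match queries against keys, then blend the values with the resulting weights."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of the values

# Toy example: 3 tokens, 4-dimensional vectors
Q = np.random.randn(3, 4)
K = np.random.randn(3, 4)
V = np.random.randn(3, 4)
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```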

The query represents the vector we want to associate with all input values. In our earlier example, we used the decoder token as the query and the encoder tokens as the values. In our current example, we're focusing on understanding the connections of a specific word, like "it," with other vectors.

Ideally, the most significant connection should be with a word like "robot," because "it" refers to "robot." Queries and values are two different ways of looking at the same thing: the values are vector representations of the other tokens, and we measure the similarity between the query and each of those values.
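
To make the "it"/"robot" example concrete, the toy snippet below scores a single query vector against a set of value vectors with dot products and a softmax. The token vectors are made up for illustration; in a real model they would be learned embeddings.

```python
import numpy as np

# Hypothetical 3-dimensional vectors for a few tokens (illustrative only).
tokens = ["the", "robot", "picked", "it"]
vectors = {
    "the":    np.array([0.1, 0.0, 0.2]),
    "robot":  np.array([0.9, 0.8, 0.1]),
    "picked": np.array([0.2, 0.5, 0.7]),
    "it":     np.array([0.8, 0.7, 0.2]),   # deliberately close to "robot"
}

query = vectors["it"]                        # the token we want to resolve
values = np.stack([vectors[t] for t in tokens])

scores = values @ query                      # dot-product similarity with each value
weights = np.exp(scores) / np.exp(scores).sum()   # softmax over the scores

for token, w in zip(tokens, weights):
    print(f"{token:>7}: {w:.2f}")            # "robot" receives the largest weight
```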

Now, let's delve deeper into the mechanics of self-attention and explore how these key concepts interplay in creating a dynamic and flexible attention mechanism.

The role of keys in the self-attention mechanism

In our previous example, where we had word embeddings or a recurrent model, calculating the dot product between the decoder query and all encoder embeddings was all we needed. Now we set up the keys, which are another way of representing, or projecting, the words. Each projection captures specific features or aspects of a word.

For instance, we might project words as "nouns," "verbs," or "adjectives." Each projection represents a different feature of the word. Think of these projections as channels in a convolution, where each channel represents a different feature.
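
As a rough sketch of how such projections are produced, the snippet below pushes the same word embeddings through two different weight matrices, each acting like one "channel" that captures its own view of the tokens. The matrix names and sizes are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_proj = 8, 4                        # assumed embedding and projection sizes
embeddings = rng.normal(size=(5, d_model))    # 5 tokens, one embedding per row

# Two different projection matrices capture different aspects of the same tokens,
# loosely analogous to channels in a convolution.
W_k1 = rng.normal(size=(d_model, d_proj))
W_k2 = rng.normal(size=(d_model, d_proj))

keys_view_1 = embeddings @ W_k1               # one "view" of the tokens
keys_view_2 = embeddings @ W_k2               # another "view" of the same tokens
print(keys_view_1.shape, keys_view_2.shape)   # (5, 4) (5, 4)
```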

Keys play a crucial role, especially in multihead attention, which we'll discuss later. In the self-attention mechanism, our goal is to represent each token in relation to all other tokens in a sentence. For simplicity, we say that the query, keys, and values are the same, which is why it's called self-attention. More precisely, each one is a different projection of the same source: not an exact copy, but derived from the same input through a different weight matrix.
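
Putting it all together, the sketch below derives the queries, keys, and values from the same input through separate weight matrices and then applies scaled dot-product attention. Randomly initialized matrices stand in for learned ones, and the sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(42)
seq_len, d_model = 6, 8                       # assumed toy sizes

X = rng.normal(size=(seq_len, d_model))       # one embedding per token

# Three different projections of the *same* input X.
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d_model)           # compare every token with every other token
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax

output = weights @ V                          # each token re-expressed as a mix of all tokens
print(output.shape)                           # (6, 8)
```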
