...


Local vs. Global Attention

Explore the distinctions between global and local attention mechanisms, uncovering the efficiency and dynamic nature of local attention.

We've previously explored global attention mechanisms, which establish connections across all inputs, whether spatial, channel-related, or temporal. Now, let's turn to another critical aspect: local attention.

Local attention mechanism

As we've seen, convolution is a local operation because of its inductive bias, or built-in modeling assumption, while attention is a global operation with little to no inductive bias. Spatial attention, as depicted below, links each blue pixel in space to a red pixel, capturing their relationship through an attention map. This is known as non-local attention, although other designs are available.

Non-local attention block

The matrix depicted in the above illustration represents the attention distribution within a spatial context. Each element of the matrix relates one position in the input space to another, and the strength of that connection is conveyed by the color scale.

The gray matrix in the lower middle signifies a non-local attention pattern. Unlike local operations such as convolution, where interactions are confined to a specific neighborhood, non-local attention allows each position in the input space to contribute to the attention mechanism without restrictions.
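To make this concrete, below is a minimal sketch of a non-local (global spatial) attention block in PyTorch, in the spirit of the block illustrated above. The 1x1-convolution projections named theta, phi, and g, the halved embedding width, and the residual connection are illustrative assumptions, not a fixed specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    """Sketch of a non-local attention block: every spatial position
    attends to every other position, with no locality restriction."""

    def __init__(self, channels: int):
        super().__init__()
        inter = channels // 2  # reduced embedding width (an assumption)
        self.theta = nn.Conv2d(channels, inter, kernel_size=1)  # queries
        self.phi = nn.Conv2d(channels, inter, kernel_size=1)    # keys
        self.g = nn.Conv2d(channels, inter, kernel_size=1)      # values
        self.out = nn.Conv2d(inter, channels, kernel_size=1)    # restore channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (b, h*w, inter)
        k = self.phi(x).flatten(2)                    # (b, inter, h*w)
        v = self.g(x).flatten(2).transpose(1, 2)      # (b, h*w, inter)
        attn = F.softmax(q @ k, dim=-1)  # (b, h*w, h*w) attention map
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)           # residual connection

# For example: NonLocalBlock(64)(torch.randn(2, 64, 16, 16)) returns a
# (2, 64, 16, 16) tensor; the intermediate attention map is 256 x 256.
```

Note the quadratic cost: the attention map has one row and one column per spatial position, which is precisely what motivates the local alternatives discussed next.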

Non-local attention

In the world of self-attention mechanisms, two fundamental design approaches emerge: global self-attention and local self-attention.

Global self-attention, as the name implies, operates without constraints imposed by input feature size. It encompasses the entire feature map, allowing each position to attend to every other position within the map. ...
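To contrast the two designs, here is a sketch of one common form of local self-attention, window-based attention, where attention is computed independently inside non-overlapping windows, so each attention map is (s*s) x (s*s) rather than (h*w) x (h*w). The single-head formulation, the window size, and the divisibility requirement are all simplifying assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def local_window_attention(x: torch.Tensor, window: int) -> torch.Tensor:
    """Self-attention restricted to non-overlapping windows.
    x: (b, c, h, w), with h and w divisible by `window` (an assumption)."""
    b, c, h, w = x.shape
    s = window
    # Partition the feature map into (h//s) * (w//s) independent windows.
    xw = x.reshape(b, c, h // s, s, w // s, s)
    xw = xw.permute(0, 2, 4, 3, 5, 1).reshape(-1, s * s, c)  # (windows, s*s, c)
    # Scaled dot-product attention within each window only.
    attn = F.softmax((xw @ xw.transpose(1, 2)) / c ** 0.5, dim=-1)
    yw = attn @ xw
    # Undo the window partition, back to (b, c, h, w).
    yw = yw.reshape(b, h // s, w // s, s, s, c).permute(0, 5, 1, 3, 2, 4)
    return yw.reshape(b, c, h, w)

# For example: local_window_attention(torch.randn(2, 64, 16, 16), window=4)
# computes sixteen independent 16 x 16 attention maps per image instead of
# one global 256 x 256 map.
```

Restricting attention to windows reintroduces a locality bias, similar in spirit to convolution, while retaining the content-dependent weighting that attention provides.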
