Self-Attention vs. Convolution

Explore self-attention's role in computer vision for feature detection and global relationships between patches.

Let's explore how we can employ self-attention in computer vision.

Comparing self-attention and convolution in computer vision

The process of generating self-attended feature maps involves a series of transformations applied to a 3D image representation, denoted as X. Here, X is a 2D image with an added third dimension carrying channel information, such as the color channels, and it is transformed by learned weight matrices W.

The first step involves the extraction of three weight matrices, namely W_k, W_q, and W_v. Subsequently, these weight matrices are applied to the original image representation X to form three distinct matrices: the key matrix K, the query matrix Q, and the value matrix V ...
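The transformation described above can be sketched in NumPy for a single attention head. This is a minimal illustration, not a production implementation: the patch count, embedding size, and randomly initialized weights are all hypothetical stand-ins for learned parameters.

```python
import numpy as np

# Hypothetical dimensions: n image patches, each embedded in d dimensions.
rng = np.random.default_rng(0)
n, d = 16, 8                          # e.g. a 4x4 grid of patches
X = rng.standard_normal((n, d))       # flattened patch embeddings

# Weight matrices W_q, W_k, W_v (randomly initialized here for illustration;
# in a trained model these are learned parameters).
W_q = rng.standard_normal((d, d))
W_k = rng.standard_normal((d, d))
W_v = rng.standard_normal((d, d))

# Project X into queries, keys, and values.
Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Scaled dot-product attention: every patch attends to every other patch,
# which is what gives self-attention its global receptive field.
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
out = weights @ V                     # self-attended feature map, shape (n, d)
```

Note that, unlike a convolution with a fixed local kernel, each output row here is a weighted mixture of all n patch embeddings, with the mixing weights computed from the data itself.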
