Transformer Visualization via Dictionary Learning
Learn how transformer layers are visualized through dictionary learning.
Transformer visualization via dictionary learning is based on transformer factors.
Transformer factors
A transformer factor is an embedding vector that contains contextualized words. A word with no context can have many meanings, creating a polysemy issue. For example, the word “separate” can be a verb or an adjective. Furthermore, “separate” can mean disconnect, discriminate, or scatter, among many other definitions.
Yun et al., 2021, therefore created an embedding vector with contextualized words. In their paper, a contextualized word embedding is expressed as a sparse linear superposition of transformer factors, so a polysemous word such as “separate” is represented as a weighted combination of factors corresponding to meanings such as disconnect, discriminate, and scatter.
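As a sketch of this idea, the general form below follows Yun et al., 2021: the embedding x is written as a sparse code α over a dictionary Φ whose columns are transformer factors. The numeric weights shown for “separate” are illustrative placeholders, not values from the paper.

```latex
% General form (Yun et al., 2021): a contextualized embedding x is a sparse
% linear superposition of transformer factors (the columns of \Phi).
x \;\approx\; \Phi \alpha \;=\; \sum_{i=1}^{m} \alpha_i \, \Phi_{:,i},
\qquad \alpha \in \mathbb{R}^{m} \text{ sparse}

% Illustrative decomposition of "separate" (weights are hypothetical):
\mathrm{separate} \;\approx\; 0.4\,\text{``disconnect''}
  \;+\; 0.3\,\text{``discriminate''}
  \;+\; 0.3\,\text{``scatter''}
```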
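To make the sparse-coding step concrete in code, the snippet below is a minimal sketch, not the paper’s implementation. It uses scikit-learn’s SparseCoder, and the factor matrix and the embedding are random placeholders standing in for a learned dictionary and a real contextualized hidden state.

```python
# Minimal sketch: recover a sparse coefficient vector over a set of
# "transformer factors" with scikit-learn's SparseCoder. The factor matrix and
# the embedding are random placeholders, not values from a trained model.
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)

d, m = 768, 32                                   # embedding size, number of factors
factors = rng.standard_normal((m, d))            # rows play the role of transformer factors
factors /= np.linalg.norm(factors, axis=1, keepdims=True)  # unit-norm dictionary atoms

x = rng.standard_normal((1, d))                  # placeholder for the embedding of "separate"

# An L1 (LASSO) penalty keeps the coefficient vector alpha sparse, so only a
# few factors contribute to the reconstruction of the embedding.
coder = SparseCoder(dictionary=factors,
                    transform_algorithm="lasso_lars",
                    transform_alpha=0.5)
alpha = coder.transform(x)                       # shape (1, m): sparse weights over factors

active = np.flatnonzero(alpha[0])
print("active factors:", active)
print("weights:", alpha[0, active])
```

In the paper’s setting, the dictionary would be learned from hidden states collected across a corpus, and the sparse weights indicate which factors are active for a given word in a given context.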