Is Attention All We Need?
Discover what we can do with attention and try it yourself in this lesson.
In 2017, researchers at Google took attention mechanisms to a whole new level. In the paper "Attention Is All You Need," they moved away from traditional approaches, such as relying on recurrent connections, and instead embraced pure attention.
The evolution of attention mechanisms
The sequence-to-sequence models we were familiar with used RNNs for encoding and decoding. As previously mentioned, these models faced fundamental issues. First, the encoder's final hidden state was a fixed-size bottleneck that could not hold enough information about long input sequences. Second, they were slow, because the entire input sequence had to be processed step by step to reach that final state before the first output token could even be generated.
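To make these two issues concrete, here is a minimal PyTorch sketch of an RNN-based encoder-decoder. It is an illustrative toy, not the lesson's exact model: the layer sizes, the vocabulary size, and the start-token id are arbitrary assumptions. Notice that the encoder loop must finish before decoding starts, and that the single tensor `h` is all the decoder ever sees of the source.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not from the lesson)
hidden_size, vocab_size, emb_size = 128, 1000, 64

embedding = nn.Embedding(vocab_size, emb_size)
encoder_rnn = nn.GRUCell(emb_size, hidden_size)
decoder_rnn = nn.GRUCell(emb_size, hidden_size)
output_proj = nn.Linear(hidden_size, vocab_size)

src = torch.randint(0, vocab_size, (10,))   # a toy source sequence of 10 token ids

# --- Encoder: strictly sequential; each step depends on the previous hidden state ---
h = torch.zeros(1, hidden_size)
for token in src:
    h = encoder_rnn(embedding(token).unsqueeze(0), h)
# After the loop, `h` is the ONLY summary of the source passed to the decoder:
# the fixed-size bottleneck described above.

# --- Decoder: also sequential; it cannot emit anything until encoding has finished ---
prev_token = torch.tensor([0])              # assume id 0 is a start-of-sequence token
outputs = []
for _ in range(5):                          # generate 5 tokens for illustration
    h = decoder_rnn(embedding(prev_token), h)
    logits = output_proj(h)
    prev_token = logits.argmax(dim=-1)      # greedy choice of the next token
    outputs.append(prev_token.item())
print(outputs)
```

Because each step of both loops needs the previous hidden state, none of this can be parallelized across time steps, which is exactly the slowness the attention-only Transformer architecture set out to remove.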
The challenge of sequential processing
Let's look at the encoder-decoder architecture to better understand the challenge of sequential processing.