Quiz: Architecture of the Transformer Model
Test yourself on the concepts you learned this chapter.
1. What is *not* a feature of multi-head attention?

A) A broader, in-depth analysis of sequences
B) The preclusion of recurrence, reducing calculation operations
C) The presence of a softmax layer, normalizing embedding calculations
D) Implementation of parallelization, which reduces training time
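As a refresher before answering, here is a minimal NumPy sketch of multi-head scaled dot-product attention. It illustrates the features the options refer to: the heads attend over the whole sequence in parallel (no recurrence), and a softmax layer normalizes the attention scores. All names, shapes, and the random weight initialization are illustrative assumptions, not the chapter's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads, rng):
    # x: (seq_len, d_model). Splits d_model across heads, runs
    # scaled dot-product attention for every head in parallel,
    # then concatenates the heads and applies an output projection.
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_k = d_model // num_heads

    # Randomly initialized projection weights (illustrative only).
    W_q = rng.standard_normal((d_model, d_model))
    W_k = rng.standard_normal((d_model, d_model))
    W_v = rng.standard_normal((d_model, d_model))
    W_o = rng.standard_normal((d_model, d_model))

    # Project, then reshape to (num_heads, seq_len, d_k).
    def split(W):
        return (x @ W).reshape(seq_len, num_heads, d_k).transpose(1, 0, 2)

    Q, K, V = split(W_q), split(W_k), split(W_v)

    # Scaled dot-product attention: softmax normalizes the scores
    # so each row of attention weights sums to 1.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)   # (num_heads, seq_len, seq_len)
    weights = softmax(scores)
    heads = weights @ V                                # (num_heads, seq_len, d_k)

    # Concatenate the heads and project back to d_model.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ W_o

rng = np.random.default_rng(0)
out = multi_head_attention(rng.standard_normal((5, 16)), num_heads=4, rng=rng)
print(out.shape)  # (5, 16)
```

Note that every position's output is computed from a weighted sum over all positions at once, which is what allows parallel training in place of step-by-step recurrence.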