Quiz: Architecture of the Transformer Model

Test yourself on the concepts you learned this chapter.

1. What is not a feature of multi-head attention?

A) A broader, in-depth analysis of sequences

B) The exclusion of recurrence, reducing the number of calculation operations

C) The presence of a softmax layer, normalizing attention score calculations

D) The implementation of parallelization, which reduces training time
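To ground the features the options refer to, here is a minimal NumPy sketch of multi-head attention. The random projection matrices and dimensions are hypothetical illustrations, not the book's implementation; note how the softmax normalizes each query's attention scores and how each head computes independently, which is what makes parallelization possible.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: normalizes scores to sum to 1.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, n_heads, rng):
    # x: (seq_len, d_model). Projection weights are random placeholders
    # for illustration; a trained model learns these.
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    w_q = rng.standard_normal((n_heads, d_model, d_head))
    w_k = rng.standard_normal((n_heads, d_model, d_head))
    w_v = rng.standard_normal((n_heads, d_model, d_head))
    heads = []
    for h in range(n_heads):
        q, k, v = x @ w_q[h], x @ w_k[h], x @ w_v[h]
        # Scaled dot-product attention: no recurrence, every position
        # attends to every other position in one matrix product.
        scores = softmax(q @ k.T / np.sqrt(d_head))
        heads.append(scores @ v)  # heads are independent -> parallelizable
    return np.concatenate(heads, axis=-1)  # (seq_len, d_model)

rng = np.random.default_rng(0)
out = multi_head_attention(rng.standard_normal((5, 8)), n_heads=2, rng=rng)
print(out.shape)
```

The loop over heads runs sequentially here for clarity; in practice all heads are computed as one batched tensor operation, which is the parallelization option D describes.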

