Generator and Discriminator of the ELECTRA Model

Learn how the generator and discriminator models work in detail, and find out why we might prefer ELECTRA over BERT.

The generator model

First, let's look at the generator. The generator performs the masked language modeling (MLM) task: we randomly mask tokens at a 15% mask rate and train the generator to predict the masked tokens. Let's represent our input tokens as $X = [x_1, x_2, \ldots, x_n]$. We randomly mask some of these tokens and feed the masked sequence as input to the generator, which returns a representation of each token. Let $h_G(X)$ ...
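The masking step described above can be sketched in a few lines of plain Python. This is only an illustration, not the ELECTRA implementation: the function name `mask_tokens`, the `MASK_TOKEN` placeholder string, and the `seed` parameter are assumptions made for this example, and the sketch works on whole words rather than on subword tokens.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed placeholder string, in the style of BERT vocabularies


def mask_tokens(tokens, mask_rate=0.15, seed=None):
    """Return (masked_tokens, masked_positions) for a list of tokens.

    Hypothetical helper: picks roughly `mask_rate` of the positions
    uniformly at random and replaces the tokens there with MASK_TOKEN.
    """
    rng = random.Random(seed)
    # Mask at least one token (when any exist) so there is something to predict.
    num_to_mask = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = sorted(rng.sample(range(len(tokens)), num_to_mask))
    masked = list(tokens)  # copy, so the original sequence X is left untouched
    for pos in positions:
        masked[pos] = MASK_TOKEN
    return masked, positions


tokens = "the chef cooked the meal and served it to the guests".split()
masked, positions = mask_tokens(tokens, seed=0)
print(masked)     # the sequence that would be fed to the generator
print(positions)  # the positions the generator is trained to predict
```

With 11 input tokens and a 15% mask rate, the sketch masks two positions. The generator would then be trained to recover the original tokens at exactly those positions.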