The Data Augmentation Methods
Learn different methods to perform task-agnostic data augmentation.
We use the following methods for performing task-agnostic data augmentation:
Masking
POS-guided word replacement
n-gram sampling
Let's take a look at each one of them.
Understanding the masking method
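In the masking method, with probability p_mask, we randomly replace a word in the sentence with the [MASK] token and so create a new sentence. For example, the sentence 'I was listening to music' can become 'I was listening to [MASK]'.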
But how is this useful? Since [MASK] is an unknown token, our model cannot produce confident logits for a sentence that contains it. It produces less confident logits for the masked sentence 'I was listening to [MASK]' than for the unmasked sentence 'I was listening to music'. This difference helps our model understand the contribution of each word to the label.
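The masking step can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name mask_words, the parameter name p_mask, the whitespace tokenization, and the choice to make an independent decision for each word are all assumptions made for illustration.

```python
import random


def mask_words(tokens, p_mask, rng=None, mask_token="[MASK]"):
    """Return a copy of `tokens` with words randomly replaced by `mask_token`.

    Assumption: each word is masked independently with probability `p_mask`.
    `rng` is an optional random.Random instance so results can be reproduced.
    """
    rng = rng or random.Random()
    # rng.random() is in [0, 1), so p_mask=0 never masks and p_mask=1 always masks.
    return [mask_token if rng.random() < p_mask else tok for tok in tokens]


if __name__ == "__main__":
    sentence = "I was listening to music".split()
    # A seeded generator keeps the augmented sentence reproducible between runs.
    augmented = mask_words(sentence, p_mask=0.2, rng=random.Random(0))
    print(" ".join(augmented))
```

With p_mask set to 0 the sentence comes back unchanged, and with p_mask set to 1 every word becomes [MASK]. Values in between yield partially masked sentences that can be added to the dataset alongside the original.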
Understanding the POS-guided word replacement method
In the POS-guided (part-of-speech-guided) word replacement method, with probability p_pos, we replace a word in the sentence with another word that has the same part-of-speech tag.
For example, consider the sentence 'Where did you go?' In this sentence, the word 'did' is a verb, so we can replace it with another verb, such as 'do'. Our sentence then becomes 'Where do you go?' As you can see, replacing 'did' with 'do' gives us a new sentence. ...
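The replacement idea can be illustrated with a small Python sketch. It rests on several assumptions: the function name pos_guided_replace, the parameter name p_pos, and the tiny hand-written lexicon TOY_POS are all hypothetical. A real pipeline would obtain tags from a part-of-speech tagger and draw replacements from a full vocabulary.

```python
import random

# Hypothetical toy lexicon mapping lowercase words to coarse POS tags.
# It stands in for a real tagger and vocabulary, purely for illustration.
TOY_POS = {
    "where": "ADV", "when": "ADV",
    "did": "VERB", "do": "VERB", "go": "VERB",
    "you": "PRON", "we": "PRON", "they": "PRON",
}


def pos_guided_replace(tokens, p_pos, rng=None, lexicon=TOY_POS):
    """Replace words with other words that share the same POS tag.

    Assumption: each word found in `lexicon` is replaced independently with
    probability `p_pos`; words missing from the lexicon are left unchanged.
    """
    rng = rng or random.Random()
    # Group the lexicon by tag so candidate replacements are easy to look up.
    by_tag = {}
    for word, tag in lexicon.items():
        by_tag.setdefault(tag, []).append(word)

    result = []
    for tok in tokens:
        key = tok.lower()
        tag = lexicon.get(key)
        # Candidates share the tag but differ from the original word.
        candidates = [w for w in by_tag.get(tag, []) if w != key]
        if candidates and rng.random() < p_pos:
            result.append(rng.choice(candidates))
        else:
            result.append(tok)
    return result


if __name__ == "__main__":
    sentence = ["Where", "did", "you", "go", "?"]
    print(" ".join(pos_guided_replace(sentence, p_pos=0.3, rng=random.Random(0))))
```

Because only the tag is matched, the new sentence is not guaranteed to be grammatical or to keep the original meaning. The sketch also returns replacements in lowercase, so casing would need separate handling in practice.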