Natural Language Processing: BERT

This lesson introduces BERT.

BERT

One of the biggest challenges in NLP is the shortage of training data. Overall there is an enormous amount of text data available, but if we want to create task-specific datasets, we need to split that pile across many diverse fields. When we do this, we end up with only a few thousand or a few hundred thousand human-labeled training examples per task. Unfortunately, deep learning-based NLP models need far more data than that to perform well; they see major improvements when trained on millions, or even billions, of annotated training examples.

To help bridge this gap in data, researchers have developed various techniques for training general-purpose language representation models on the enormous piles of unannotated text on the web (this is known as pre-training). These general-purpose pre-trained models can then be fine-tuned on smaller task-specific datasets, for example for question answering or sentiment analysis. This approach yields great improvements in accuracy compared to training on the smaller task-specific datasets from scratch. BERT is a recent addition to these techniques for NLP pre-training; it caused a stir in the deep learning community because it presented state-of-the-art results on a wide variety of NLP tasks, such as question answering.

The best part about BERT is that it can be downloaded and used for free. We can either use the BERT models to extract high-quality language features from our text data, or we can fine-tune these models on a specific task, such as sentiment analysis or question answering, with our own data to produce state-of-the-art predictions.
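As an illustration of the first option, the sketch below extracts contextual features for a sentence with a pre-trained BERT model. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint, neither of which is prescribed by this lesson.

# A minimal feature-extraction sketch (assumes: pip install torch transformers).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "The woman went to the store and bought a pair of shoes."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token:
# shape (batch_size, num_tokens, hidden_size) -> (1, num_tokens, 768)
features = outputs.last_hidden_state
print(features.shape)

Each token's vector already encodes information from the whole sentence, which is what makes these features useful as inputs to a smaller downstream model.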

The core idea behind BERT

What is language modeling really about? Which problems are language models trying to solve? Basically, their task is to “fill in the blank” based on context. For example, given

“The woman went to the store and bought a _____ of shoes.”

A language model might complete this sentence by saying that the word “cart” would fill the blank 20% of the time and the word “pair” 80% of the time.
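We can try this "fill in the blank" behavior directly with a pre-trained masked language model. The snippet below is only a sketch: the fill-mask pipeline, the transformers library, and the bert-base-uncased checkpoint are assumptions made for illustration, and the exact probabilities will differ from the 20%/80% example above.

# Query a masked language model for candidate words and their probabilities.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

predictions = fill_mask(
    "The woman went to the store and bought a [MASK] of shoes."
)

for p in predictions[:5]:
    # Each prediction carries a candidate token and its probability.
    print(f"{p['token_str']:>10}  {p['score']:.2%}")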

In the pre-BERT world, a language model would have looked at this text sequence during training either from left to right or as a shallow combination of separate left-to-right and right-to-left passes. This unidirectional approach works well for generating sentences: we can predict the next word, append it to the sequence, and then predict the word after that, repeating until we have a complete sentence.
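To make the left-to-right idea concrete, here is a minimal greedy generation loop. GPT-2 is used purely as an illustrative autoregressive model; the lesson does not name one.

# Left-to-right generation: predict the next token, append it, repeat.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer("The woman went to the store", return_tensors="pt").input_ids

for _ in range(10):
    with torch.no_grad():
        logits = model(ids).logits
    # Greedily pick the most likely next token and extend the sequence.
    next_id = logits[0, -1].argmax().reshape(1, 1)
    ids = torch.cat([ids, next_id], dim=1)

print(tokenizer.decode(ids[0]))

Because each prediction conditions only on the tokens to its left, the model never sees the words that come after the position it is predicting.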

Enter BERT, a language model that is bidirectionally trained (this is also its key technical innovation). Reading the left and right context at once gives it a deeper sense of language context and flow than single-direction language models.
