Evolving From Fine-Tuning to Zero-Shot Models

Learn how fine-tuning evolved into zero-shot transformer models.

Overview

From the start, OpenAI’s research teams, led by Radford et al. (2018), wanted to take transformers from task-specific trained models to Generative Pre-trained Transformer (GPT) models. The goal was to train transformers on unlabeled data. Letting the attention layers learn a language from unsupervised text was a smart move. Instead of teaching transformers to perform specific NLP tasks, OpenAI decided to train them to learn a language.
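
Concretely, Radford et al. (2018) express this unsupervised pre-training step as standard language modeling. Given an unlabeled corpus of tokens $\mathcal{U} = \{u_1, \ldots, u_n\}$, the model maximizes the likelihood of each token given the $k$ tokens that precede it:

$$
L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)
$$

where $k$ is the size of the context window and $\Theta$ denotes the transformer’s parameters. No labels appear anywhere in the objective; the text itself supplies the training signal.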

OpenAI wanted to create a task-agnostic model, so it began training transformer models on raw text instead of relying on data labeled by specialists. Labeling data is time-consuming and considerably slows down the transformer’s training process.
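
To make that contrast concrete, here is a minimal sketch (the tokenizer and corpus are toy stand-ins, not the actual GPT pipeline) of how raw text alone yields training pairs for a language model: each position’s target is simply the next token, so no specialist labeling is required.

```python
# Minimal sketch: raw, unlabeled text supplies its own training signal.
# The tokenizer and corpus here are toy stand-ins; a real GPT model uses
# a learned subword vocabulary (byte-pair encoding) and a large corpus.

raw_text = "transformers learn a language from raw unlabeled text"

# Toy whitespace tokenizer mapping each word to an integer ID.
vocab = {word: idx for idx, word in enumerate(sorted(set(raw_text.split())))}
token_ids = [vocab[word] for word in raw_text.split()]

# Self-supervised next-token pairs: the "labels" come for free by
# shifting the sequence one position, not from human annotators.
inputs = token_ids[:-1]   # context tokens u_1 .. u_(n-1)
targets = token_ids[1:]   # prediction targets u_2 .. u_n

for context_id, next_id in zip(inputs, targets):
    print(f"context token {context_id} -> predict token {next_id}")
```

Because the targets are derived mechanically from the text itself, the same pipeline scales to any raw corpus, which is exactly what makes the approach task-agnostic.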
