Zero, One, and Few-Shot Learning: Typical of Transformers Mode

Explore diverse possibilities to control output format, tone, and style and understand the importance of effective, prompt design for accurate responses in ChatGPT.

In the previous sections, we mentioned how OpenAI models and ChatGPT come in a pre-trained format. They have been trained on a huge amount of data and have had their (billions of) parameters configured accordingly.

However, this doesn’t mean that those models can’t learn anymore. In theOpenAI and ChatGPT—Beyond the Market Hype” chapter, we learned that one way to customize an OpenAI model and make it more capable of addressing specific tasks is by fine-tuning.

Fine-tuning is a proper training process that requires a training dataset, compute power, and some training time (depending on the amount of data and compute instances).

Shot learning

This is why it is worth testing another method for our model to become more skilled in specific tasks: shot learning.

The idea is to let the model learn from simple examples rather than the entire dataset. Those examples are samples of how we would like the model to respond, so the model not only learns the content but also the format, style, and taxonomy to use in its response.

Furthermore, shot learning occurs directly via the prompt (as we will see in the following scenarios), so the whole experience is less time-consuming and easier to perform.

The number of examples provided determines the level of shot learning we are referring to. In other words, we refer to zero-shot if no example is provided, one-shot if one example is provided, and few-shot if more than 2–3 examples are provided.

Types of shot learning

Let’s focus on each of those scenarios:

  • Zero-shot learning: In this type of learning, the model is asked to perform a task for which it has not seen any training examples. The model must rely on prior knowledge or general information about the task to complete the task. For example, a zero-shot learning approach could be that of asking the model to generate a description, as defined in the prompt below:

Get hands-on with 1400+ tech skills courses.