Introduction: Applying Transformers for AI Text Summarization

Get an overview of what we will cover in this chapter.

So far, we have explored the architecture, training, fine-tuning, and usage of several transformer ecosystems. In the previous chapter, we discovered that OpenAI has begun to experiment with zero-shot models that require no fine-tuning and no development, and that can be implemented in a few lines of code.

The underlying concept of this evolution is that transformers strive to teach a machine to understand a language and express itself in a human-like manner. We have thus gone from training a model to teaching languages to machines.

Raffel et al. (2019) designed a transformer meta-model in the paper “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer” (https://arxiv.org/pdf/1910.10683.pdf), based on a simple assertion: every NLP problem can be represented as a text-to-text function. Every type of NLP task takes some form of text as context and produces some form of text as a response.

A text-to-text representation of any NLP task provides a unique framework to analyze a transformer’s methodology and practice. The idea is for a transformer to learn a language through transfer learning during the training and fine-tuning phases with a text-to-text approach.
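
To make this concrete, the short sketch below (plain Python, with input-target pairs adapted from the kinds of examples shown in the T5 paper) illustrates how translation, summarization, and classification all reduce to the same string-in, string-out form:

```python
# A minimal illustration of the text-to-text framing used by T5:
# every task is expressed as an input string and a target string.
# The task prefixes ("translate English to German:", "summarize:",
# "cola sentence:") are standard T5 prefixes; the pairs are adapted
# from the kinds of examples shown in the T5 paper.

text_to_text_examples = [
    # Translation: the prefix tells the model which task to perform
    ("translate English to German: That is good.",
     "Das ist gut."),
    # Summarization: long text in, short text out
    ("summarize: state authorities dispatched emergency crews tuesday "
     "to survey the damage after an onslaught of severe weather "
     "in mississippi...",
     "six people hospitalized after a storm in attala county."),
    # Linguistic acceptability (CoLA): even the label is plain text
    ("cola sentence: The course is jumping well.",
     "not acceptable"),
]

for source, target in text_to_text_examples:
    print(f"input : {source}")
    print(f"target: {target}\n")
```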

Raffel et al. (2019) named this approach a Text-To-Text Transfer Transformer. The 5 Ts became T5, and a new model was born.

Chapter overview

We will begin this chapter by going through the concepts and architecture of the T5 transformer model. We will then apply T5 to summarize documents with Hugging Face models.
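
As a small preview of what that looks like in code, here is a minimal sketch using the public t5-small checkpoint from the Hugging Face transformers library; the model size, the sample document, and the generation parameters are illustrative choices, not the exact settings used later in the chapter:

```python
# Minimal T5 summarization sketch with the Hugging Face transformers library.
# Requires: pip install transformers sentencepiece torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-small"  # small public checkpoint; larger T5 variants work the same way
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Sample document (illustrative text, not taken from the chapter).
document = (
    "State authorities dispatched emergency crews on Tuesday to survey "
    "the damage after severe storms swept through the region, downing "
    "power lines and damaging dozens of homes."
)

# T5 is text-to-text: the task is selected simply by prefixing the input.
inputs = tokenizer("summarize: " + document,
                   return_tensors="pt", max_length=512, truncation=True)

summary_ids = model.generate(inputs.input_ids,
                             max_length=60, num_beams=4, early_stopping=True)

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

The only task-specific element is the summarize: prefix; swapping it for another prefix switches the task without changing the model.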

Finally, we will transpose the text-to-text approach to the show-and-context process of using GPT-3 engines. The mind-blowing, though not perfect, zero-shot responses exceed what most people would expect from a model that has not been fine-tuned for the task.

This chapter covers the following topics:

- The concepts and architecture of the T5 transformer model
- Summarizing documents with T5 and Hugging Face models
- Applying the text-to-text approach to zero-shot summarization with GPT-3 engines