Summary: Using Transformers to Generate Text
Get a quick recap of the major learning points in this chapter.
In this chapter, we introduced some of the core ideas that underpin recent NLP models: the attention mechanism, contextual embeddings, and self-attention. We then built on this foundation to examine the transformer architecture and its internal components, and we briefly discussed BERT and its family of architectures.
Next, we discussed transformer-based language models from OpenAI, covering the architectural and dataset-related choices behind GPT and GPT-2. We then used the transformers library from Hugging Face to build our own GPT-2-based text generation pipeline, along the lines of the sketch below.
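As a quick refresher, here is a minimal sketch of such a pipeline using the Hugging Face transformers library. The prompt text and generation parameters are illustrative placeholders, not the exact values used in the chapter.

```python
# Minimal sketch: GPT-2 text generation with the Hugging Face transformers library.
# The prompt and generation settings below are illustrative examples.
from transformers import pipeline

# Load a text-generation pipeline backed by the pretrained GPT-2 model.
generator = pipeline("text-generation", model="gpt2")

# Generate a continuation for a prompt; max_length includes the prompt tokens.
outputs = generator(
    "The transformer architecture changed NLP because",
    max_length=50,
    num_return_sequences=1,
)

print(outputs[0]["generated_text"])
```

Running this downloads the pretrained GPT-2 weights on first use and prints a sampled continuation of the prompt.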