Generic Text Completion with GPT-2

Learn how generic text completion is performed with a pretrained GPT-2 model.

It is now time to see how the source code of GPT models is built. Although the source code of the GPT-3 transformer model is not publicly available at this time, GPT-2 models are sufficiently powerful for us to understand the inner workings of GPT models.

We are ready to interact with a GPT-2 model and train it.

We will first use a trained GPT-2 345M model for text completion; it has 24 decoder layers, each with a self-attention sublayer of 16 heads. We will then train a GPT-2 117M model for customized text completion; it has 12 decoder layers, each with a self-attention sublayer of 12 heads. We will explore an example with a generic GPT-2 model from top to bottom. The goal of the example we will run is to determine the level of abstract reasoning a GPT model can attain.
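The difference between the two models is purely one of scale. The sketch below, which assumes the Hugging Face transformers library rather than the original OpenAI codebase, is one way to express the two configurations; the parameter names (n_layer, n_head, n_embd) belong to transformers.GPT2Config and are used here only to illustrate the size gap, not taken from the notebook itself.

```python
# A minimal sketch (not from the notebook): expressing the two GPT-2 sizes
# with Hugging Face's GPT2Config, purely to compare their hyperparameters.
from transformers import GPT2Config

# GPT-2 345M ("medium"): 24 decoder layers, 16 attention heads, 1024-dim embeddings
config_345m = GPT2Config(n_layer=24, n_head=16, n_embd=1024)

# GPT-2 117M ("small"): 12 decoder layers, 12 attention heads, 768-dim embeddings
config_117m = GPT2Config(n_layer=12, n_head=12, n_embd=768)

print(config_345m.n_layer, config_345m.n_head)  # 24 16
print(config_117m.n_layer, config_117m.n_head)  # 12 12
```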

This section describes the interaction with a GPT-2 model for text completion. We will focus on Step 6 of the OpenAI_GPT_2.ipynb Jupyter notebook in the "Code playground" section.

First, let’s understand the specific example of the pretrained GPT-2 being applied.

Step 6: Interacting with GPT-2

In this section, we will interact with the GPT-2 345M model.

To interact with the model, we can use the cell below:
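A minimal sketch of such a cell is shown here, assuming the OpenAI gpt-2 repository was cloned and the 345M weights were downloaded in the earlier steps of the notebook; the script name and flags come from that repository and are an illustration rather than the notebook's exact contents.

```python
# Minimal sketch of an interaction cell (assumes the OpenAI gpt-2 repository
# was cloned into /content/gpt-2 and the 345M weights downloaded earlier).
import os

# Move into the repository's source directory so the script can locate the models folder
os.chdir("/content/gpt-2/src")

# Launch the interactive text-completion loop with the 345M model:
# the script repeatedly asks for a prompt and prints the model's continuation.
!python3 interactive_conditional_samples.py --model_name='345M' --top_k=40 --temperature=1
```

When the cell runs, the script prompts for an input context; typing a sentence returns a generated completion, and the loop continues until the cell is interrupted.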
