Generic Text Completion with GPT-2
Learn about how generic text completion is performed with a pretrained GPT-2 model.
It is now time to examine how GPT models are built. Although the source code of the GPT-3 transformer model is not publicly available at this time, GPT-2 models are powerful enough to let us understand the inner workings of GPT models.
We are ready to interact with a GPT-2 model and train it.
We will first use a trained GPT-2 345M model for text completion; it has 24 decoder layers, each with a self-attention sublayer of 16 heads. We will then train a GPT-2 117M model for customized text completion; it has 12 decoder layers, each with a self-attention sublayer of 12 heads. We will explore an example with a generic GPT-2 model from top to bottom. The goal of the example is to determine the level of abstract reasoning a GPT model can attain.
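The notebook works with OpenAI's original TensorFlow implementation of GPT-2. As a quick cross-check of the layer and head counts quoted above, the sketch below uses the Hugging Face transformers library (an assumption on our part, not part of the notebook) to load the corresponding checkpoint configurations:

```python
# Sketch (not part of the notebook): inspect GPT-2 configurations with the
# Hugging Face transformers library to confirm layer and head counts.
from transformers import GPT2Config

# "gpt2" is the 117M checkpoint, "gpt2-medium" is the 345M checkpoint.
for name in ["gpt2", "gpt2-medium"]:
    config = GPT2Config.from_pretrained(name)
    print(f"{name}: {config.n_layer} decoder layers, {config.n_head} attention heads")
```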
This section describes the interaction with a GPT-2 model for text completion. We will focus on Step 6 of the OpenAI_GPT_2.ipynb Jupyter notebook in the "Code playground" section.
First, let’s understand the specific example of the pretrained GPT-2 being applied.
Step 6: Interacting with GPT-2
In this section, we will interact with the GPT-2 345M model.
To interact with the model, we can use the cell below:
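The actual cell is available in the notebook in the "Code playground" section. As an illustration only, the sketch below produces a comparable completion with the 345M model through the Hugging Face transformers library (an assumption; the notebook itself relies on OpenAI's original TensorFlow code), using a placeholder prompt chosen to probe the abstract-reasoning goal described above:

```python
# Sketch (assumes the Hugging Face transformers library rather than the
# notebook's TensorFlow code): text completion with the GPT-2 345M model.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")  # 345M checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")

# Placeholder prompt for illustration; replace it with your own input.
prompt = "Human reason, in one sphere of its cognition, is called upon to"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation; do_sample=True produces varied, non-greedy output.
output_ids = model.generate(
    **inputs,
    max_length=100,
    do_sample=True,
    top_k=40,
    temperature=1.0,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```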