Text Generation

Learn to perform text generation using the Hugging Face Inference API.

OpenAI introduced the generative pre-trained transformer (GPT) models in 2018. These models use unsupervised pretraining, which lets us leverage the vast amount of text on the internet without spending resources on annotation. GPT was succeeded by GPT-2 in 2019, with 1.5 billion parameters. GPT-3 is the latest model in the GPT family. It has 175 billion parameters and enables us to develop excellent applications.

While the GPT family of models supports many tasks, its ability to generate long text from a brief prompt is unparalleled.


Generate text using the API

Since GPT-3 is closed source, we'll use GPT-2, which is an efficient model in its own right. Hugging Face recommends the gpt2 model for text generation tasks. However, many models are available for this task; some common ones are listed below:

Models for Text Generation

  • gpt2: A transformer-based model trained on a large-scale English dataset without any labeled data. It can generate and manipulate text from the provided instructions and input text.

  • Michau/t5-base-en-generate-headline: Trained on around 500 thousand articles. It generates headlines for articles.

  • EleutherAI/gpt-j-6B: Trained using Mesh Transformer JAX on the Pile dataset, with around six billion parameters. The gpt-j-6B model is used for text generation.

  • facebook/opt-350m: Based on the pretrained OPT models and fine-tuned on five English datasets of around 800 gigabytes. Its main purpose is text generation.

  • bigscience/T0: Trained on large-scale English datasets from different sources. It outperforms GPT-3 in some cases while being much smaller.

Note: The bigscience/T0 model is around 41.5 gigabytes in size, so it takes a long time to load.

We can call the following endpoint via the POST request method by replacing the path parameter {model} with any model mentioned above:

https://api-inference.huggingface.co/models/{model}
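As a quick sketch, the path parameter can be filled in with a template string; buildEndpointUrl is a hypothetical helper for illustration, not part of the Inference API:

```javascript
// Hypothetical helper: builds the Inference API endpoint for a given model id
function buildEndpointUrl(model) {
  return `https://api-inference.huggingface.co/models/${model}`;
}

// Any model from the table above can be substituted for {model}
console.log(buildEndpointUrl("gpt2"));
// → https://api-inference.huggingface.co/models/gpt2
console.log(buildEndpointUrl("EleutherAI/gpt-j-6B"));
// → https://api-inference.huggingface.co/models/EleutherAI/gpt-j-6B
```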

Request parameters

The request parameters for this API call are as follows:

  • inputs (String, required): The input text from which the model generates new text.

  • parameters.top_k (Integer, optional): Limits sampling to the k most probable tokens at each step of text generation.

  • parameters.top_p (Float, optional): Enables nucleus sampling; tokens are considered from most probable to least probable until the sum of their probabilities exceeds top_p.

  • parameters.temperature (Float, optional): Controls the sampling temperature; the value ranges from 0.0 to 100.0. A temperature close to 0 takes the tokens with the highest probability, 1.0 performs regular sampling, and 100.0 selects tokens with nearly uniform probability.

  • parameters.max_length (Integer, optional): The maximum number of tokens to include in the generated text.

  • parameters.num_return_sequences (Integer, optional): The number of generated sequences to return.

  • options.use_cache (Boolean, optional): The Inference API caches results to speed up requests; use it for deterministic models. The default value is true.

  • options.wait_for_model (Boolean, optional): Inference API models take time to initialize and process requests. If the value is true, the API waits for the model to become ready instead of returning an error. The default value is false.
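To see how these parameters fit together, here is a minimal sketch of a request body that combines the required inputs with the optional sampling parameters; the values are illustrative only:

```javascript
// Sketch of a request body combining required and optional parameters
const body = JSON.stringify({
  inputs: "Once upon a time", // required: the text to continue
  parameters: {
    top_k: 50,               // sample only from the 50 most probable tokens
    top_p: 0.9,              // nucleus sampling: cumulative probability cutoff
    temperature: 0.7,        // lower values make the output more deterministic
    max_length: 64,          // cap on the number of generated tokens
    num_return_sequences: 2  // ask for two generated sequences
  },
  options: {
    use_cache: true,         // reuse cached results for identical requests
    wait_for_model: true     // wait for the model to load instead of erroring
  }
});

console.log(body);
```

This string would be sent as the body of the POST request, exactly as in the code widget below.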

The following code generates text according to the context of the provided text.

// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/gpt2";
const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};
// Input text to generate from
const data = JSON.stringify({
  inputs: "Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive \
language model that uses deep learning to produce human-like text.",
  parameters: {
    max_length: 32
  },
  options: {
    wait_for_model: true
  }
});
const options = {
  method: "POST",
  headers: headerParameters,
  body: data
};
async function textGeneration() {
  try {
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}
textGeneration();

Let’s have a look at the highlighted lines shown in the code widget above:

  • Line 2: We specify the gpt2 model for text generation.

  • Lines 8–17: We set inputs to the text we want the model to continue and set max_length for the output text.

  • Lines 25–32: We create a function, textGeneration, to make the API call and handle the exceptions.

  • Line 34: We call the textGeneration function to invoke the endpoint.

Response fields

The API call above returns a dictionary object or a list of dictionary objects, depending on the inputs. The response contains the following field.

  • generated_text (String): The text generated from the inputs.
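As a sketch of how to read this response, the snippet below extracts generated_text from a sample payload shaped like the API's list-of-objects output; the sample text is made up, not a real model response:

```javascript
// Hypothetical sample response, shaped like the API's list-of-objects output
const sampleResponse = [
  { generated_text: "GPT-3 is an autoregressive language model that ..." }
];

// Each element holds one generated sequence; take the first result's text
const firstResult = sampleResponse[0].generated_text;
console.log(firstResult);
```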

Examples

Text generation can help with several tasks, like question answering and language translation. Replace the inputs at lines 9–10 in the code widget above with the examples provided below.

We give the model a question, backed by some context, and it returns the answer.

inputs: "Context: Richard Feynman was a Physicist. Being one of the most famous scientist ever, he is still remembered in the scientific society.\
Question: Who was Richard Feynman?"

We can also perform translations like:

inputs: "Translate from English to French: The lesson is finished. Jump to the notebook now."

Let's run the same examples with some optional parameters and observe their effect on the generated text.

// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/Michau/t5-base-en-generate-headline";
const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};
// Input text to generate from
const data = JSON.stringify({
  inputs: "Context: Richard Feynman was a Physicist. Being one of the most famous scientist ever, he is still remembered in the scientific society.\
Question: Who was Richard Feynman?",
  parameters: {
    max_length: 50,
    temperature: 0.1
  },
  options: {
    wait_for_model: true
  }
});
const options = {
  method: "POST",
  headers: headerParameters,
  body: data
};
async function textGeneration() {
  try {
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}
textGeneration();

We use the Michau/t5-base-en-generate-headline model and set max_length to 50 and temperature to 0.1.