Fill Mask
Learn to fill masked words in text using the Hugging Face Inference API.
In the fill mask task, we mask some words in the text and ask the transformer model to fill them in by suggesting suitable words. This has many uses in NLP, for example: correcting misprinted words in a book, guessing lost words in ancient hieroglyphs or manuscripts, powering word games, or analyzing the language model itself.
Fill mask using the API
The `bert-base-uncased` model is recommended for the fill mask task. However, many models are available for this task; some common ones are listed below:
Models for Fill Mask
| Model | Description |
| --- | --- |
| `bert-base-uncased` | Pretrained on an English dataset, uncased. Learns the sentence bidirectionally to complete the mask. |
| `bert-base-multilingual-cased` | Pretrained on datasets in 104 languages, case sensitive. Learns the sentence bidirectionally to complete the mask. |
| | Based on |
| | Based on |
| | Pretrained on a French dataset, cased. Learns the sentence bidirectionally to complete the mask. |
We can call the following endpoint via the POST request method for the fill mask task, replacing the path parameter `{model}` with any model mentioned above:
https://api-inference.huggingface.co/models/{model}
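As a small sketch, the endpoint URL can be assembled by substituting a model ID into the path parameter (the helper name `buildEndpoint` is ours, not part of the API):

```javascript
// Substitute the {model} path parameter to form the task endpoint.
const buildEndpoint = (model) =>
  `https://api-inference.huggingface.co/models/${model}`;

// Using the model recommended above for the fill mask task.
console.log(buildEndpoint("bert-base-uncased"));
// https://api-inference.huggingface.co/models/bert-base-uncased
```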
Request parameters
The request parameters for this API call are as follows:
| Parameter | Type | Category | Description |
| --- | --- | --- | --- |
| `inputs` | String | Required | Specifies a string or string[] containing the text, with a mask token, to be completed |
| `use_cache` | Boolean | Optional | The Hugging Face Inference API has a cache mechanism implemented to speed up requests. Use it for deterministic models. The default value is `true`. |
| `wait_for_model` | Boolean | Optional | Hugging Face Inference API models take time to initialize and process requests. If the value is `false`, the call returns an error while the model is loading; if `true`, the API waits for the model to be ready. The default value is `false`. |
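Putting these parameters together, the request body might look like the sketch below (the option values shown are illustrative choices, not defaults):

```javascript
// Assemble the request body for the fill mask task.
// `inputs` carries the sentence with the mask token;
// `options` carries the optional cache and model-loading flags.
const body = JSON.stringify({
  inputs: "DNA is the [MASK] of life.",
  options: {
    use_cache: true,      // reuse cached results for identical requests
    wait_for_model: true  // wait for the model to load instead of failing
  }
});

console.log(body);
```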
The widget below presents the code for the fill mask task.
```javascript
// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/bert-base-uncased";

const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};

// Input text with the masked token
const data = JSON.stringify({
  inputs: "DNA is the [MASK] of life.",
  options: {
    wait_for_model: true
  }
});

const options = {
  method: "POST",
  headers: headerParameters,
  body: data
};
async function fillMask() {
  try {
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}

fillMask();
```
Let’s have a look at the highlighted lines shown in the code widget above:

- Line 2: We specify the `bert-base-uncased` model for the fill mask task.
- Line 9: We specify the `inputs`, which include a sentence with the `[MASK]` token.
- Lines 21–28: We create a function, `fillMask`, to make the API call and handle exceptions.
- Line 30: We call the `fillMask` function to invoke the endpoint.
Response fields
The API call above returns a dictionary object or a list of dictionary objects, depending on the `inputs`. The response contains the following fields.
| Parameter | Type | Description |
| --- | --- | --- |
| `sequence` | String | Specifies the complete phrase after filling the mask |
| `score` | Float | The likelihood of the token in the sentence. A higher value means the token is more probable in the sentence, and vice versa. |
| `token` | Integer | Specifies the model-generated ID of the token |
| `token_str` | String | Specifies the token filled in place of the mask |
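To illustrate the shape of the response, the sketch below picks the most likely completion from a list of candidates. The candidate values are made-up placeholders shaped like the fields above, not real model output:

```javascript
// Hypothetical candidates shaped like the documented response fields;
// the scores and token IDs here are illustrative placeholders.
const candidates = [
  { sequence: "dna is the basis of life.", score: 0.42, token: 3978, token_str: "basis" },
  { sequence: "dna is the essence of life.", score: 0.08, token: 11305, token_str: "essence" }
];

// Pick the candidate with the highest score (the most likely fill-in).
const best = candidates.reduce((a, b) => (b.score > a.score ? b : a));
console.log(best.token_str);
// basis
```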
Examples
Try out the following examples and choose the best model for each task. Replace the `inputs` at line 9 in the code widget above with the examples provided below.
Fill the mask in the French language:
inputs: "L’amour est comme un arbre, il pousse de lui-même, jette <special1> ses racines dans tout notre être"
Note: Notice the use of the `<special1>` tag for the mask here. Using the `[MASK]` tag leads to an error.
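Different models expect different mask tokens in the input, which is why `<special1>` appears above. The sketch below keeps a small model-to-token mapping; the mapping and the `{mask}` placeholder convention are our own assumptions based on the respective tokenizers, not something the lesson specifies, so verify the token against each model card:

```javascript
// Assumed mask tokens per model family (check the model card before use).
const maskTokens = {
  "bert-base-uncased": "[MASK]",
  "flaubert/flaubert_base_cased": "<special1>"
};

// Insert the model's mask token into a template containing our
// own {mask} placeholder.
const withMask = (model, template) =>
  template.replace("{mask}", maskTokens[model]);

console.log(withMask("bert-base-uncased", "DNA is the {mask} of life."));
// DNA is the [MASK] of life.
```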
Fill the mask in the protein sequence:
inputs: "D L I P T S S K L V V [MASK] D T S L Q"