Token Classification
Learn to perform token classification tasks using the Hugging Face Inference API.
Natural language is difficult for machines to process directly, so we perform some preprocessing before feeding the data into an NLP model. Tokenization allows us to demarcate the parts of a sentence.
Hugging Face also allows us to perform classification on these tokens. There are a couple of popular subtasks:
Named entity recognition (NER)
Part-of-speech (POS) tagging
Named entity recognition (NER)
In named entity recognition (NER), also known as entity identification, the classifier extracts the key pieces of information (entities) from the input text. The `dbmdz/bert-large-cased-finetuned-conll03-english` model is recommended for NER tasks. There are many models available for this task; some common ones are listed below:
Models for NER

| Model | Description |
| --- | --- |
| `dbmdz/bert-large-cased-finetuned-conll03-english` | Based on BERT large (cased), fine-tuned on the CoNLL-2003 English dataset |
In the table above, "O" means the token is outside any named entity, "MISC" is a miscellaneous entity, "PER" is a person's name, "ORG" is an organization, and "LOC" is a location.
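The label meanings above can be captured in a small lookup table. A minimal sketch (the `describeEntity` helper is our own, not part of the API):

```javascript
// CoNLL-2003 entity labels returned by the model, as described above.
const ENTITY_LABELS = {
  O: "outside of a named entity",
  MISC: "miscellaneous entity",
  PER: "person's name",
  ORG: "organization",
  LOC: "location",
};

// Hypothetical helper: translate a label into its plain-English meaning.
function describeEntity(label) {
  return ENTITY_LABELS[label] ?? "unknown label";
}
```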
We can call the following endpoint via the POST request method for NER tasks by replacing the path parameter {model}
with any model mentioned above:
https://api-inference.huggingface.co/models/{model}
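As a quick sketch, substituting the `{model}` path parameter can be done in code (the `buildEndpointUrl` helper name is our own):

```javascript
const BASE_URL = "https://api-inference.huggingface.co/models/";

// Hypothetical helper: build the full endpoint URL for a given model ID.
function buildEndpointUrl(model) {
  return BASE_URL + model;
}

// For example, using the recommended NER model from above:
const url = buildEndpointUrl("dbmdz/bert-large-cased-finetuned-conll03-english");
```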
Request parameters
The request parameters for this API call are as follows:
| Parameter | Type | Category | Description |
| --- | --- | --- | --- |
| `inputs` | String | Required | Specifies a string or `string[]` to be classified |
| `use_cache` | Boolean | Optional | The Hugging Face Inference API has a cache mechanism implemented to speed up requests. Use it for deterministic models. The default value is `true`. |
| `wait_for_model` | Boolean | Optional | Hugging Face Inference API models take time to initiate and process requests. If the value is `true`, the API waits for the model to load instead of returning an error. The default value is `false`. |
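To make the parameters concrete, here is a sketch of a request body that sets both optional flags explicitly (the input string is illustrative):

```javascript
// Build the JSON body for the POST request described above.
const body = JSON.stringify({
  inputs: "Hugging Face is based in New York City.",
  options: {
    use_cache: true,      // reuse cached results for identical inputs
    wait_for_model: true, // wait for the model to load instead of erroring
  },
});
```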
The code below is an example of NER.
```javascript
// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/dbmdz/bert-large-cased-finetuned-conll03-english";

const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};

// Input text to classify
const data = JSON.stringify({
  inputs: "Ptolemy mentions in his Geographia a city called Labokla which may have been in reference to ancient Lahore.",
  options: {
    wait_for_model: true
  }
});

const options = {
  method: "POST",
  headers: headerParameters,
  body: data
};

async function tokenClassification() {
  try {
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}

tokenClassification();
```
Let’s have a look at the highlighted lines shown in the code widget above:
- Line 2: We specify the endpoint URL with the `dbmdz/bert-large-cased-finetuned-conll03-english` model for the NER task.
- Lines 9–10: We provide the input text for the NER classification.
- Lines 22–29: We create a function, `tokenClassification`, to call the API and handle exceptions.
- Line 31: We call the `tokenClassification` function to invoke the endpoint.
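The `printResponse` and `printError` helpers used in the widget are not defined in the lesson; a minimal sketch of what they might look like:

```javascript
// Hypothetical helper: read the JSON body of a fetch Response and pretty-print it.
async function printResponse(response) {
  const result = await response.json();
  console.log(JSON.stringify(result, null, 2));
}

// Hypothetical helper: report a failed request.
function printError(error) {
  console.error("Request failed:", error.message);
}
```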
Response fields
The API call above returns a dictionary object or a list of dictionary objects, depending on the `inputs`. The response contains the following fields:
| Parameter | Type | Description |
| --- | --- | --- |
| `entity_group` | String | Specifies the type of the recognized entity |
| `score` | Float | Specifies the likelihood of the entity |
| `word` | String | Specifies the string whose entity group is recognized |
| `start` | Integer | The starting index of the string. Helpful in case of multiple occurrences of the string. |
| `end` | Integer | The ending index of the string. Helpful in case of multiple occurrences of the string. |
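These fields can then be consumed in code; the sample entities below are illustrative, not actual API output:

```javascript
// Illustrative NER response shape (a list of dictionary objects).
const entities = [
  { entity_group: "PER", score: 0.99, word: "Ptolemy", start: 0, end: 7 },
  { entity_group: "LOC", score: 0.45, word: "Labokla", start: 48, end: 55 },
];

// Keep only confidently recognized entities.
function filterByScore(results, threshold) {
  return results.filter((e) => e.score >= threshold);
}

const confident = filterByScore(entities, 0.9);
```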
Part-of-speech (POS) tagging
Classifying tokens into their respective parts of speech helps an NLP model understand sentence structure. We can use POS tagging for this task. The `vblagoje/bert-english-uncased-finetuned-pos` model is recommended for POS tagging tasks. There are many models available for this task; some common ones are listed below:
Models for POS Tagging

| Model | Description |
| --- | --- |
| `vblagoje/bert-english-uncased-finetuned-pos` | Based on BERT base (uncased), fine-tuned for part-of-speech tagging |
We can call the following endpoint via the POST request method for POS tagging by replacing the path parameter {model}
with any model mentioned above:
https://api-inference.huggingface.co/models/{model}
Request parameters
The request parameters of this API call are the same as for the previous API call. The code below is an example of POS tagging:
```javascript
// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/vblagoje/bert-english-uncased-finetuned-pos";

const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};
// Input text to classify
const data = JSON.stringify({
  inputs: "This chapter will provide an overview of performing common NLP tasks.",
  options: {
    wait_for_model: true
  }
});

const options = {
  method: "POST",
  headers: headerParameters,
  body: data
};

async function posTagging() {
  try {
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}

posTagging();
```
Let’s have a look at the highlighted lines shown in the code widget above:
- Line 2: We specify the endpoint URL with the `vblagoje/bert-english-uncased-finetuned-pos` model for the POS tagging task.
- Line 9: We provide the input text for the POS tagging task.
- Lines 21–28: We create a function, `posTagging`, to call the API and handle exceptions.
- Line 30: We call the `posTagging` function to invoke the endpoint.
Response fields
The response fields of this API call are the same as for the previous API call.
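For POS tagging, the `entity_group` field carries a part-of-speech tag (e.g. `DET`, `NOUN`, `VERB`) rather than a named-entity label. A sketch of pairing each word with its tag; the sample tokens below are illustrative, not actual API output:

```javascript
// Illustrative POS response shape for the input sentence above (subset of tokens).
const tokens = [
  { entity_group: "DET", score: 0.99, word: "this", start: 0, end: 4 },
  { entity_group: "NOUN", score: 0.99, word: "chapter", start: 5, end: 12 },
  { entity_group: "VERB", score: 0.99, word: "provide", start: 18, end: 25 },
];

// Render each word with its tag, e.g. "this/DET chapter/NOUN provide/VERB".
function tagSentence(results) {
  return results.map((t) => `${t.word}/${t.entity_group}`).join(" ");
}
```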