Text Classification
Learn to perform text classification tasks using the Hugging Face Inference API.
Text classification can be used to infer the type of the given text. For example, it can help determine if a book is a success based on the reviews, determine whether the reviews are positive or negative, determine an author's tone in a passage, or verify whether a sentence or passage is grammatically correct.
Sentiment analysis
Have you ever wondered how companies like Amazon know if a certain product is a success or a flop based on customer reviews? Thanks to NLP, we can perform sentiment analysis. In sentiment analysis, we take a sentence and infer if it's positive, negative, or neutral.
The distilbert-base-uncased-finetuned-sst-2-english model is recommended for text classification. Many other models are available for this task; some common ones are listed below:
Models for Text Classification
Model | Description |
| Developed by the Hugging Face team for text classification. Base model is |
| Trained on around 58 million tweets for sentiment analysis. Labels will be predicted from "Negative," "Positive," and "Neutral." Works well for the English language. |
| This model is for financial text classification. The model outputs "positive," "neutral," and "negative" labels. |
| Trained on around 58 million tweets for sentiment analysis and fine-tuned for emotional text analysis. Outputs labels like "anger," "joy," "sadness," and others. |
| Trained on around 40 thousand tweets for sentiment analysis. The model outputs "POS," "NEG," and "NEU." |
We can call the following endpoint with a POST request for text classification, replacing the path parameter {model} with any model mentioned above:
https://api-inference.huggingface.co/models/{model}
Request parameters
The endpoint above takes the following request parameters:
Parameter | Type | Category | Description |
| String | Required | A string or string [ ] to be classified |
| Boolean | Optional | The Hugging Face Inference API has a cache mechanism to speed up requests. Use it for deterministic models. Default value is |
| Boolean | Optional | Hugging Face Inference API models take time to initialize and process requests. If the value is |
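The endpoint and request parameters above can be assembled with a small helper. This is only a sketch: `buildClassificationRequest` is a hypothetical name, `use_cache` is assumed to be the name of the cache option, and the default values chosen here are assumptions as well. Only `inputs` and `options.wait_for_model` appear in this lesson's own code.

```javascript
// Hypothetical helper: builds the URL and fetch options for a text classification call.
// Assumptions: the cache option is named `use_cache`, and the defaults below
// (cache on, no waiting for the model) are illustrative choices.
const BASE_URL = "https://api-inference.huggingface.co/models";

function buildClassificationRequest(model, inputs, accessToken,
    { useCache = true, waitForModel = false } = {}) {
  if (typeof inputs !== "string" && !Array.isArray(inputs)) {
    throw new TypeError("inputs must be a string or an array of strings");
  }
  return {
    url: `${BASE_URL}/${model}`,
    options: {
      method: "POST",
      headers: { "Authorization": `Bearer ${accessToken}` },
      body: JSON.stringify({
        inputs,
        options: { use_cache: useCache, wait_for_model: waitForModel }
      })
    }
  };
}

// Usage: nothing is sent here; the result can later be passed to fetch(url, options).
const request = buildClassificationRequest(
  "distilbert-base-uncased-finetuned-sst-2-english",
  "Call me Ishmael.",
  "{{ACCESS_TOKEN}}",
  { waitForModel: true }
);
console.log(request.url);
console.log(request.options.body);
```

Keeping the payload construction separate from the network call makes it easy to inspect the body before sending it.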
The code below gives an example of text classification. Below, we apply it to one of the most iconic opening lines from Herman Melville's classic, Moby-Dick (1851):
// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english";
const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};

// Input text to classify
const data = JSON.stringify({
  inputs: "Call me Ishmael. Some years ago—never \
mind how long precisely—having little or no money in my purse, \
and nothing particular to interest me on shore, I thought I would \
sail about a little and see the watery part of the world. It is a \
way I have of driving off the spleen, and regulating the circulation.",
  options: {
    wait_for_model: true
  }
});

const options = {
  method: "POST",
  headers: headerParameters,
  body: data
};

async function textClassification() {
  try {
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}

textClassification();
Let’s have a look at the key lines in the code above:
Line 2: We specify the endpoint URL with the distilbert-base-uncased-finetuned-sst-2-english model.
Lines 8–17: We provide the input text for the classification and set an additional options.wait_for_model parameter to true.
Lines 25–32: We create a function, textClassification, to call the API and handle exceptions.
Line 34: We call the textClassification function to invoke the endpoint.
Response fields
This API call returns a dictionary, or a list of dictionaries, containing the possible labels for the text ordered by likelihood score. The labels are model specific, so other classification models may return labels other than "Positive" and "Negative."
Parameter | Type | Description |
| String | Specifies predicted label of the text |
| Float | Specifies likelihood score of the text |
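Since each entry carries a label and a score, the most likely label can be picked out with a short helper. The sketch below is a hypothetical example: `topPrediction` is not part of the API, the field names `label` and `score` are assumed from the table above, and the sample array is made-up data rather than real model output.

```javascript
// Hypothetical helper: returns the { label, score } entry with the highest score.
// It compares scores explicitly instead of relying on the order of the entries.
function topPrediction(predictions) {
  if (!Array.isArray(predictions) || predictions.length === 0) {
    throw new Error("expected a non-empty array of { label, score } objects");
  }
  return predictions.reduce((best, current) =>
    current.score > best.score ? current : best
  );
}

// Made-up sample in the assumed shape (not real model output).
const samplePredictions = [
  { label: "NEGATIVE", score: 0.25 },
  { label: "POSITIVE", score: 0.75 }
];

const bestPrediction = topPrediction(samplePredictions);
console.log(`${bestPrediction.label} (${(bestPrediction.score * 100).toFixed(1)}%)`);
```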
Try the following examples by changing the inputs in lines 9–13 of the code above.
inputs: "Dickens is one of those writers who are \
well worth stealing. Even the burial of his body in Westminster \
Abbey was a species of theft, if you come to think of it"
The following is an example of classifying multiple texts in one API call.
inputs: ["To be, or not to be","With mirth and laughter let old wrinkles come"]
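When several texts are sent in one call, it helps to match each prediction back to its input. The sketch below assumes the response is an array with one list of `{ label, score }` entries per input, in the same order as the inputs; that shape is an assumption here and should be checked against the actual response. `pairWithTopLabels` is a hypothetical helper and the scores are made up for illustration.

```javascript
// Hypothetical helper: pairs each input text with its highest-scoring label.
// Assumption: results[i] holds the { label, score } entries for inputs[i].
function pairWithTopLabels(inputs, results) {
  if (inputs.length !== results.length) {
    throw new Error("inputs and results must have the same length");
  }
  return inputs.map((text, i) => {
    const top = results[i].reduce((a, b) => (b.score > a.score ? b : a));
    return { text, label: top.label, score: top.score };
  });
}

const texts = [
  "To be, or not to be",
  "With mirth and laughter let old wrinkles come"
];

// Made-up scores for illustration only (not real model output).
const sampleResults = [
  [{ label: "NEGATIVE", score: 0.6 }, { label: "POSITIVE", score: 0.4 }],
  [{ label: "POSITIVE", score: 0.9 }, { label: "NEGATIVE", score: 0.1 }]
];

const paired = pairWithTopLabels(texts, sampleResults);
paired.forEach((p) => console.log(`${p.label}: ${p.text}`));
```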
Zero-shot classification
Zero-shot classification is an ML technique in which we ask the model to classify text into labels that it did not see during training. The model infers these unseen labels from the concepts it learned from its training data. To classify short or long text against user-provided labels, the recommended model is facebook/bart-large-mnli. Many other models are available for this task; some common ones are listed below:
Models for Zero-Shot Classification
Model | Description |
| Trained on the |
| Trained on the dataset of 15 languages and based on the |
| Trained by taking the alternative layers of the |
| Based on |
We can call the following endpoint with a POST request for zero-shot classification, replacing the path parameter {model} with any model mentioned above:
https://api-inference.huggingface.co/models/{model}
Request parameters
The API call above takes the following request parameters:
Parameter | Type | Category | Description |
| String | Required | A string or string [ ] to be classified |
| String [ ] | Required | A list of candidate labels for |
| Boolean | Optional | Default value is |
| Boolean | Optional | The Hugging Face Inference API has a cache mechanism to speed up requests. Use it for deterministic models. Default value is |
| Boolean | Optional | Hugging Face Inference API models take time to initialize and process requests. If the value is |
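A zero-shot request differs from plain text classification mainly in its `parameters` object. The sketch below builds such a body under a few assumptions: `buildZeroShotBody` is a hypothetical helper, the optional Boolean parameter is assumed to be named `multi_label`, and the defaults used here are illustrative. Only `inputs`, `parameters.candidate_labels`, and `options.wait_for_model` appear in this lesson's own code.

```javascript
// Hypothetical helper: builds the JSON body for a zero-shot classification request.
// Assumption: the optional Boolean parameter is named `multi_label`.
function buildZeroShotBody(inputs, candidateLabels,
    { multiLabel = false, waitForModel = false } = {}) {
  if (!Array.isArray(candidateLabels) || candidateLabels.length === 0) {
    throw new Error("candidateLabels must be a non-empty array of strings");
  }
  return JSON.stringify({
    inputs,
    parameters: { candidate_labels: candidateLabels, multi_label: multiLabel },
    options: { wait_for_model: waitForModel }
  });
}

// Usage: the returned string can be used as the `body` of a POST request.
const zeroShotBody = buildZeroShotBody(
  "Machine Learning skills are some of the most sought-after in the modern job market.",
  ["Software Engineer", "Machine Learning Engineer", "Electrical Engineer"],
  { waitForModel: true }
);
console.log(zeroShotBody);
```

Validating the label list up front avoids sending a request that has nothing to classify against.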
The code below classifies the text against the provided candidate_labels, which are set on line 13.
// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/facebook/bart-large-mnli";
const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};

// Input text to classify
const data = JSON.stringify({
  inputs: "Machine Learning skills are some of the most sought-after in the modern \
job market. Modern ML Engineers make dozens of thousands of dollars more per year \
than other developers. ",
  parameters: {
    candidate_labels: ["Software Engineer", "Machine Learning Engineer", "Electrical Engineer"]
  },
  options: {
    wait_for_model: true
  }
});

const options = {
  method: "POST",
  headers: headerParameters,
  body: data
};

async function textClassification() {
  try {
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}

textClassification();
Response fields
This API call returns a dictionary, or a list of dictionaries, containing the input text, the candidate labels ordered by likelihood score, and the corresponding scores.
Parameter | Type | Description |
| String | Specifies the input text whose label is predicted |
| String [ ] | An array of the labels that we have provided for the zero-shot classification |
| Float [ ] | An array of the likelihood scores corresponding to the |
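Because the labels and scores arrive as two parallel arrays, a common next step is to zip them together and keep only the confident ones. The sketch below assumes the response fields are named `sequence`, `labels`, and `scores`; those names, the hypothetical `labelsAbove` helper, and the 0.5 threshold are assumptions, and the sample object is made-up data rather than real model output.

```javascript
// Hypothetical helper: zips labels with scores and keeps entries at or above a threshold.
// Assumption: the response has parallel `labels` and `scores` arrays.
function labelsAbove(result, threshold = 0.5) {
  if (result.labels.length !== result.scores.length) {
    throw new Error("labels and scores must have the same length");
  }
  return result.labels
    .map((label, i) => ({ label, score: result.scores[i] }))
    .filter((entry) => entry.score >= threshold)
    .sort((a, b) => b.score - a.score);
}

// Made-up sample in the assumed shape (not real model output).
const sampleZeroShot = {
  sequence: "Machine Learning skills are some of the most sought-after in the modern job market.",
  labels: ["Machine Learning Engineer", "Software Engineer", "Electrical Engineer"],
  scores: [0.7, 0.2, 0.1]
};

const confident = labelsAbove(sampleZeroShot);
confident.forEach((entry) => console.log(`${entry.label}: ${entry.score}`));
```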