Object Detection
Learn how to perform object detection using the Hugging Face Inference API.
Object detection is similar to image classification, but instead of labeling the image as a whole, it locates individual object instances within it. It can also be used for object tracking and image annotation. The API call takes an image as input and returns, for each detected object, a label, its likelihood score, and the bounding box around it.
Finding objects in an image using the API
The facebook/detr-resnet-50 model is recommended for the object detection task. However, many other models are available for this task; some common ones are listed below:
Models for Object Detection
Model | Description |
| Combines a convolutional neural network with a transformer and adds two prediction heads: one for the class of the detected object, the other for its bounding box. Trained on 118 thousand labeled images. |
| Trained on the COCO 2017 dataset, which contains around 118 thousand labeled images. Can be used for object detection. |
We can call the following endpoint with a POST request for object detection, replacing the path parameter {model} with any of the models mentioned above:
https://api-inference.huggingface.co/models/{model}
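As a minimal sketch of the substitution described above, the snippet below fills the {model} path parameter with a model ID. The helper name buildEndpointUrl and the BASE_URL constant are assumptions made for illustration; they are not part of the API or of the lesson's code.

```javascript
// Base URL of the Inference API, as given in this lesson.
const BASE_URL = "https://api-inference.huggingface.co/models";

// Hypothetical helper (not part of the API): fills in the {model} path parameter.
function buildEndpointUrl(model) {
  if (typeof model !== "string" || model.trim() === "") {
    throw new Error("A model ID such as 'facebook/detr-resnet-50' is required");
  }
  return `${BASE_URL}/${model.trim()}`;
}

console.log(buildEndpointUrl("facebook/detr-resnet-50"));
// Prints: https://api-inference.huggingface.co/models/facebook/detr-resnet-50
```

Keeping the model ID separate from the base URL makes it easy to try another model from the table without editing the rest of the request code.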
Request parameters
This endpoint takes only the binary representation of an image file.
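Because the request body is just the raw image bytes, the same request can be built from a local file as well as from a downloaded image. The following is a sketch under stated assumptions: readImageFile and buildRequestOptions are hypothetical helpers, the file path in the usage comment is a placeholder, and the token argument stands in for your own access token.

```javascript
const fs = require("fs");

// Hypothetical helper: reads a local image file into a Buffer of raw bytes.
function readImageFile(path) {
  return fs.readFileSync(path);
}

// Hypothetical helper: builds the fetch options for a binary image body.
// The image bytes are sent as-is; no JSON wrapping is involved.
function buildRequestOptions(imageBytes, token) {
  if (!Buffer.isBuffer(imageBytes) || imageBytes.length === 0) {
    throw new Error("imageBytes must be a non-empty Buffer");
  }
  return {
    method: "POST",
    headers: { "Authorization": `Bearer ${token}` },
    body: imageBytes
  };
}

// Usage (placeholder path and token):
// const options = buildRequestOptions(readImageFile("cat.jpg"), "ACCESS_TOKEN");
```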
The code below detects a cat in an image. The image is read directly from a URL.
// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50";

const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};

async function detectObjects(imgUrl) {
  try {
    // Reading image from URL
    const imgResponse = await fetch(imgUrl);
    const buffer = await imgResponse.buffer();

    const options = {
      method: "POST",
      headers: headerParameters,
      body: buffer
    };

    // Calling endpoint URL for object detection
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}

// URL of the image
let imgUrl = "https://images.unsplash.com/photo-1604675223954-b1aabd668078";
detectObjects(imgUrl);
Line 2 sets the endpoint URL in the endpointUrl constant, with facebook/detr-resnet-50 as the model.
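The listing calls printResponse and printError without defining them, so they are presumably supplied by the lesson's widget. Outside the widget, a rough equivalent might look like the sketch below. The helper name queryDetections and its fetchImpl parameter are assumptions for illustration; fetchImpl defaults to a global fetch and can be swapped for another fetch-compatible function.

```javascript
// Hypothetical helper: sends the request and returns the parsed JSON body.
// `fetchImpl` is any fetch-compatible function (defaults to the global fetch).
async function queryDetections(endpointUrl, options, fetchImpl = fetch) {
  const response = await fetchImpl(endpointUrl, options);

  // Surface HTTP errors (for example, an invalid token) instead of
  // trying to read detections from an error body.
  if (!response.ok) {
    const detail = await response.text();
    throw new Error(`Request failed with status ${response.status}: ${detail}`);
  }

  return response.json();
}

// Usage (inside an async function, with options built as in the listing):
// const detections = await queryDetections(endpointUrl, options);
// console.log(detections);
```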
Response fields
This API call returns a list with one entry per detected object. Each entry contains the predicted label, its likelihood score, and the bounding box of the detected object in the image.
Parameter | Type | Description |
score | Float | Specifies the likelihood score of the label |
label | String | Specifies the predicted label of the detected object |
box | Object | Contains the bounding box coordinates of the detected object. Coordinates can be accessed by the names xmin, ymin, xmax, and ymax |
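To make the response fields concrete, here is a small sketch that post-processes a response. It assumes each entry carries score, label, and box fields, with the box coordinates named xmin, ymin, xmax, and ymax. The sampleResponse values are made up for illustration and are not real model output, and summarizeDetections is a hypothetical helper.

```javascript
// Illustrative response; the values are invented for this sketch.
const sampleResponse = [
  { score: 0.98, label: "cat", box: { xmin: 40, ymin: 60, xmax: 300, ymax: 420 } },
  { score: 0.35, label: "couch", box: { xmin: 0, ymin: 200, xmax: 500, ymax: 480 } }
];

// Hypothetical helper: keeps detections at or above a score threshold and
// adds the pixel width and height of each bounding box.
function summarizeDetections(detections, minScore = 0.5) {
  return detections
    .filter((d) => d.score >= minScore)
    .map((d) => ({
      label: d.label,
      score: d.score,
      width: d.box.xmax - d.box.xmin,
      height: d.box.ymax - d.box.ymin
    }));
}

console.log(summarizeDetections(sampleResponse));
// Only the "cat" entry passes the default 0.5 threshold (width 260, height 360).
```

Filtering on the score is a common way to discard low-confidence detections; the 0.5 default here is an arbitrary choice for the example.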
// Example #1
let imgUrl = "https://images.unsplash.com/photo-1599889959407-598566c6e1f1?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=500&q=40"

// Example #2
let imgUrl = "https://images.unsplash.com/photo-1643251935745-4209d215f221?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1200&q=40"
Try out the images above by replacing the value of imgUrl at line 29 in the widget below, and observe the results.
// Endpoint URL
const endpointUrl = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50";

const headerParameters = {
  "Authorization": "Bearer {{ACCESS_TOKEN}}"
};

async function detectObjects(imgUrl) {
  try {
    // Reading image from URL
    const imgResponse = await fetch(imgUrl);
    const buffer = await imgResponse.buffer();

    const options = {
      method: "POST",
      headers: headerParameters,
      body: buffer
    };

    // Calling endpoint URL for object detection
    const response = await fetch(endpointUrl, options);
    printResponse(response);
  } catch (error) {
    printError(error);
  }
}

// URL of the image
let imgUrl = "https://images.unsplash.com/photo-1599889959407-598566c6e1f1?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=500&q=40";

// Rendering HTML to show the image (src is quoted because the URL contains "=" and "&")
console.log(`<img src="${imgUrl}" width="400px" height="500px">`);
detectObjects(imgUrl);