Image Description, Category, and Tag

Learn how to get the description, category, and tags of an image.

In this lesson, we’ll go over three image analysis features and compare how they work on the same image. We’ll use the same reference image, provided via its URL in each code example, throughout this lesson.

Describe an image

An exciting feature of Microsoft Computer Vision is that it can describe an entire image in human-readable language using complete sentences. The algorithm works like this:

  1. It generates various descriptions based on the objects identified in the image.
  2. It evaluates and assigns a confidence score to each description.
  3. Finally, it returns a list of descriptions in descending order of confidence score.

To get the description of an image, we use the describe_image method. Let’s use it on our reference image.

Note: The image URL is assigned to the remote_image_url variable in the code below. Feel free to change it and test the API on other images.

# Import the client library and the credentials helper
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials
# Authenticate the client
computervision_client = ComputerVisionClient("{{ENDPOINT}}", CognitiveServicesCredentials("{{SUBSCRIPTION_KEY}}"))
# Provide image URL
remote_image_url = "https://images.pexels.com/photos/356065/pexels-photo-356065.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# Call API
description_results = computervision_client.describe_image(remote_image_url)
# Get the captions (descriptions) from the response, with confidence level
print("Description of remote image: ")
if len(description_results.captions) == 0:
    print("No description detected.")
else:
    for caption in description_results.captions:
        print("'{}' with confidence {:.2f}%".format(caption.text, caption.confidence * 100))

For our reference image, the algorithm returns only one description with a confidence of 47.91%. The confidence level might seem low, but the description is pretty accurate.
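By default, describe_image returns a single caption. The Python SDK also exposes a max_candidates parameter, so a quick way to compare several candidate descriptions is to raise that limit. The following is a minimal sketch, assuming the same client and image URL as above; the value 3 is an arbitrary choice for illustration:

# A minimal sketch: request up to three candidate captions for the same image.
# Assumes computervision_client and remote_image_url are defined as in the snippet above.
multi_results = computervision_client.describe_image(remote_image_url, max_candidates=3)
for caption in multi_results.captions:
    print("'{}' with confidence {:.2f}%".format(caption.text, caption.confidence * 100))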

Apply content tags to images

Microsoft Computer Vision returns tags based on the objects, living beings, and actions identified in the image. Tagging is not limited to the main subject, such as a person in the foreground, but includes background details like the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, and gadgets.

Tags are not organized into a taxonomy, and no inheritance hierarchies exist. Content tags form the foundation for an image description: they are used to create the human-understandable descriptions returned when we call the describe_image method.

Note: At this point, English is the only supported language for the image description feature.

To get the tags of an image, we use the tag_image method. Let’s use it on our reference image.

Note: The image URL is assigned to the remote_image_url variable in the code below. Feel free to change it and test the API on other images.

# Import the client library and the credentials helper
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials
# Authenticate the client
computervision_client = ComputerVisionClient("{{ENDPOINT}}", CognitiveServicesCredentials("{{SUBSCRIPTION_KEY}}"))
# Provide image URL
remote_image_url = "https://images.pexels.com/photos/356065/pexels-photo-356065.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# Call API with remote image
tags_result_remote = computervision_client.tag_image(remote_image_url)
# Print results with confidence score
print("Tags in the remote image: ")
if len(tags_result_remote.tags) == 0:
    print("No tags detected.")
else:
    for tag in tags_result_remote.tags:
        print("'{}' with confidence {:.2f}%".format(tag.name, tag.confidence * 100))

The above code returns twelve tags. Out of those twelve, six tags have a confidence rating greater than 90%.
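Because each tag carries its own confidence score, it's easy to keep only the high-confidence results. Here is a minimal sketch that reuses the tags_result_remote response from the snippet above; the 0.9 threshold is an arbitrary choice for illustration:

# A minimal sketch: keep only tags with at least 90% confidence.
# Assumes tags_result_remote is the response from the tag_image call above.
high_confidence_tags = [tag.name for tag in tags_result_remote.tags if tag.confidence >= 0.9]
print("High-confidence tags:", ", ".join(high_confidence_tags))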

Categorize images by subject matter

In addition to the description, Microsoft Computer Vision can categorize an image broadly or specifically. These categories are organized in a parent/child hierarchy. Unlike the thousands of available tags, there are only 86 categories for the algorithm to use.

Note: Category names are only in English.

The complete list of 86 categories is available in the Microsoft Computer Vision category taxonomy documentation.

To get the category of an image, we use the analyze_image method. In addition, we have to specify the visual feature we want to analyze, which in this case is "categories". Let’s use it on our reference image.

Note: The image URL is assigned to the remote_image_url variable in the code below. Feel free to change it and test the API on other images.

# Import the client library and the credentials helper
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials
# Authenticate the client
computervision_client = ComputerVisionClient("{{ENDPOINT}}", CognitiveServicesCredentials("{{SUBSCRIPTION_KEY}}"))
# Provide image URL
remote_image_url = "https://images.pexels.com/photos/356065/pexels-photo-356065.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# Select the visual feature(s) you want
remote_image_features = ["categories"]
# Call API with URL and features
categorize_results_remote = computervision_client.analyze_image(remote_image_url, remote_image_features)
# Print results with confidence score
print("Categories from remote image: ")
if len(categorize_results_remote.categories) == 0:
    print("No categories detected.")
else:
    for category in categorize_results_remote.categories:
        print("'{}' with confidence {:.2f}%".format(category.name, category.score * 100))
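The analyze_image method isn’t limited to one feature at a time: you can request the description, tags, and categories in a single call by listing all three visual features. The following is a minimal sketch, assuming the same client, image URL, and lowercase feature-name strings used in the snippets above:

# A minimal sketch: request description, tags, and categories in one call.
# Assumes computervision_client and remote_image_url are defined as in the snippets above.
features = ["description", "tags", "categories"]
analysis = computervision_client.analyze_image(remote_image_url, features)
# The combined response nests the captions under the description attribute
for caption in analysis.description.captions:
    print("Caption: '{}' ({:.2f}%)".format(caption.text, caption.confidence * 100))
print("Tags:", ", ".join(tag.name for tag in analysis.tags))
for category in analysis.categories:
    print("Category: '{}' ({:.2f}%)".format(category.name, category.score * 100))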