Mastering Sentiment Analysis using Amazon Comprehend

Mar 05, 2025

content

What is sentiment analysis?

How does sentiment analysis work?

Sentiment analysis approaches

What is Amazon Comprehend?

Sentiment analysis with Amazon Comprehend

The DetectSentiment operation:

The BatchDetectSentiment operation:

The StartSentimentDetectionJob operation:

Real-time sentiment analysis

Example: Sentiment analysis on customer reviews

Case study: Learn how Zillow has built speech analytics infrastructure

Conclusion and next steps

Key takeaways:
Amazon Comprehend is a fully managed natural language processing (NLP) service provided by AWS. It uses machine learning (ML) and deep learning techniques to analyze and understand large volumes of text data.
Amazon Comprehend can detect sentiment in text, whether in a single document, a batch of documents, or a large-scale asynchronous job.
We can perform real-time sentiment analysis by integrating Amazon Comprehend with other AWS services.

Do you know what your customers think about your product? Extracting meaningful insights can be challenging with their voices lost in mountains of unstructured text—tweets, reviews, and support emails. Businesses must turn this heap of text into actionable intelligence to make informed decisions. This is exactly where sentiment analysis can help.

What is sentiment analysis?#

Sentiment analysis is a natural language processing (NLP) technique for determining and extracting a text's emotional tone or sentiment. It helps businesses analyze written content—such as customer reviews and social media posts—and allows organizations to gauge how people feel about a topic, product, service, or brand.

For businesses, sentiment analysis, also called opinion mining, is like a secret decoder ring for customer feedback. It helps them transform user feedback into actionable insights and enhance their products and services. For instance, by analyzing its customer reviews, a clothing store can assess the quality of its customer service or identify shortcomings in a premium product, highlighting areas for improvement.

The Atlanta Hawks (https://sproutsocial.com/insights/case-studies/atlanta-hawks/) used sentiment analysis to improve their marketing strategy. In three months, they boosted video views by 127% and audience growth to 170%.

How does sentiment analysis work?#

Typically, the process of sentiment analysis has two main steps:

Preprocessing: This process involves breaking down the sentences into tokens, converting words into their root form using lemmatization (e.g., the root form of “ran” is “run”), and removing the stop words.
Keyword analysis: This step involves analyzing the tokens and assigning a sentiment score. The sentiment or score is a relative measure of a text's positivity, negativity, or neutrality.
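
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the stop-word list, the lemma lookup table, and the positive and negative word sets are all tiny, made-up stand-ins for the much larger resources a real system would use.

```python
import re

# Tiny illustrative resources (assumptions); real systems use full stop-word
# lists, a proper lemmatizer, and a much larger sentiment lexicon.
STOP_WORDS = {"the", "a", "an", "was", "is", "and", "but", "i", "it", "to"}
LEMMAS = {"ran": "run", "loved": "love", "hated": "hate", "delays": "delay"}
POSITIVE = {"love", "great", "good", "fast"}
NEGATIVE = {"hate", "bad", "slow", "delay"}


def preprocess(text: str) -> list[str]:
    """Step 1: tokenize, lemmatize with a lookup table, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    lemmas = [LEMMAS.get(token, token) for token in tokens]
    return [lemma for lemma in lemmas if lemma not in STOP_WORDS]


def score(tokens: list[str]) -> float:
    """Step 2: return (positive hits - negative hits) / number of tokens."""
    if not tokens:
        return 0.0
    hits = sum((token in POSITIVE) - (token in NEGATIVE) for token in tokens)
    return hits / len(tokens)


tokens = preprocess("I loved the food and the service was great.")
print(tokens)         # ['love', 'food', 'service', 'great']
print(score(tokens))  # 0.5
```

A score above zero leans positive, below zero leans negative, and a score near zero suggests neutral or mixed text.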

Sentiment analysis approaches#

There are three common approaches used to design a system for sentiment analysis.

Rule-based approach: This approach utilizes lexicons to classify every word and assign it a score. Lexicons are predefined dictionaries or lists of words and phrases associated with specific sentiments—typically positive, negative, or neutral. Each word in the lexicon is assigned a sentiment score or polarity, which quantifies its emotional tone. These scores are used to analyze text and determine its overall sentiment. This approach, though effective, is difficult to scale because lexicons require frequent updates as new terms emerge and industry-specific vocabulary expands.
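ML approach: This approach trains a machine learning model, such as a neural network, on labeled examples to classify text and assign scores. It is quite effective as long as the training data is representative of the text the model will see in practice.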
Hybrid approach: It combines ML and rule-based approaches to improve the system's accuracy and speed up sentiment analysis.
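
To make the hybrid idea concrete, here is one possible sketch: a rule-based lexicon answers when it gives a clear signal, and an ML classifier handles everything else. The lexicon, the 0.3 threshold, and the `stub_model` function are hypothetical; in practice the model argument would be a trained classifier.

```python
from typing import Callable

# Hypothetical polarity lexicon: word -> score in [-1, 1].
LEXICON = {"excellent": 1.0, "good": 0.5, "bland": -0.5, "terrible": -1.0}


def lexicon_score(words: list[str]) -> tuple[float, int]:
    """Return (average polarity of matched words, number of matches)."""
    matched = [LEXICON[word] for word in words if word in LEXICON]
    if not matched:
        return 0.0, 0
    return sum(matched) / len(matched), len(matched)


def hybrid_sentiment(text: str, model: Callable[[str], str],
                     threshold: float = 0.3) -> str:
    """Use the lexicon when its signal is clear; otherwise ask the model."""
    polarity, matches = lexicon_score(text.lower().split())
    if matches and abs(polarity) >= threshold:
        return "POSITIVE" if polarity > 0 else "NEGATIVE"
    return model(text)  # fall back to the ML classifier


def stub_model(text: str) -> str:
    """Stand-in for a trained classifier; always answers NEUTRAL."""
    return "NEUTRAL"


print(hybrid_sentiment("the pasta was excellent", stub_model))  # POSITIVE
print(hybrid_sentiment("it arrived on tuesday", stub_model))    # NEUTRAL
```

The rules are cheap to evaluate, so the model is only consulted for text the lexicon cannot decide.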

Implementing accurate sentiment analysis at scale can be challenging due to the nuances of human language and the need for robust machine-learning models. To address these challenges, AWS offers powerful tools and services, like Amazon Comprehend, that simplify and enhance sentiment analysis.

What is Amazon Comprehend?#

Amazon Comprehend is a fully managed natural language processing (NLP) service that Amazon Web Services (AWS) provides. It uses machine learning (ML) and deep learning techniques to help businesses extract insights from large volumes of text data. It is typically used to perform the following tasks:

Entity recognition: It is a technique for identifying and classifying specific entities in text into predefined categories. These entities typically include names of people, organizations, locations, dates, percentages, monetary values, and more. Amazon Comprehend also supports custom entity recognition, allowing businesses to detect domain-specific terms like product names, brands, or social security numbers (SSNs).
Topic modeling: It is an unsupervised machine learning technique for identifying the hidden topics or themes in a collection of documents. It helps organize, summarize, and understand large amounts of text data by clustering similar words and phrases.
Language detection: Amazon Comprehend can detect the dominant language of a given text through its DetectDominantLanguage API operation (exposed as detect_dominant_language in the AWS SDK for Python).
Sentiment analysis: Used to identify the general sentiment in a text, such as positive, negative, or neutral. We can also use Comprehend for targeted sentiment analysis to understand sentiments toward certain entities more granularly. For example, what do people think about the latest iPhone design or the new Marvel movie released this holiday season?
PII identification and redaction: Personally identifiable information refers to any information that can be used to identify an individual. Amazon Comprehend helps detect and remove PII from datasets and text.
Toxicity detection: Comprehend can detect toxicity in text-based documents using simple NLP-based techniques.
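
As a rough sketch of how two of these tasks combine, the function below detects the dominant language of a text and then passes that language code to sentiment detection. It assumes the AWS SDK for Python (boto3) and takes the Comprehend client as a parameter; `analyze_text` is a hypothetical helper name, not part of the SDK.

```python
def analyze_text(comprehend, text: str) -> dict:
    """Detect the dominant language, then the sentiment, of one document.

    `comprehend` is assumed to be a boto3 Comprehend client.
    """
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    # Pick the language Comprehend is most confident about.
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    result = comprehend.detect_sentiment(Text=text, LanguageCode=language_code)
    return {
        "language": language_code,
        "sentiment": result["Sentiment"],    # POSITIVE, NEGATIVE, NEUTRAL, or MIXED
        "scores": result["SentimentScore"],  # confidence per sentiment label
    }
```

With boto3 installed and AWS credentials configured, it could be called as `analyze_text(boto3.client("comprehend"), "The checkout process was quick and painless.")`.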

The flags of the start-sentiment-detection-job command are used to specify the following:

--input-data-config: Specifies the input S3 URI and the input format, which is either ONE_DOC_PER_LINE (each line in a file represents one document or data sample) or ONE_DOC_PER_FILE (each document or data sample is stored in its own file).
--output-data-config: This is the S3 URI where results will be saved.
--data-access-role-arn: This is an IAM role with permissions to access S3 and run Comprehend jobs. An IAM (Identity and Access Management) role is a set of permissions defining actions allowed or denied on AWS resources. A role can be assumed by any user, user group, or AWS resource.
--language-code: This is the language of the input data (e.g., “en”).
--job-name: This is a unique name for the job.
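
The same job can also be started from Python. The sketch below assumes the AWS SDK for Python (boto3), whose start_sentiment_detection_job parameters mirror the flags above. The helper names `sentiment_job_request` and `start_job` are hypothetical, and every S3 URI, role ARN, and job name passed in would be your own value.

```python
def sentiment_job_request(input_s3_uri, output_s3_uri, role_arn, job_name,
                          language_code="en", input_format="ONE_DOC_PER_LINE"):
    """Build keyword arguments that correspond to the CLI flags listed above."""
    if input_format not in ("ONE_DOC_PER_LINE", "ONE_DOC_PER_FILE"):
        raise ValueError(f"unsupported input format: {input_format}")
    return {
        "InputDataConfig": {"S3Uri": input_s3_uri, "InputFormat": input_format},
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
        "LanguageCode": language_code,
        "JobName": job_name,
    }


def start_job(comprehend, **request):
    """Submit the asynchronous job with a boto3 Comprehend client; return its ID."""
    response = comprehend.start_sentiment_detection_job(**request)
    return response["JobId"]
```

Keeping the request builder separate from the API call makes it easy to validate the configuration before anything is submitted.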

If successfully executed, the command returns a JSON response that includes the JobId, JobArn, and JobStatus of the new job.

These approaches cover sentiment analysis on data in offline or batch processing mode.

Traditionally, businesses have taken the “store now, analyze later” approach to customer feedback. They build architectures to collect and stash feedback in a data warehouse, leaving the heavy lifting of analysis to data scientists.

This approach is also used for sentiment analysis, but it is slow. Data can take days to flow through the pipeline to the data warehouse, where it waits its turn to be analyzed in batches. By the time insights about customer sentiment make their way into a report, hours or even days have passed.

In today’s fast-moving world, delayed insights might as well be no insights. Businesses need real-time answers, not yesterday’s news.

Real-time sentiment analysis#

Companies can gain a competitive edge in a fast-paced business environment by analyzing real-time customer feedback. For instance, customer service teams can quickly identify dissatisfaction and take immediate action, improving customer experience and strengthening brand loyalty.

Amazon Comprehend enables this by seamlessly integrating into business workflows to extract sentiment insights from text. Connecting with services like Amazon Kinesis for real-time data streaming and AWS Lambda for serverless processing enables instant analysis without complex infrastructure.

Example: Sentiment analysis on customer reviews#

Consider a restaurant that uses the Comprehend API to analyze user reviews as they come in.

Users post their reviews on the restaurant’s mobile application, and the reviews are captured in real time by Amazon Kinesis Data Streams. New records in the stream trigger a Lambda function, which performs sentiment analysis on each review and pushes the result to an S3 bucket.
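
A Lambda function for this flow might look like the sketch below. It assumes the reviews arrive as UTF-8 English text, that boto3 is available (it ships with the AWS Lambda Python runtime), and that the bucket name comes from a hypothetical RESULTS_BUCKET environment variable. The "sentiment/" key prefix is also an arbitrary choice.

```python
import base64
import json


def process_records(event, comprehend, s3, bucket):
    """Score each review in a Kinesis event and store the result in S3."""
    stored_keys = []
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded.
        review = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        result = comprehend.detect_sentiment(Text=review, LanguageCode="en")
        body = json.dumps({
            "review": review,
            "sentiment": result["Sentiment"],
            "scores": result["SentimentScore"],
        })
        key = f"sentiment/{record['kinesis']['sequenceNumber']}.json"
        s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
        stored_keys.append(key)
    return stored_keys


def lambda_handler(event, context):
    """Lambda entry point; clients are created here so the logic above stays testable."""
    import os
    import boto3

    return process_records(
        event,
        boto3.client("comprehend"),
        boto3.client("s3"),
        os.environ["RESULTS_BUCKET"],  # hypothetical environment variable
    )
```

Passing the clients into `process_records` keeps the review-handling logic independent of AWS, so it can be exercised locally with stub objects.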

Beyond analyzing overall sentiment, we can use targeted sentiment analysis to detect customer opinions on specific menu items or branch locations.

For example, a customer might leave a review such as:

“I enjoyed the Penne Arrabbiata; it was flavorful. However, the Chicken Kiev was bland.”

Or a customer could leave a review like:

“My food at the Seattle restaurant was really good, but the service was slow.”

Here, targeted sentiment analysis can help us identify in real time whether the Chicken Kiev has been bland lately or whether the Seattle restaurant had service issues last night.
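
As a sketch of what this could look like in code, the function below calls the DetectTargetedSentiment operation through a boto3 Comprehend client and collects the sentiment attached to each mention. The helper name `sentiment_by_target` is hypothetical, and the simplification of keeping one sentiment per mention text is our own.

```python
def sentiment_by_target(comprehend, review: str, language_code: str = "en") -> dict:
    """Map each mentioned target (e.g., a dish) to the sentiment expressed toward it."""
    response = comprehend.detect_targeted_sentiment(
        Text=review, LanguageCode=language_code
    )
    targets = {}
    for entity in response["Entities"]:
        # An entity groups every mention of the same thing, including pronouns.
        for mention in entity["Mentions"]:
            targets[mention["Text"]] = mention["MentionSentiment"]["Sentiment"]
    return targets
```

If the same text is mentioned more than once, this sketch keeps only the last sentiment seen, so a fuller version would probably aggregate the mentions instead.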

This architecture can be further improved by integrating a few more services. For example, by integrating Amazon SNS and Amazon SQS, we can immediately alert management about reviews that mention slow service.
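
One way to wire up such an alert is sketched below, assuming a boto3 SNS client and an existing topic. The `alert_if_negative` helper, the 0.8 threshold, and the message format are illustrative choices rather than a prescribed setup.

```python
def alert_if_negative(sns, topic_arn: str, review: str, result: dict,
                      threshold: float = 0.8) -> bool:
    """Publish an alert for a confidently negative review; return True if sent.

    `result` is assumed to be the response of Comprehend's detect_sentiment call.
    """
    negative_score = result["SentimentScore"]["Negative"]
    if result["Sentiment"] != "NEGATIVE" or negative_score < threshold:
        return False
    sns.publish(
        TopicArn=topic_arn,
        Subject="Negative customer review",
        Message=f"Score {negative_score:.2f}: {review}",
    )
    return True
```

The threshold keeps borderline reviews from flooding the topic, and subscribers (email, SMS, or an SQS queue) receive only the clear-cut cases.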

We can also add an Amazon QuickSight dashboard to provide visualizations and color-coded trends, enhancing your analysis.

Case study: Learn how Zillow has built speech analytics infrastructure#

Zillow is a leading online real estate marketplace in the United States, offering services for buying, selling, renting, and financing homes. They wanted to build an application to analyze customer sentiment in support calls.

To achieve this, Zillow used Amazon Transcribe to convert audio files into text and Amazon Comprehend to perform sentiment analysis on the transcripts.

Let’s briefly overview how everything comes together to make the application work:

S3 bucket: Recorded phone calls are stored in an S3 bucket. Each upload triggers a Step Functions workflow through an Amazon EventBridge rule.
Step Functions: Orchestrates the workflow by invoking two key services:
- Amazon Transcribe: Converts audio into text and removes PII (Personally Identifiable Information) for customer privacy.
- Amazon Comprehend: Performs sentiment analysis on the transcribed text and stores the results in an S3 bucket.
Amazon S3 (intermediate storage): Stores the sentiment analysis results, which are enriched with call metadata (e.g., location, service type).
Elasticsearch: Stores the enriched data, enabling internal teams to search it by fields such as phone number.

This workflow demonstrates how AWS services can be combined to build end-to-end speech analytics solutions. Whether using legacy monolithic applications or serverless architectures, you can easily integrate sentiment analysis into your systems by invoking the Amazon Comprehend API in just a few steps.
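
To give a feel for the two core steps, here is a rough sketch that is not Zillow's actual code. It assumes boto3 clients for Amazon Transcribe and Amazon Comprehend, English-language calls, and transcripts short enough for a single DetectSentiment request (the operation limits document size, so longer transcripts would need to be split). The helper names and metadata fields are hypothetical.

```python
def transcription_request(job_name: str, audio_s3_uri: str, output_bucket: str) -> dict:
    """Arguments for transcribe.start_transcription_job with PII redaction enabled."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": audio_s3_uri},
        "LanguageCode": "en-US",
        "OutputBucketName": output_bucket,
        # Ask Transcribe to redact PII and return only the redacted transcript.
        "ContentRedaction": {"RedactionType": "PII", "RedactionOutput": "redacted"},
    }


def enrich_with_metadata(comprehend, transcript: str, call_metadata: dict) -> dict:
    """Score a transcript and merge the result with call metadata for indexing."""
    result = comprehend.detect_sentiment(Text=transcript, LanguageCode="en")
    return {
        **call_metadata,
        "sentiment": result["Sentiment"],
        "scores": result["SentimentScore"],
    }
```

The first helper's output would be passed as `transcribe.start_transcription_job(**request)`, and the enriched record from the second is the kind of document that could then be indexed for search.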

Conclusion and next steps#

Sentiment analysis is a transformative tool for businesses, allowing them to extract actionable insights from unstructured text data. By leveraging techniques like natural language processing and tools like Amazon Comprehend, organizations can effectively analyze customer feedback, social media posts, and reviews to gauge public sentiment. This enables better decision-making, improved customer experiences, and the development of more targeted strategies, making sentiment analysis an essential component of modern business intelligence.

Ready to see sentiment analysis in action? Try the Educative Cloud Lab and start building your sentiment analysis pipeline today!

Frequently Asked Questions

What kind of text can Amazon Comprehend analyze for sentiment?

Amazon Comprehend can analyze text data, including customer reviews, social media posts, emails, tweets, surveys, and other forms of unstructured text. It works with various languages, such as English, Spanish, German, and French.

How accurate is sentiment analysis with Amazon Comprehend?

Amazon Comprehend uses state-of-the-art deep learning models that provide highly accurate sentiment analysis, especially for commonly used languages and popular domains.

Can I customize sentiment analysis for my specific needs?

While Amazon Comprehend provides pretrained models for sentiment analysis, we can customize it using Comprehend Custom to train models specific to our use case, such as analyzing industry-specific language or domain-specific sentiment.

Can we perform sentiment analysis on audio files in AWS?

You can perform sentiment analysis on audio files in AWS by combining Amazon Transcribe and Amazon Comprehend. First, use Amazon Transcribe to convert the audio into text, then analyze the transcribed text with Amazon Comprehend to detect sentiment.

Written By:

Zainab Mohsin