Building a Document Processing Pipeline with AWS Services

Learn how to use Amazon’s ML services for document processing. You’ll combine multiple AWS services to automate the document processing cycle from upload to analysis.

9 Tasks

beginner

1hr

Certificate of Completion

Desktop Only
No Setup Required
Amazon Web Services

Learning Objectives

A familiarity with Amazon S3 and the ability to store and retrieve data using S3
The ability to use the IAM service to provide permissions to other services using IAM roles
Hands-on experience in creating a Lambda function to execute a piece of code
The ability to create a sender identity for SES and send emails using it
Hands-on experience in automating data analysis using Amazon S3, Amazon Textract, and Amazon Comprehend
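
To make the IAM objective concrete, here is a minimal sketch of the two policy documents an execution role for this kind of pipeline might carry. The bucket name and the exact set of actions are assumptions for illustration; the lab itself may grant different or broader permissions.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET_NAME = "my-doc-pipeline-bucket"


def build_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_permissions_policy(bucket_name):
    """Permissions the function needs (an assumed, minimal set)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Read input documents and write results to the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {   # Text extraction and analysis calls.
                "Effect": "Allow",
                "Action": [
                    "textract:DetectDocumentText",
                    "comprehend:DetectSentiment",
                    "comprehend:DetectEntities",
                ],
                "Resource": "*",
            },
            {   # Email notification through SES.
                "Effect": "Allow",
                "Action": ["ses:SendEmail"],
                "Resource": "*",
            },
            {   # CloudWatch Logs output from the Lambda function.
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


print(json.dumps(build_trust_policy(), indent=2))
print(json.dumps(build_permissions_policy(BUCKET_NAME), indent=2))
```

The printed JSON could be pasted into the IAM console when creating the role. Scoping the S3 statement to a single bucket is a design choice in this sketch, not something the lab prescribes.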

Technologies

Lambda
CloudWatch
IAM
Textract
Comprehend
S3

Skills Covered

Using AWS Cloud Services
Natural Language Processing
Data Pipeline Engineering

Cloud Lab Overview

Traditionally, documents were analyzed and mined for insights by hand, a time-consuming process with a high probability of errors. Using AI, we can automate this process, making it much faster and less error-prone. To help us do that, Amazon provides AI services such as Textract and Comprehend. Textract extracts data from images and documents and returns it as text. That text can then be fed to Comprehend, a natural language processing (NLP) service that analyzes it and returns the insights we need.
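
The Textract-to-Comprehend handoff described above can be sketched in Python with boto3-style clients. This is an illustration under assumptions, not the lab’s own code: the helper names `extract_text` and `analyze_text` are hypothetical, and the choice of sentiment and entity detection is just one of several analyses Comprehend offers. The clients are passed in as arguments so the functions can be exercised without AWS credentials.

```python
def extract_text(textract_client, bucket, key):
    """Run Textract on an S3 object and return its text, one line per LINE block."""
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Textract returns PAGE, LINE, and WORD blocks; LINE blocks already
    # contain the words of each line, so keeping only those avoids duplicates.
    lines = [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]
    return "\n".join(lines)


def analyze_text(comprehend_client, text, language_code="en"):
    """Ask Comprehend for overall sentiment and named entities (assumed analyses)."""
    sentiment = comprehend_client.detect_sentiment(
        Text=text, LanguageCode=language_code
    )
    entities = comprehend_client.detect_entities(
        Text=text, LanguageCode=language_code
    )
    return {
        "sentiment": sentiment["Sentiment"],
        "entities": [
            {"text": entity["Text"], "type": entity["Type"]}
            for entity in entities["Entities"]
        ],
    }


# Usage with real clients (requires boto3 and AWS credentials):
#   import boto3
#   text = extract_text(boto3.client("textract"), "my-bucket", "input/scan.png")
#   insights = analyze_text(boto3.client("comprehend"), text)
```

Comprehend limits the size of the text it accepts per request, so a longer document would need to be split into chunks before analysis.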

In this Cloud Lab, you’ll learn to automate document processing using multiple Amazon services.

To do that, you’ll first create an S3 bucket where the input and output data will be stored. After that, you’ll create an IAM role to provide necessary permissions to other AWS services. You’ll then create a Lambda function to execute a piece of code that will feed the data stored in the bucket to Textract to convert it to text. This text will then be processed using Comprehend, and the output of Comprehend will be stored in the output folder of this bucket. Finally, you’ll integrate an email service in the pipeline using Amazon SES.
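
The flow above can be sketched as the glue code a Lambda function might run. The sketch assumes the function is invoked by an S3 event notification and that the bucket uses `input/` and `output/` prefixes; the text only mentions an output folder, so the trigger, the `input/` prefix, and all helper names here are assumptions. The `analyze` argument stands in for the Textract and Comprehend step.

```python
import json
from urllib.parse import unquote_plus

INPUT_PREFIX = "input/"    # assumed location of uploaded documents
OUTPUT_PREFIX = "output/"  # output folder mentioned in the overview


def iter_s3_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def output_key_for(key):
    """Map an input key such as input/report.pdf to output/report.pdf.json."""
    name = key[len(INPUT_PREFIX):] if key.startswith(INPUT_PREFIX) else key
    return f"{OUTPUT_PREFIX}{name}.json"


def handle_event(event, s3_client, analyze):
    """Analyze each uploaded document and store the result as JSON.

    `analyze` is any callable taking (bucket, key) and returning a
    JSON-serializable dict.
    """
    written = []
    for bucket, key in iter_s3_objects(event):
        if key.startswith(OUTPUT_PREFIX):
            # Skip our own results so the function does not retrigger itself.
            continue
        result = analyze(bucket, key)
        out_key = output_key_for(key)
        s3_client.put_object(
            Bucket=bucket,
            Key=out_key,
            Body=json.dumps(result).encode("utf-8"),
            ContentType="application/json",
        )
        written.append(out_key)
    return written
```

Inside Lambda, `handle_event` would be called from the function’s handler with `boto3.client("s3")` as the client. Skipping keys under `output/` matters when input and output share one bucket, because otherwise each stored result could trigger another invocation.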

After completing this Cloud Lab, you’ll have a working pipeline that extracts and processes text from documents using AWS services, along with practical experience in applying these services to automate document processing.
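
The final stage of the pipeline, the email notification through Amazon SES, might look like the following sketch. The function names, subject line, and message wording are hypothetical, and the sender address is assumed to be an identity already verified in SES.

```python
def build_notification(sender, recipient, document_key, sentiment):
    """Build the keyword arguments for an SES send_email call."""
    body = (
        f"The document '{document_key}' has been processed.\n"
        f"Overall sentiment: {sentiment}\n"
    )
    return {
        "Source": sender,  # must be a verified SES identity
        "Destination": {"ToAddresses": [recipient]},
        "Message": {
            "Subject": {
                "Data": f"Document processed: {document_key}",
                "Charset": "UTF-8",
            },
            "Body": {"Text": {"Data": body, "Charset": "UTF-8"}},
        },
    }


def send_notification(ses_client, sender, recipient, document_key, sentiment):
    """Send the notification and return the SES message ID."""
    response = ses_client.send_email(
        **build_notification(sender, recipient, document_key, sentiment)
    )
    return response["MessageId"]


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   ses = boto3.client("ses")
#   ses.verify_email_identity(EmailAddress="sender@example.com")  # one-time setup
#   send_notification(ses, "sender@example.com", "user@example.com",
#                     "input/scan.png", "POSITIVE")
```

While an SES account is in the sandbox, recipient addresses also need to be verified before mail can be delivered to them.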

Architecture diagram

Cloud Lab Tasks

1. Introduction
   Getting Started
2. Create the Required Resources
   Create an S3 Bucket
   Create an Execution Role
   Create a Lambda Function
   Configure the Lambda Function
3. Text Extraction and Analysis
   Test the Document Processing Pipeline
   Integrate Amazon Simple Email Service (SES)
4. Conclusion
   Clean Up
   Wrap Up

Labs Rules Apply
Stay within resource usage requirements.
Do not engage in cryptocurrency mining.
Do not engage in or encourage activity that is illegal.
