AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources. It provides ETL (extract, transform, load) capabilities. ETL is a data engineering process that extracts data from various sources, transforms it into a desired format, and loads it into a target data store for analysis, reporting, and business intelligence. AWS Glue simplifies this process, making it easier for businesses to prepare and transform their data for analytics.
In this Cloud Lab, you’ll create a DynamoDB table to serve as the source data. You’ll set up a database in AWS Glue that uses the DynamoDB table as its source. After that, you’ll use an AWS Glue crawler to fetch metadata from the DynamoDB table and store it as Data Catalog tables in the Glue database. You’ll then set up an ETL pipeline in AWS Glue that extracts data through the Glue database, performs transformations on the data, and loads the resulting data into an S3 bucket.
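For readers who prefer to see the crawler step as code, here is a minimal sketch of how it could be scripted with boto3, assuming you wanted to drive it programmatically. The function names, crawler name, database name, table name, and role ARN are all hypothetical placeholders. The IAM role is assumed to already exist with permission to read the table.

```python
def crawler_request(crawler_name, role_arn, database_name, table_name):
    """Build the keyword arguments for boto3's glue.create_crawler call.

    All four values are placeholders supplied by the caller; none of them
    come from the lab itself.
    """
    return {
        "Name": crawler_name,
        "Role": role_arn,
        "DatabaseName": database_name,
        # For a DynamoDB target, "Path" is the name of the table to crawl.
        "Targets": {"DynamoDBTargets": [{"Path": table_name}]},
    }


def create_and_start_crawler(request):
    """Create the Glue database and crawler, then start the crawler.

    Requires boto3 and valid AWS credentials. boto3 is imported here so
    that crawler_request can be used without AWS access.
    """
    import boto3

    glue = boto3.client("glue")
    glue.create_database(DatabaseInput={"Name": request["DatabaseName"]})
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Hypothetical usage (names and ARN are illustrative only):
# request = crawler_request(
#     "demo-crawler",
#     "arn:aws:iam::123456789012:role/demo-glue-role",
#     "demo_glue_db",
#     "DemoTable",
# )
# create_and_start_crawler(request)
```

Once the crawler finishes, the table's inferred schema should appear as a Data Catalog table in the Glue database, where the ETL job can read it.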
Once you complete this Cloud Lab, the provisioned infrastructure will look similar to the one shown below: