Amazon Athena

Understand the basics of Amazon Athena and how to integrate it with AWS Glue to query data in Amazon S3.

We'll cover the following...

Using Amazon Athena with AWS Glue
Other ways of using Amazon Athena

Amazon Athena is a serverless, interactive SQL query service for data stored in Amazon S3. Launched in 2016, Athena is based on Presto, an open-source distributed SQL query engine.

Athena queries data in place, so nothing has to be loaded out of S3 into a separate store. It does need a schema to be defined first so that it knows how to interpret the files (similar to Redshift Spectrum).
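To make the schema setup concrete, here is a minimal sketch that builds the kind of `CREATE EXTERNAL TABLE` statement Athena accepts for CSV files. The helper `csv_table_ddl`, the database name `demo_db`, the bucket `example-bucket`, and the two columns are all hypothetical, chosen only for illustration; they are not taken from the lesson's data.

```python
def csv_table_ddl(database, table, columns, s3_location, skip_header=True):
    """Build a CREATE EXTERNAL TABLE statement for comma-delimited files.

    columns is a list of (name, type) pairs; s3_location should point at
    the S3 folder (prefix) holding the files, ending in a slash.
    """
    cols = ",\n".join(f"  `{name}` {dtype}" for name, dtype in columns)
    ddl = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{cols}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )
    if skip_header:
        # Tell Athena to ignore the first line of each file (the CSV header).
        ddl += "\nTBLPROPERTIES ('skip.header.line.count'='1')"
    return ddl


if __name__ == "__main__":
    # Hypothetical names and columns, for illustration only.
    print(
        csv_table_ddl(
            "demo_db",
            "dwarf_activities",
            [("name", "string"), ("activity", "string")],
            "s3://example-bucket/dwarf/",
        )
    )
```

The statement only registers metadata: the table points at the S3 location, and the files themselves stay where they are.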

Using Amazon Athena with AWS Glue

One of the quickest ways to get started with Amazon Athena is through its integration with AWS Glue. Specifically, Athena can query databases and tables whose schemas (metadata definitions) are stored in the AWS Glue Data Catalog.
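As a rough sketch of what this integration looks like from code, the example below submits a query against a Glue Data Catalog table using the boto3 Athena client. It assumes boto3 is installed and AWS credentials are configured; the helper names `build_query_request` and `run_query`, the database `demo_db`, the table `dwarf_activities`, and the results bucket are hypothetical placeholders.

```python
import time


def build_query_request(sql, database, output_location):
    """Assemble the keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        # "AwsDataCatalog" is the catalog name Athena uses for the Glue Data Catalog.
        "QueryExecutionContext": {"Catalog": "AwsDataCatalog", "Database": database},
        # Athena writes query results as files under this S3 prefix.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(sql, database, output_location, poll_seconds=1.0):
    """Run a query and return the result rows (requires AWS credentials)."""
    import boto3  # imported here so build_query_request works without boto3

    athena = boto3.client("athena")
    request = build_query_request(sql, database, output_location)
    query_id = athena.start_query_execution(**request)["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {query_id} ended in state {state}")
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
```

A call might look like `run_query("SELECT * FROM dwarf_activities LIMIT 10", "demo_db", "s3://example-bucket/athena-results/")`, with the names replaced by your own database, table, and results location.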

Note: If you already have tables in the AWS Glue Data Catalog, you can jump ahead to the section Opening Athena from AWS Glue.

Setting up AWS Glue in our AWS account

Our example file is dwarf_activities.csv, which we’ll upload to an S3 bucket in our AWS account.
