Amazon Athena is a SQL query service for data stored in Amazon S3. Launched in 2016, Athena is based on Presto, an open-source SQL query engine.

Athena queries data in place, so nothing has to be loaded out of S3 into a separate data store. You do still need to define a schema before the data can be queried (similar to Redshift Spectrum).
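To make the schema setup concrete, here is a minimal sketch of the kind of statement Athena expects for a comma-delimited file in S3. It only builds the DDL text; it does not contact AWS. The database name (demo_db), the column names, and the bucket path are hypothetical placeholders, since the real values depend on your own data.

```python
def build_create_table_ddl(database, table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for comma-delimited files in S3.

    columns is a list of (name, type) pairs; s3_location is the S3 folder
    (not a single file) that holds the data.
    """
    column_defs = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {column_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'\n"
        # Skip the first line of each file, assuming it is a header row.
        "TBLPROPERTIES ('skip.header.line.count' = '1')"
    )


# Hypothetical names throughout: adjust to match your own file and bucket.
ddl = build_create_table_ddl(
    database="demo_db",
    table="dwarf_activities",
    columns=[("name", "string"), ("activity", "string")],
    s3_location="s3://example-bucket/dwarf-activities/",
)
print(ddl)
```

The table is external, so dropping it later removes only the schema definition and leaves the files in S3 untouched.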

Using Amazon Athena with AWS Glue

One of the fastest ways to get started with Amazon Athena is through its integration with AWS Glue. Specifically, Athena can query databases and tables whose schemas (metadata definitions) are stored in the AWS Glue Data Catalog.
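For readers who prefer scripting to the console, the sketch below shows one way a query against a Data Catalog database might be submitted through the Athena API. It assumes a boto3 Athena client is passed in; the function name run_athena_query, the database name, and the results location are hypothetical and not part of this tutorial's setup.

```python
import time


def run_athena_query(athena_client, sql, database, output_location, poll_seconds=1.0):
    """Start a query against a Data Catalog database and wait for it to finish.

    Returns a (state, query_execution_id) tuple, where state is one of
    SUCCEEDED, FAILED, or CANCELLED.
    """
    started = athena_client.start_query_execution(
        QueryString=sql,
        # The database name is resolved against the Data Catalog.
        QueryExecutionContext={"Database": database},
        # Athena writes result files to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state, query_id
        time.sleep(poll_seconds)
```

With boto3 installed and AWS credentials configured, a call might look like run_athena_query(boto3.client("athena"), "SELECT * FROM dwarf_activities LIMIT 10", "demo_db", "s3://example-bucket/athena-results/"), where every name is a placeholder. Taking the client as a parameter keeps the function easy to test with a stub.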

Note: If you already have tables in the AWS Glue Data Catalog, you can jump ahead to the section Opening Athena from AWS Glue.

Setting up AWS Glue in our AWS account

Below is our dwarf_activities.csv example file that we’ll upload to our S3 bucket.
