Use AWS Glue to connect Athena to your data in Amazon S3
Athena uses the AWS Glue Data Catalog to store metadata, such as table and column names, for your data stored in Amazon S3. This metadata becomes the databases, tables, and views that you see in the Athena query editor.
When using Athena with the AWS Glue Data Catalog, you can use AWS Glue to create the databases and tables (schema) to be queried in Athena, or you can use Athena to create the schema and then use it in AWS Glue and related services.
To define schema information for AWS Glue, you can use a form in the Athena console, use the query editor in Athena, or create an AWS Glue crawler in the AWS Glue console. AWS Glue crawlers automatically infer database and table schema from your data in Amazon S3. Using a form offers more customization. Writing your own CREATE TABLE statements requires more effort, but offers the most control. For more information, see CREATE TABLE.
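As a rough illustration of the hand-written CREATE TABLE route, the following Python sketch builds a DDL statement for delimited text data and, optionally, submits it to Athena through boto3. The helper names (`build_create_table_ddl`, `run_ddl`), the database, table, and column names, and the S3 paths are all hypothetical placeholders, not part of this guide. The sketch assumes a simple comma-delimited layout; other formats need a different ROW FORMAT or SerDe clause.

```python
def build_create_table_ddl(database, table, columns, s3_location, delimiter=","):
    """Return a CREATE EXTERNAL TABLE statement for delimited text data.

    columns is a list of (name, athena_type) tuples. All names are
    illustrative; substitute your own schema.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_lines = ",\n".join(f"  `{name}` {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        f"FIELDS TERMINATED BY '{delimiter}'\n"
        f"LOCATION '{s3_location}'"
    )


def run_ddl(ddl, database, output_location):
    """Submit the DDL to Athena and return the query execution ID.

    Requires boto3 and AWS credentials; imported lazily so that the
    builder above can be used without either.
    """
    import boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=ddl,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical database, table, and bucket names.
    ddl = build_create_table_ddl(
        "sales_db",
        "orders",
        [("order_id", "string"), ("amount", "double")],
        "s3://amzn-s3-demo-bucket/orders/",
    )
    print(ddl)
    # To run it against Athena (assumes the database already exists):
    # run_ddl(ddl, "sales_db", "s3://amzn-s3-demo-bucket/athena-results/")
```

Because the table is registered in the AWS Glue Data Catalog, a table created this way should also be visible to AWS Glue and related services, as described above.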
Additional Resources
- For more information about the AWS Glue Data Catalog, see Data Catalog and crawlers in AWS Glue in the AWS Glue Developer Guide.
- For an illustrative article showing how to use AWS Glue and Athena to process XML data, see Process and analyze highly nested and large XML files using AWS Glue and Amazon Athena in the AWS Big Data Blog.
- Separate charges apply to AWS Glue. For more information, see AWS Glue pricing.
Topics
- Use a form in the Athena console to add an AWS Glue table
- Use a crawler to add a table
- Use the AWS CLI to recreate an AWS Glue database and its tables
- Optimize queries with AWS Glue partition indexing and filtering
- Register a Data Catalog from another account
- Work with CSV data in AWS Glue
- Work with geospatial data in AWS Glue
- Create tables for ETL jobs