Use AWS Glue Data Catalog to connect to your data
Athena uses the AWS Glue Data Catalog to store metadata such as table and column names for your data stored in Amazon S3. This metadata becomes the databases, tables, and views that you see in the Athena query editor.
When using Athena with the AWS Glue Data Catalog, you can use AWS Glue to create databases and tables (schemas) to be queried in Athena, or you can use Athena to create schemas and then use them in AWS Glue and related services.
To define schema information for AWS Glue, you can use a form in the Athena console, use the query editor in Athena, or create an AWS Glue crawler in the AWS Glue console. AWS Glue crawlers automatically infer database and table schema from your data in Amazon S3. Using a form offers more customization. Writing your own CREATE TABLE statements requires more effort, but offers the most control. For more information, see CREATE TABLE.
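As a rough illustration of the hand-written CREATE TABLE option, the following Python sketch assembles a statement that you could paste into the Athena query editor. The helper name `build_create_table`, the database, table, and column names, and the S3 location are all hypothetical placeholders, and the comma-delimited row format is an assumption about the data rather than a requirement.

```python
def build_create_table(database, table, columns, s3_location):
    """Return an Athena CREATE EXTERNAL TABLE statement as a string.

    Hypothetical helper for illustration only; it assumes comma-delimited
    text files and does no validation or escaping of its inputs.
    """
    # Each column is a (name, type) pair; backticks quote the identifiers.
    column_lines = ",\n".join(
        f"  `{name}` {col_type}" for name, col_type in columns
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )


# Placeholder names; replace them with your own database, table, and bucket.
ddl = build_create_table(
    "example_db",
    "example_logs",
    [("request_id", "string"), ("status", "int")],
    "s3://amzn-s3-demo-bucket/logs/",
)
print(ddl)
```

Running the generated statement in Athena registers the table in the AWS Glue Data Catalog. A crawler or the console form reaches the same result with less typing but less control over column types and storage properties.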
Additional Resources
- For more information about the AWS Glue Data Catalog, see Data Catalog and crawlers in AWS Glue in the AWS Glue Developer Guide.
- For an article that shows how to use AWS Glue and Athena to process XML data, see Process and analyze highly nested and large XML files using AWS Glue and Amazon Athena in the AWS Big Data Blog.
- Separate charges apply to AWS Glue. For more information, see AWS Glue pricing.
Topics
- Register and use data catalogs in Athena
- Register a Data Catalog from another account
- Control access to data catalogs with IAM policies
- Use a form in the Athena console to add an AWS Glue table
- Use a crawler to add a table
- Optimize queries with AWS Glue partition indexing and filtering
- Use the AWS CLI to recreate an AWS Glue database and its tables
- Create tables for ETL jobs
- Work with CSV data in AWS Glue
- Work with geospatial data in AWS Glue