Amazon Athena
User Guide

Document History

This documentation is associated with the May 18, 2017 version of Amazon Athena.

Latest documentation update: March 5, 2019.

Change | Description | Release Date
Released the new version of the ODBC driver with support for Athena workgroups.

To download the ODBC driver version 1.0.5 and its documentation, see Connecting to Amazon Athena with ODBC. For information about this version, see the ODBC Driver Release Notes.

For more information, search for "workgroup" in the ODBC Driver Installation and Configuration Guide version 1.0.5. The ODBC driver connection string does not change when you use tags on workgroups. To use tags, upgrade to this version of the ODBC driver (version 1.0.5), which is the latest.

This driver version lets you use Athena API workgroup actions to create and manage workgroups, and Athena API tag actions to add, list, or remove tags on workgroups. Before you begin, make sure that you have resource-level permissions in IAM for actions on workgroups and tags.
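As a rough illustration of the resource-level permissions mentioned above, the following sketch builds an IAM policy document that allows a few workgroup and tag actions on a single workgroup. The Region, account ID, and workgroup name are placeholders, and the chosen action list is an assumption for the example, not a recommended policy.

```python
import json


def workgroup_tag_policy(region: str, account_id: str, workgroup: str) -> dict:
    """Build an IAM policy document scoped to one Athena workgroup."""
    # Workgroups are IAM resources, so the policy can name one by its ARN.
    arn = f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "athena:GetWorkGroup",
                    "athena:TagResource",
                    "athena:UntagResource",
                    "athena:ListTagsForResource",
                ],
                "Resource": arn,
            }
        ],
    }


# Placeholder values; substitute your own Region, account ID, and workgroup.
policy = workgroup_tag_policy("us-east-1", "123456789012", "example-workgroup")
print(json.dumps(policy, indent=2))
```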

March 5, 2019
Added tag support for workgroups in Amazon Athena.

A tag consists of a key and a value, both of which you define. When you tag a workgroup, you assign custom metadata to it. You can add tags to workgroups to help categorize them, using AWS tagging best practices. You can use tags to restrict access to workgroups, and to track costs. For example, create a workgroup for each cost center. Then, by adding tags to these workgroups, you can track your Athena spending for each cost center. For more information, see Using Tags for Billing in the AWS Billing and Cost Management User Guide.
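To make the key-value structure concrete, here is a minimal sketch that shapes tags into the list-of-pairs form used by the Athena TagResource action. The ARN, the tag names, and the helper name `build_tag_request` are hypothetical; the commented-out call assumes boto3 is installed and credentials are configured.

```python
def build_tag_request(workgroup_arn: str, tags: dict) -> dict:
    """Return keyword arguments in the shape the TagResource action expects."""
    return {
        "ResourceARN": workgroup_arn,
        # Each tag is one {"Key": ..., "Value": ...} pair; sorted for stable output.
        "Tags": [{"Key": k, "Value": v} for k, v in sorted(tags.items())],
    }


request = build_tag_request(
    "arn:aws:athena:us-east-1:123456789012:workgroup/finance",  # placeholder ARN
    {"CostCenter": "1234", "Team": "finance"},  # hypothetical tags
)
# With boto3 available, the request could be sent as:
#   boto3.client("athena").tag_resource(**request)
print(request)
```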

February 22, 2019
Improved the JSON OpenX SerDe used in Athena.

The improvements include, but are not limited to, the following:

  • Support for the ConvertDotsInJsonKeysToUnderscores property. When set to TRUE, it allows the SerDe to replace the dots in key names with underscores. For example, if the JSON dataset contains a key with the name "a.b", you can use this property to define the column name to be "a_b" in Athena. The default is FALSE. By default, Athena does not allow dots in column names.

  • Support for the case.insensitive property. By default, Athena requires that all keys in your JSON dataset use lowercase. Using WITH SERDEPROPERTIES ("case.insensitive" = "FALSE") allows you to use case-sensitive key names in your data. The default is TRUE. When set to TRUE, the SerDe converts all uppercase columns to lowercase.

For more information, see OpenX JSON SerDe.
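The effect of the two properties on column names can be modeled with a short sketch. This is not the SerDe's implementation; it only mimics the naming rules described above for top-level keys, and the function name and parameters are made up for the example.

```python
def normalize_keys(record: dict, convert_dots: bool = False,
                   case_insensitive: bool = True) -> dict:
    """Model how JSON key names map to column names under the two properties."""
    result = {}
    for key, value in record.items():
        name = key
        if convert_dots:
            # Mirrors ConvertDotsInJsonKeysToUnderscores = TRUE
            name = name.replace(".", "_")
        if case_insensitive:
            # Mirrors case.insensitive = TRUE (the default)
            name = name.lower()
        result[name] = value
    return result


row = {"a.b": 1, "UserName": "alice"}
print(normalize_keys(row, convert_dots=True))
# {'a_b': 1, 'username': 'alice'}
print(normalize_keys(row, convert_dots=True, case_insensitive=False))
# {'a_b': 1, 'UserName': 'alice'}
```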

February 18, 2019
Added support for workgroups.

Use workgroups to separate users, teams, applications, or workloads, and to set limits on the amount of data that each query or the entire workgroup can process. Because workgroups act as IAM resources, you can use resource-level permissions to control access to a specific workgroup. You can also view query-related metrics in Amazon CloudWatch, control query costs by configuring limits on the amount of data scanned, create thresholds, and trigger actions, such as Amazon SNS alarms, when these thresholds are breached. For more information, see Using Workgroups for Running Queries and Controlling Costs and Monitoring Queries with CloudWatch Metrics.
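As one possible way to picture a per-query data limit, the sketch below assembles the arguments for the Athena CreateWorkGroup action with a scan cutoff and CloudWatch metrics enabled. The workgroup name, S3 location, limit value, and the helper name `build_workgroup_request` are assumptions for the example; the commented-out call assumes boto3 and valid credentials.

```python
def build_workgroup_request(name: str, output_location: str,
                            max_bytes_per_query: int) -> dict:
    """Return keyword arguments for creating a workgroup with a per-query limit."""
    return {
        "Name": name,
        "Configuration": {
            "ResultConfiguration": {"OutputLocation": output_location},
            # Apply the workgroup settings even if a client specifies its own.
            "EnforceWorkGroupConfiguration": True,
            # Publish query metrics so they can be viewed in CloudWatch.
            "PublishCloudWatchMetricsEnabled": True,
            # Queries that scan more than this many bytes are cancelled.
            "BytesScannedCutoffPerQuery": max_bytes_per_query,
        },
    }


request = build_workgroup_request(
    "example-team",                   # hypothetical workgroup name
    "s3://example-bucket/results/",   # placeholder output location
    10**12,                           # roughly 1 TB per query
)
# With boto3 available, the request could be sent as:
#   boto3.client("athena").create_work_group(**request)
print(request["Configuration"]["BytesScannedCutoffPerQuery"])
```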

February 18, 2019
Added support for analyzing logs from Network Load Balancer.

Added example Athena queries for analyzing logs from Network Load Balancer. These logs contain detailed information about the Transport Layer Security (TLS) requests sent to the Network Load Balancer. You can use these access logs to analyze traffic patterns and troubleshoot issues. For information, see Querying Network Load Balancer Logs.

January 24, 2019

Released new versions of the JDBC and ODBC drivers with support for federated access to the Athena API with AD FS and SAML 2.0 (Security Assertion Markup Language 2.0).

With this release of the drivers, federated access to Athena is supported for Active Directory Federation Services (AD FS 3.0). Access is established through the versions of the JDBC or ODBC drivers that support SAML 2.0. For information about configuring federated access to the Athena API, see Enabling Federated Access to the Athena API.

November 10, 2018

Added support for fine-grained access control to databases and tables in Athena. Additionally, added policies in Athena that allow you to encrypt database and table metadata in the Data Catalog.

Added support for creating identity-based (IAM) policies that provide fine-grained access control to resources in the AWS Glue Data Catalog, such as databases and tables used in Athena.

Additionally, you can encrypt database and table metadata in the Data Catalog by adding specific policies to Athena.

For details, see Access Control Policies.

October 15, 2018
Added support for CREATE TABLE AS SELECT statements.

Made other improvements in the documentation.

Added support for CREATE TABLE AS SELECT statements. See Creating a Table from Query Results, Considerations and Limitations, and Examples.
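For orientation, a CREATE TABLE AS SELECT statement can be assembled as in the sketch below. The table names, the S3 path, and the helper name `build_ctas` are hypothetical, and the function does no validation or escaping, so treat it as an illustration of the statement's shape only.

```python
from typing import Optional


def build_ctas(new_table: str, select_sql: str,
               data_format: str = "PARQUET",
               external_location: Optional[str] = None) -> str:
    """Assemble a CREATE TABLE AS SELECT statement as a string."""
    properties = [f"format = '{data_format}'"]
    if external_location is not None:
        properties.append(f"external_location = '{external_location}'")
    return (
        f"CREATE TABLE {new_table}\n"
        f"WITH ({', '.join(properties)})\n"
        f"AS {select_sql}"
    )


print(build_ctas(
    "recent_orders",                                   # hypothetical new table
    "SELECT * FROM orders WHERE year = 2018",          # hypothetical source query
    external_location="s3://example-bucket/recent/",   # placeholder path
))
```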

October 10, 2018

Released the ODBC driver version 1.0.3 with support for streaming results instead of fetching them in pages.

Made other improvements in the documentation.

The ODBC driver version 1.0.3 supports streaming results and also includes improvements, bug fixes, and updated documentation for "Using SSL with a Proxy Server". For details, see the Release Notes for the driver.

To download the ODBC driver version 1.0.3 and its documentation, see Connecting to Amazon Athena with ODBC.

September 6, 2018

Released the JDBC driver version 2.0.5 with default support for streaming results instead of fetching them in pages.

Made other improvements in the documentation.

Released the JDBC driver 2.0.5 with default support for streaming results instead of fetching them in pages. For information, see Using Athena with the JDBC Driver.

For information about streaming results, search for UseResultsetStreaming in the JDBC Driver Installation and Configuration Guide.

August 16, 2018

Updated the documentation for querying Amazon Virtual Private Cloud flow logs, which can be stored directly in Amazon S3 in GZIP format.

Updated examples for querying ALB logs.

Updated the documentation for querying Amazon Virtual Private Cloud flow logs, which can be stored directly in Amazon S3 in GZIP format. For information, see Querying Amazon VPC Flow Logs.

Updated examples for querying ALB logs. For information, see Querying Application Load Balancer Logs.

August 7, 2018
Added support for views. Added guidelines for schema manipulations for various data storage formats.

Added support for views. For information, see Views.

Updated this guide with guidance on handling schema updates for various data storage formats. For information, see Handling Schema Updates.

June 5, 2018
Increased default query concurrency limits from five to twenty.

You can submit and run up to twenty DDL queries and twenty SELECT queries at a time. For information, see Service Limits.
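A client that submits many queries may want to stay under this limit on its own side. The sketch below shows one way to cap the number of simultaneously running tasks with a thread pool; the task function is a stand-in for real query submission, and the names `run_bounded` and `fake_query` are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 20  # default limit per query type stated above


def run_bounded(tasks, limit=MAX_CONCURRENT):
    """Run callables with at most `limit` executing at once; keep input order."""
    with ThreadPoolExecutor(max_workers=limit) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def fake_query(i):
    # Stand-in for submitting a query and waiting for its result.
    return i * 2


results = run_bounded([lambda i=i: fake_query(i) for i in range(50)])
print(len(results))  # 50
```

The pool size, not the number of submitted tasks, bounds how many run at the same time, so the remaining tasks wait in the executor's queue until a worker is free.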

May 17, 2018
Added query tabs and the ability to configure auto-complete in the Query Editor.

Added query tabs and the ability to configure auto-complete in the Query Editor. For information, see Using the Console.

May 8, 2018
Released the JDBC driver version 2.0.2.

Released the new version of the JDBC driver (version 2.0.2). For information, see Using Athena with the JDBC Driver.

April 19, 2018

Added auto-complete for typing queries in the Athena console.

April 6, 2018

Added the ability to create Athena tables for CloudTrail log files directly from the CloudTrail console.

Added the ability to automatically create Athena tables for CloudTrail log files directly from the CloudTrail console. For information, see Creating a Table for CloudTrail Logs in the CloudTrail Console.

March 15, 2018
Added support for securely offloading intermediate data to disk for queries with GROUP BY.

Added an ability to securely offload intermediate data to disk for memory-intensive queries that use the GROUP BY clause. This improves the reliability of such queries, preventing "Query resource exhausted" errors. For more information, see the release note for February 2, 2018.

February 2, 2018
Added support for Presto version 0.172.

Upgraded the underlying engine in Amazon Athena to a version based on Presto version 0.172. For more information, see the release note for January 19, 2018.

January 19, 2018
Added support for the ODBC Driver.

Added support for connecting Athena to the ODBC Driver. For information, see Connecting to Amazon Athena with ODBC.

November 13, 2017
Added support for Asia Pacific (Seoul), Asia Pacific (Mumbai), and EU (London) regions. Added support for querying geospatial data.

Added support for querying geospatial data, and for Asia Pacific (Seoul), Asia Pacific (Mumbai), EU (London) regions. For information, see Querying Geospatial Data and AWS Regions and Endpoints.

November 1, 2017
Added support for EU (Frankfurt).

Added support for EU (Frankfurt). For a list of supported regions, see AWS Regions and Endpoints.

October 19, 2017
Added support for named Athena queries with AWS CloudFormation.

Added support for creating named Athena queries with AWS CloudFormation. For more information, see AWS::Athena::NamedQuery in the AWS CloudFormation User Guide.

October 3, 2017
Added support for Asia Pacific (Sydney).

Added support for Asia Pacific (Sydney). For a list of supported regions, see AWS Regions and Endpoints.

September 25, 2017
Added a section to this guide for querying AWS Service logs and different types of data, including maps, arrays, nested data, and data containing JSON.

Added examples for Querying AWS Service Logs and for querying different types of data in Athena. For information, see Querying Data in Amazon Athena Tables.

September 5, 2017
Added support for AWS Glue Data Catalog.

Added integration with the AWS Glue Data Catalog and a migration wizard for updating from the Athena managed data catalog to the AWS Glue Data Catalog. For more information, see Integration with AWS Glue and AWS Glue.

August 14, 2017
Added support for Grok SerDe.

Added support for Grok SerDe, which provides easier pattern matching for records in unstructured text files such as logs. For more information, see Grok SerDe.

Added keyboard shortcuts to scroll through query history using the console (CTRL + ⇧/⇩ using Windows, CMD + ⇧/⇩ using Mac).

August 4, 2017
Added support for Asia Pacific (Tokyo).

Added support for Asia Pacific (Tokyo) and Asia Pacific (Singapore). For a list of supported regions, see AWS Regions and Endpoints.

June 22, 2017
Added support for EU (Ireland).

Added support for EU (Ireland). For more information, see AWS Regions and Endpoints.

June 8, 2017
Added an Amazon Athena API and AWS CLI support.

Added an Amazon Athena API and AWS CLI support for Athena. Updated JDBC driver to version 1.1.0.

May 19, 2017
Added support for Amazon S3 data encryption.

Added support for Amazon S3 data encryption and released a JDBC driver update (version 1.0.1) with encryption support, improvements, and bug fixes. For more information, see Encryption at Rest.

April 4, 2017
Added the AWS CloudTrail SerDe.

Added the AWS CloudTrail SerDe, improved performance, fixed partition issues. For more information, see CloudTrail SerDe.

  • Improved performance when scanning a large number of partitions.

  • Improved performance on MSCK Repair Table operation.

  • Added ability to query Amazon S3 data stored in regions other than your primary region. Standard inter-region data transfer rates for Amazon S3 apply in addition to standard Athena charges.

March 24, 2017
Added support for US East (Ohio).

Added support for Avro SerDe and OpenCSVSerDe for Processing CSV, US East (Ohio), and bulk editing columns in the console wizard. Improved performance on large Parquet tables.

February 20, 2017
The initial release of the Amazon Athena User Guide.

November 2016