Considerations and limitations for SQL queries in Amazon Athena - Amazon Athena

Considerations and limitations for SQL queries in Amazon Athena

When running queries in Athena, keep in mind the following considerations and limitations:

  • Stored procedures – Stored procedures are not supported.

  • Parameterized queries – Parameterized queries are not supported in Athena engine version 1 but are supported in Athena engine version 2. For more information, see Querying with prepared statements.

  • Maximum number of partitions – The maximum number of partitions you can create with CREATE TABLE AS SELECT (CTAS) statements is 100. For information, see CREATE TABLE AS. For a workaround, see Using CTAS and INSERT INTO to create a table with more than 100 partitions.

  • Unsupported statements – The following statements are not supported:

    • CREATE TABLE LIKE is not supported.

    • DESCRIBE INPUT and DESCRIBE OUTPUT is not supported.

    • EXECUTE … USING is not supported.

    • MERGE statements are not supported.

    • UPDATE statements are not supported.

  • Presto federated connectorsPresto federated connectors are not supported. Use Amazon Athena Federated Query to connect data sources. For more information, see Using Amazon Athena Federated Query.

  • Querying parquet columns with complex data types – When you query columns with complex data types (array, map, struct), and are using Parquet for storing data, Athena engine version 1 reads an entire row of data instead of selectively reading only the specified columns. This issue does not occur in Athena engine version 2.

  • Timeouts on tables with many partitions – Athena may time out when querying a table that has many thousands of partitions. This can happen when the table has many partitions that are not of type string. When you use type string, Athena prunes partitions at the metastore level. However, when you use other data types, Athena prunes partitions on the server side. The more partitions you have, the longer this process takes and the more likely your queries are to time out. To resolve this issue, set your partition type to string so that Athena prunes partitions at the metastore level. This reduces overhead and prevents queries from timing out.

  • Amazon S3 Glacier storage – Athena does not support querying the data in the S3 Glacier flexible retrieval or S3 Glacier Deep Archive storage classes, or in the Archive access or deep archive access tiers of the S3 Intelligent Tiering storage class. Objects in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes are ignored.

    Data that is moved or transitioned to one of these classes are no longer readable or queryable by Athena even after storage class objects are restored. To make the restored objects that you want to query readable by Athena, copy the restored objects back into Amazon S3 to change their storage class. Alternatively, you can use the Amazon S3 Glacier Instant Retrieval storage class, which is queryable by Athena. For more information, see Amazon S3 Glacier instant retrieval storage class.

  • Files treated as hidden – Athena treats source files that start with an underscore (_) or a dot (.) as hidden. To work around this limitation, rename the files.

  • Row or column size limitation – The size of a single row or its columns cannot exceed 32 megabytes. This limit can be exceeded when, for example, a row in a CSV or JSON file contains a single column of 100 megabytes. Exceeding this limit can also produce the error message Line too long in text file. To work around this limitation, make sure that the sum of the data of the columns in any row is less than 32MB.