Using Iceberg tables - Amazon Athena

Using Iceberg tables

Athena supports read, time travel, write, and DDL queries for Apache Iceberg tables that use the Apache Parquet format for data and the AWS Glue catalog for their metastore.

Apache Iceberg is an open table format for very large analytic datasets. Iceberg manages large collections of files as tables, and it supports modern analytical data lake operations such as record-level insert, update, delete, and time travel queries. The Iceberg specification allows seamless table evolution such as schema and partition evolution, and its design is optimized for usage on Amazon S3. Iceberg also helps guarantee data correctness under concurrent write scenarios.

For more information about Apache Iceberg, see https://iceberg.apache.org/.
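As a rough illustration of the DDL support described above, the following Python sketch composes an Athena `CREATE TABLE` statement for an Iceberg table stored as Parquet. The helper name `iceberg_create_table_sql`, the database and table names, the columns, and the S3 location are all hypothetical placeholders. The sketch only builds the statement text and does not submit it to Athena.

```python
def iceberg_create_table_sql(table, columns, location):
    """Compose an Athena CREATE TABLE statement for an Iceberg table.

    `columns` is a list of (name, type) pairs. All names passed in here
    are illustrative placeholders, not values taken from the documentation.
    """
    cols = ",\n  ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE TABLE {table} (\n  {cols}\n)\n"
        f"LOCATION '{location}'\n"
        # 'table_type' marks the table as Iceberg; Parquet is the supported data format.
        "TBLPROPERTIES ('table_type' = 'ICEBERG', 'format' = 'parquet')"
    )


ddl = iceberg_create_table_sql(
    "example_db.events",  # hypothetical database and table
    [("id", "bigint"), ("payload", "string"), ("created_at", "timestamp")],
    "s3://amzn-s3-demo-bucket/events/",  # hypothetical S3 location
)
print(ddl)
```

The resulting statement could then be run from the Athena console or submitted through an SDK. Because the table is registered in the AWS Glue catalog, no separate metastore configuration appears in the statement.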

Considerations and limitations

Athena support for Iceberg tables has the following limitations:

  • Tables with AWS Glue catalog only – Athena supports only Iceberg tables that are created against the AWS Glue catalog according to the specifications defined by the open source Glue catalog implementation.

  • Table locking support by AWS Glue only – Unlike the open source Glue catalog implementation, which supports plug-in custom locking, Athena supports AWS Glue optimistic locking only. Using Athena to modify an Iceberg table that relies on any other lock implementation can cause data loss and break transactions.

  • Parquet files only – Currently, Athena supports Iceberg tables in Parquet file format only. ORC and Avro are not supported.

  • Iceberg v2 tables – Athena only creates and operates on Iceberg v2 tables. For the difference between v1 and v2 tables, see Format version changes in the Apache Iceberg documentation.

  • Display of time types without time zone – The time and timestamp without time zone types are displayed in UTC. If the time zone is unspecified in a filter expression on a time column, UTC is used.

  • Timestamp-related data precision – Iceberg supports microsecond precision for the timestamp data type, but Athena supports only millisecond precision for timestamps in both reads and writes. For data that is rewritten during manual compaction operations, Athena retains only millisecond precision in time-related columns.

  • Lake Formation – Integration with AWS Lake Formation is not supported.

  • Unsupported operations – The following Athena operations are not supported for Iceberg tables.

If you would like Athena to support a particular feature, send feedback to athena-feedback@amazon.com.
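To make the two timestamp limitations listed above more concrete, here is a minimal Python sketch of what they imply for a value with microsecond precision and no time zone. It is not Athena code. The helper names `to_millis_precision` and `as_utc` are hypothetical, and the sketch assumes that sub-millisecond digits are truncated rather than rounded, which the text above does not specify.

```python
from datetime import datetime, timezone


def to_millis_precision(ts: datetime) -> datetime:
    """Drop sub-millisecond digits (truncation is an assumption, not documented behavior)."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)


def as_utc(ts: datetime) -> datetime:
    """Treat a timestamp without a time zone as UTC; convert zone-aware values to UTC."""
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


# A microsecond-precision value, as Iceberg itself could store it.
original = datetime(2023, 1, 15, 12, 30, 45, 123456)

# Millisecond precision: the trailing 456 microseconds are lost.
stored = to_millis_precision(original)
print(stored.isoformat())          # 2023-01-15T12:30:45.123000

# No time zone given, so the value is interpreted as UTC.
print(as_utc(stored).isoformat())  # 2023-01-15T12:30:45.123000+00:00
```

A practical consequence is that two source timestamps differing only below the millisecond become indistinguishable once written or compacted through Athena.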