Using Apache Iceberg on AWS - AWS Prescriptive Guidance

Amazon Web Services

April 2024

Apache Iceberg is an open-source table format that simplifies table management while improving performance. AWS analytics services such as Amazon EMR, AWS Glue, Amazon Athena, and Amazon Redshift include native support for Apache Iceberg, so you can build transactional data lakes on top of Amazon Simple Storage Service (Amazon S3).

This technical guide explains how to get started with Apache Iceberg on different AWS services. It also includes best practices and recommendations for running Apache Iceberg on AWS at scale while optimizing cost and performance.

This guide is for anyone who uses Apache Iceberg on AWS, from novice users who want to get started quickly to advanced users who want to optimize and tune their existing Apache Iceberg workloads.

In this guide: