Optimizing Iceberg tables

AWS Glue supports multiple table optimization options to improve the management and performance of Apache Iceberg tables used by AWS analytics engines and ETL jobs. These optimizers provide efficient storage utilization, improved query performance, and effective data management. Three types of table optimizers are available in AWS Glue:

Compaction – Data compaction merges small data files to reduce storage usage and improve read performance. Data files are merged and rewritten to remove obsolete data and consolidate fragmented data into larger, more efficient files. Compaction can be configured to run automatically or triggered manually as needed.
Snapshot retention – Snapshots are timestamped versions of an Iceberg table. Snapshot retention configurations let you specify how long snapshots are retained and how many snapshots are kept. Configuring a snapshot retention optimizer can help manage storage overhead by removing older, unnecessary snapshots and their associated underlying files.
Orphan file deletion – Orphan files are files that are no longer referenced by the Iceberg table metadata. These files can accumulate over time, especially after operations like table deletions or failed ETL jobs. Enabling orphan file deletion allows AWS Glue to periodically identify and remove these unnecessary files, freeing up storage.
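
The snapshot retention and orphan file ideas above can be illustrated with a small conceptual sketch. This is not how AWS Glue implements its optimizers. The function names are hypothetical, and the rule that a snapshot is expired only when it is both older than the retention period and outside the most recent snapshots to keep is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_expire(snapshot_times, now, max_age_days, min_to_keep):
    """Return the snapshot timestamps a retention policy would expire.

    Assumption for illustration: a snapshot is expired only if it is older
    than max_age_days AND is not among the min_to_keep most recent snapshots.
    """
    newest_first = sorted(snapshot_times, reverse=True)
    protected = set(newest_first[:min_to_keep])
    cutoff = now - timedelta(days=max_age_days)
    return [t for t in newest_first if t < cutoff and t not in protected]


def orphan_files(files_in_storage, files_referenced_by_metadata):
    """Return files present in storage that no table metadata references."""
    return sorted(set(files_in_storage) - set(files_referenced_by_metadata))


# Hypothetical data: four snapshots taken on 1, 3, 8, and 9 January 2024.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
snapshots = [datetime(2024, 1, day, tzinfo=timezone.utc) for day in (1, 3, 8, 9)]

expired = snapshots_to_expire(snapshots, now, max_age_days=5, min_to_keep=1)
orphans = orphan_files(
    ["data/a.parquet", "data/b.parquet", "data/c.parquet"],
    ["data/a.parquet", "data/c.parquet"],
)
```

With these inputs, the two snapshots older than five days are selected for expiry, and `data/b.parquet` is reported as an orphan because nothing references it.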

You can enable or disable compaction, snapshot retention, and orphan file deletion optimizers for individual Iceberg tables in the Data Catalog using the AWS Glue console, AWS CLI, or AWS Glue API operations.
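
As a minimal sketch of the API route, the following Python code enables all three optimizers on one table through a Glue client. It assumes the boto3 Glue client's `create_table_optimizer` operation with the optimizer types `compaction`, `retention`, and `orphan_file_deletion`; check the current API reference before relying on it. The helper names, account ID, database, table, and role ARN are hypothetical placeholders.

```python
# Optimizer type names as assumed from the Glue API; verify before use.
OPTIMIZER_TYPES = ("compaction", "retention", "orphan_file_deletion")


def build_optimizer_request(catalog_id, database, table, optimizer_type, role_arn,
                            enabled=True):
    """Build keyword arguments for one create_table_optimizer call."""
    if optimizer_type not in OPTIMIZER_TYPES:
        raise ValueError(f"unknown optimizer type: {optimizer_type}")
    return {
        "CatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "Type": optimizer_type,
        "TableOptimizerConfiguration": {"roleArn": role_arn, "enabled": enabled},
    }


def enable_all_optimizers(glue_client, catalog_id, database, table, role_arn):
    """Enable every optimizer type on one Iceberg table; return the types enabled."""
    for optimizer_type in OPTIMIZER_TYPES:
        request = build_optimizer_request(
            catalog_id, database, table, optimizer_type, role_arn
        )
        glue_client.create_table_optimizer(**request)
    return list(OPTIMIZER_TYPES)


# Usage (requires boto3 and AWS credentials; all values are placeholders):
#   import boto3
#   glue = boto3.client("glue")
#   enable_all_optimizers(
#       glue, "111122223333", "example_db", "example_iceberg_table",
#       "arn:aws:iam::111122223333:role/ExampleOptimizerRole",
#   )
```

Separating request construction from the client call keeps the payload easy to inspect or log before anything is sent to AWS.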

The following video demonstrates how to configure optimizers for Iceberg tables in the Data Catalog.
