VACUUM - Amazon Athena

VACUUM

The VACUUM statement performs table maintenance on Apache Iceberg tables by removing data files that are no longer needed.

Note

VACUUM is transactional and is supported only for Apache Iceberg tables in Athena engine version 3.

Synopsis

To remove data files no longer needed for an Iceberg table, use the following syntax.

VACUUM target_table
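
As a minimal illustration of the syntax above, the following statement runs VACUUM on a table named iceberg_table. The table name is hypothetical; substitute the name of your own Iceberg table.

```sql
-- iceberg_table is a hypothetical table name used only for illustration.
VACUUM iceberg_table
```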

We recommend running the VACUUM statement on Iceberg tables to remove data files that are no longer relevant and to reduce metadata size and storage consumption. Because the VACUUM statement makes API calls to Amazon S3, charges apply for the associated requests.

Warning

VACUUM expires snapshots. After a snapshot has expired, you can no longer time travel to it.
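
Before running VACUUM, it can help to check which snapshots exist, because any snapshot that VACUUM expires stops being a valid time travel target. The sketch below assumes a hypothetical table named iceberg_table. It also assumes that the table's $history metadata table and the FOR TIMESTAMP AS OF time travel clause are available for Iceberg tables in your Athena engine version. Run each statement separately.

```sql
-- List the snapshots currently recorded for the (hypothetical) table.
SELECT made_current_at, snapshot_id
FROM "iceberg_table$history"
ORDER BY made_current_at

-- Time travel to the table state as of one day ago.
-- A query like this is expected to fail if it targets a snapshot
-- that VACUUM has already expired.
SELECT *
FROM iceberg_table
FOR TIMESTAMP AS OF (current_timestamp - interval '1' day)
```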

VACUUM performs the following operations:

  • Removes snapshots that are older than the amount of time that is specified by the vacuum_max_snapshot_age_seconds table property. By default, this property is set to 432000 seconds (5 days).

  • Removes snapshots that fall outside the retention period and exceed the number specified by the vacuum_min_snapshots_to_keep table property. The default is 1.

    You can specify these table properties in your CREATE TABLE statement. After the table has been created, you can use the ALTER TABLE SET TBLPROPERTIES statement to update them.

  • Removes any metadata and data files that are unreachable as a result of the snapshot removal. You can configure the number of old metadata files to be retained by setting the vacuum_max_metadata_files_to_keep table property. The default value is 100.

  • Removes orphan files that are older than the time specified in the vacuum_max_snapshot_age_seconds table property. Orphan files are files in the table's data directory that are not part of the table state.
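
The table properties listed above can be set as sketched below. The table name, columns, S3 location, and property values are all hypothetical and chosen only for illustration. The example assumes the TBLPROPERTIES form that Athena uses for Iceberg table properties. Values are given in seconds, so 3 days is 3 * 24 * 60 * 60 = 259200 and 7 days is 604800. Run each statement separately.

```sql
-- Create a (hypothetical) Iceberg table that keeps snapshots for 3 days
-- and always retains at least the 3 most recent snapshots.
CREATE TABLE iceberg_table (
  id bigint,
  data string
)
LOCATION 's3://amzn-s3-demo-bucket/iceberg-folder/'
TBLPROPERTIES (
  'table_type' = 'ICEBERG',
  'vacuum_max_snapshot_age_seconds' = '259200',
  'vacuum_min_snapshots_to_keep' = '3'
)

-- Later, extend snapshot retention to 7 days and keep fewer
-- old metadata files than the default of 100.
ALTER TABLE iceberg_table SET TBLPROPERTIES (
  'vacuum_max_snapshot_age_seconds' = '604800',
  'vacuum_max_metadata_files_to_keep' = '50'
)
```

The new values apply the next time VACUUM runs on the table. A longer snapshot age keeps more time travel history at the cost of more storage.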

For more information about creating and managing Apache Iceberg tables in Athena, see Creating Iceberg tables and Managing Iceberg tables.