Migrating existing tables to Iceberg - AWS Prescriptive Guidance

This section focuses on migrating your existing Hive-style tables to Iceberg format. It applies to tables that use traditional Hive-compatible formats such as Apache Parquet or Apache ORC. This information doesn't apply to tables that already use modern table formats such as Linux Foundation Delta Lake or Apache Hudi.

To migrate your current Hive-style tables to Iceberg format, you can use either in-place migration or full data migration:

  • In-place migration generates Iceberg metadata files on top of the existing data files, without rewriting the data itself.

  • Full data migration creates the Iceberg metadata layer and also rewrites existing data files from the original table to the new Iceberg table.
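
As a rough illustration of the difference between the two methods, the following sketch builds the kind of Spark SQL statement that each one typically involves. It assumes Spark with the Iceberg SQL extensions enabled. The in-place example uses Iceberg's migrate procedure and the full data migration example uses CREATE TABLE AS SELECT (CTAS). These are only two of the available options, and the helper function names, catalog names, and table names are hypothetical.

```python
# Minimal sketch: build (but do not run) the Spark SQL for each migration method.
# The helper names, catalog names, and table names are hypothetical placeholders.


def in_place_migration_sql(catalog: str, source_table: str) -> str:
    """In-place: Iceberg's `migrate` procedure creates Iceberg metadata
    over the existing data files and replaces the source table."""
    return f"CALL {catalog}.system.migrate('{source_table}')"


def full_migration_sql(catalog: str, source_table: str, target_table: str) -> str:
    """Full data migration: CTAS reads the source table and writes
    new data files along with the Iceberg metadata layer."""
    return (
        f"CREATE TABLE {catalog}.{target_table} USING iceberg "
        f"AS SELECT * FROM {source_table}"
    )


if __name__ == "__main__":
    # Assumed names; substitute your own catalog and tables.
    print(in_place_migration_sql("spark_catalog", "db.sales"))
    print(full_migration_sql("glue_catalog", "db.sales", "db.sales_iceberg"))
```

In a Spark session configured for Iceberg, you could pass either string to spark.sql(). Test against a copy of the table first, because the migrate procedure replaces the source table.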

The following sections provide a detailed overview of each migration method, including step-by-step instructions and considerations for implementation. For more information about these migration strategies, see the Table Migration section of the Iceberg documentation.

After you review the details of the in-place and full data migration methods, see the following two sections to help you decide between them:

  • Choosing a migration strategy provides guidance through a series of questions and scenarios, to help you determine the most suitable migration approach based on your specific requirements and use cases.

  • Migration options summary provides a table that compares the key characteristics and considerations of the migration options. Use it as a quick reference for the technical trade-offs between methods.