Migration options summary

This table summarizes the main characteristics and considerations for each migration option.

Feature	In-place migration (snapshot)	In-place migration (migrate)	Full data migration (CTAS, or CREATE TABLE + INSERT)
Data layout improvements as part of the migration process
Re-sort data	No	No	Yes
Change partitioning (for example, to use Iceberg hidden partitioning)	No	No	Yes
Change table schema	No	No	Yes
Optimize file size	No	No	Yes
Validate the schema of existing data before adding the data	No	No	Yes
Supported file formats	Parquet, Avro, ORC	Parquet, Avro, ORC	Parquet, Avro, ORC, JSON, CSV
Source table replacement by an Iceberg table	No (creates a new table, but with additional steps you can replace the source table)	Yes (creates a backup table and substitutes the source table with an Iceberg table)	No (creates a new table)
Source table impact
File deletion operations on the Iceberg table (`expire_snapshots` operations, dropping a table with purge)	Corrupts source table	Corrupts backup table	Safe; source table unaffected
Iceberg table impact
Impact if source table files are removed	Corrupts Iceberg table	Corrupts Iceberg table	No impact on Iceberg table
Impact if new files are added to the source table location	Not visible in the new table (you must register the files with `add_files`)	Not visible in the new table (you must register the files with `add_files`)	Not visible in the new table (you must `INSERT INTO` the new table)
Cost	Low	Low	Higher (full data rewrite)
Migration speed	Fast	Fast	Slower
Can be used to migrate to Amazon S3 Tables	No	No	Yes
Requires manual DDL	No (schema and partitions are copied from source table)	No (schema and partitions are copied from source table)	Yes (with CTAS, you only need to specify the partitioning)
Best use	Quick migration without rewriting data, allowing side-by-side use of Hive and Iceberg for testing or gradual transition.	Replacing a Hive table in place without rewriting data, when an immediate switchover is acceptable.	Full Iceberg optimization with data rewrite. Ideal when redesigning partitions or schema, or improving layout and performance. Recommended whenever feasible.
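
As a rough illustration, the decision logic in the table can be sketched in Python. This is a minimal sketch under stated assumptions, not an official tool: the `Requirements` fields, the function names, and the catalog and table names are hypothetical. The generated statements follow the general shape of the Apache Iceberg Spark procedures (`snapshot`, `migrate`) and a Spark CTAS statement, so verify them against your engine and catalog configuration before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical checklist of migration needs (field names are illustrative)."""
    needs_layout_changes: bool = False  # re-sort, repartition, schema change, file sizing
    source_format: str = "parquet"      # file format of the source table
    target_is_s3_tables: bool = False   # destination is Amazon S3 Tables
    replace_source_table: bool = False  # Iceberg table should take over the source table


# Per the table, in-place migration supports only these formats.
IN_PLACE_FORMATS = {"parquet", "avro", "orc"}


def choose_option(req: Requirements) -> str:
    """Return 'snapshot', 'migrate', or 'full', following the summary table."""
    if (req.needs_layout_changes
            or req.target_is_s3_tables
            or req.source_format.lower() not in IN_PLACE_FORMATS):
        # Only a full data migration rewrites data, reads JSON/CSV,
        # or can target Amazon S3 Tables. It always creates a new table.
        return "full"
    # Both in-place options leave data files untouched; they differ in
    # whether the source table is replaced.
    return "migrate" if req.replace_source_table else "snapshot"


def build_statement(option: str, catalog: str, source: str, target: str = "") -> str:
    """Build an illustrative Spark SQL statement for the chosen option.

    All identifiers are placeholders supplied by the caller.
    """
    if option == "snapshot":
        return (f"CALL {catalog}.system.snapshot("
                f"source_table => '{source}', table => '{target}')")
    if option == "migrate":
        return f"CALL {catalog}.system.migrate(table => '{source}')"
    if option == "full":
        # CTAS sketch; add a PARTITIONED BY clause to change partitioning.
        return f"CREATE TABLE {target} USING iceberg AS SELECT * FROM {source}"
    raise ValueError(f"unknown option: {option}")
```

For example, `choose_option(Requirements(source_format="csv"))` selects the full data migration because CSV is not supported in place, whereas the default requirements select `snapshot`, which leaves the source table usable side by side with the new Iceberg table.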
