Migration options summary
This table summarizes the main characteristics and considerations for each migration option.
| Feature | In-place migration (snapshot) | In-place migration (migrate) | Full data migration |
|---|---|---|---|
| Data layout improvements as part of the migration process | | | |
| Supported file formats | Parquet, Avro, ORC | Parquet, Avro, ORC | Parquet, Avro, ORC, JSON, CSV |
| Source table replacement by an Iceberg table | (creates a new table, but with additional steps you can replace the source table) | (creates a backup table and substitutes the source table with an Iceberg table) | (creates a new table) |
| Source table impact | | | |
| | Corrupts source table | Corrupts backup table | Safe, source unaffected |
| Iceberg table impact | | | |
| | Corrupts Iceberg table | Corrupts Iceberg table | No impact on Iceberg table |
| | Not visible on new table (need to incorporate partition with …) | Not visible on new table (need to incorporate partition with …) | Not visible on new table (need to …) |
| Cost | Low | Low | Higher (full data rewrite) |
| Migration speed | Fast | Fast | Slower |
| Can be used to migrate to Amazon S3 Tables | | | |
| Requires manual DDL | (schema and partitions are copied from source table) | (schema and partitions are copied from source table) | If using CTAS, requires only specifying the partitioning |
| Best use | Quick migration without rewriting data, allowing side-by-side use of Hive and Iceberg for testing or gradual transition. | Replacing a Hive table in place without rewriting data, when an immediate switchover is acceptable. | Full Iceberg optimization with data rewrite. Ideal when redesigning partitions or schema, or improving layout and performance. Always recommended if possible. |
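
To make the three options more concrete, the following is a minimal sketch of the Spark SQL statement each one typically comes down to. It assumes that the two in-place options map to the Apache Iceberg Spark procedures `snapshot` and `migrate`, and that the full data migration uses a `CREATE TABLE ... AS SELECT` (CTAS) statement. The helper function names, the catalog name `spark_catalog`, and the table and column names are hypothetical and chosen only for illustration. The helpers only build SQL strings, so the sketch runs without a Spark cluster.

```python
from typing import Sequence


def snapshot_sql(catalog: str, source_table: str, new_table: str) -> str:
    """In-place option that creates a new Iceberg table over the source data files."""
    return (
        f"CALL {catalog}.system.snapshot("
        f"source_table => '{source_table}', table => '{new_table}')"
    )


def migrate_sql(catalog: str, table: str) -> str:
    """In-place option that replaces the source table with an Iceberg table."""
    return f"CALL {catalog}.system.migrate(table => '{table}')"


def ctas_sql(source_table: str, new_table: str, partition_by: Sequence[str] = ()) -> str:
    """Full data migration: rewrite all rows into a new Iceberg table."""
    partition_clause = ""
    if partition_by:
        # With CTAS, only the partitioning has to be spelled out;
        # the column definitions come from the SELECT.
        partition_clause = " PARTITIONED BY (" + ", ".join(partition_by) + ")"
    return (
        f"CREATE TABLE {new_table} USING iceberg{partition_clause} "
        f"AS SELECT * FROM {source_table}"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    statements = [
        snapshot_sql("spark_catalog", "db.sales", "db.sales_iceberg"),
        migrate_sql("spark_catalog", "db.sales"),
        ctas_sql("db.sales", "db.sales_iceberg", ["sale_date"]),
    ]
    for statement in statements:
        # In a Spark session configured with an Iceberg catalog, each
        # statement could be submitted with spark.sql(statement).
        print(statement)
```

The two in-place statements take no schema or partition arguments because, as the table notes, both are copied from the source table. Only the CTAS variant names the partition columns.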