Migrating to Amazon Redshift
If you decide to migrate from an existing data warehouse to Amazon Redshift, the migration strategy you choose depends on several factors:
- The size of the database and its tables and objects
- Network bandwidth between the source server and AWS
- Whether the migration and switchover to AWS will be done in one step, or a sequence of steps over time
- The data change rate in the source system
- Transformations during migration
- The partner tool that you plan to use for migration and ETL
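The first two factors, database size and network bandwidth, can be combined into a rough transfer-time estimate when weighing an online transfer against an offline one. The sketch below is illustrative only: the function name `transfer_days` and the `utilization` parameter (the fraction of the link assumed to be usable for the migration) are assumptions, not part of any AWS tool.

```python
def transfer_days(size_tb: float, bandwidth_mbps: float, utilization: float = 0.8) -> float:
    """Estimate the days needed to move size_tb terabytes over a network link.

    utilization is an assumed fraction of the nominal bandwidth that the
    migration can actually use; adjust it for your own environment.
    """
    size_bits = size_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bps = bandwidth_mbps * 1e6 * utilization  # usable bits per second
    seconds = size_bits / effective_bps
    return seconds / 86_400                             # seconds per day


if __name__ == "__main__":
    # Hypothetical example: a 10 TB warehouse over a 1 Gbps link.
    print(f"{transfer_days(size_tb=10, bandwidth_mbps=1000):.2f} days")
```

If the estimate runs to weeks or months, that is a signal to consider an offline transfer or a multi-step strategy rather than a single online copy.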
One-step migration
One-step migration is a good option for small databases that don’t
require continuous operation. Customers can extract existing
databases as comma-separated value (CSV) files, or in a columnar
format such as Parquet, then use services such as
AWS Snowball
Two-step migration
Two-step migration is commonly used for databases of any size:
- Initial data migration: The data is extracted from the source database, preferably during non-peak usage to minimize the impact. The data is then migrated to Amazon Redshift by following the one-step migration approach described previously.
- Changed data migration: Data that changed in the source database after the initial data migration is propagated to the destination before switchover. This step synchronizes the source and destination databases.
After all the changed data is migrated, you can validate the data in the destination database and perform the necessary tests. If all tests pass, switch over to the Amazon Redshift data warehouse.
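One simple way to picture the changed data migration step is a timestamp watermark: record when the initial extract was taken, then select only rows modified after that point. The sketch below uses an in-memory SQLite database as a stand-in for the source system. The `orders` table, its `updated_at` column, and the `fetch_changed_rows` helper are all hypothetical, and the source does not prescribe this mechanism. A watermark on a modification column also does not capture deleted rows.

```python
import sqlite3


def fetch_changed_rows(conn, watermark):
    """Return rows from the hypothetical orders table changed after watermark."""
    cur = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    )
    return cur.fetchall()


# Stand-in source database (assumption: rows carry an ISO 8601 updated_at value).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10.0, "2024-01-01T00:00:00"),
        (2, 25.5, "2024-01-03T09:30:00"),
        (3, 7.25, "2024-01-04T18:45:00"),
    ],
)

# Time at which the initial data migration extract was taken.
initial_load_time = "2024-01-02T00:00:00"

changed = fetch_changed_rows(source, initial_load_time)

# Advance the watermark so the next pass only picks up newer changes.
new_watermark = max(row[2] for row in changed) if changed else initial_load_time

for row in changed:
    print(row)
```

Repeating this pass until the set of changed rows is small keeps the final synchronization before switchover short.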
Wave-based migration
Large-scale massively parallel processing (MPP) data warehouse
migrations are complex projects and carry more risk. Breaking a
complex migration project into multiple logical and systematic
waves can significantly reduce that complexity and risk. Start
with a workload of medium complexity that covers a good number of
data sources and subject areas, then add more data sources and
subject areas in each subsequent wave. See
Develop an application migration methodology to modernize your data warehouse with Amazon Redshift
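As a rough illustration of wave planning, the sketch below orders candidate workloads so that medium-complexity workloads with many data sources come first, then splits the list into fixed-size waves. The `Workload` fields, the 1 to 5 complexity scale, and the ordering heuristic are assumptions made for this example, not a method defined by AWS.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    complexity: int    # hypothetical scale: 1 (simple) to 5 (complex)
    data_sources: int  # number of data sources the workload touches


def plan_waves(workloads, wave_size, medium=3):
    """Group workloads into waves, medium-complexity and broad workloads first."""
    ordered = sorted(
        workloads,
        key=lambda w: (abs(w.complexity - medium), -w.data_sources),
    )
    return [ordered[i:i + wave_size] for i in range(0, len(ordered), wave_size)]


# Hypothetical workload inventory.
workloads = [
    Workload("finance_reports", complexity=5, data_sources=12),
    Workload("sales_dashboard", complexity=3, data_sources=8),
    Workload("web_logs", complexity=1, data_sources=2),
    Workload("inventory", complexity=3, data_sources=5),
    Workload("marketing", complexity=2, data_sources=4),
]

waves = plan_waves(workloads, wave_size=2)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {[w.name for w in wave]}")
```

In practice the grouping would also account for dependencies between workloads, which this sketch ignores.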