Determining the migration approach
To decide on a migration approach, use the analysis of existing patterns that you performed in the previous phase. Your organization’s future data and analytics needs are equally important considerations. Traditional on-premises ETL tools deal with relational data models and structured data. If you also have semi-structured or unstructured data to process, you can use AWS services such as AWS Glue or Amazon EMR for the migration. Other factors that can influence the migration approach include:
- Whether you want to use a graphical interface (such as AWS Glue Studio) or a custom framework (such as Spark/Python libraries)
- Whether you have secure access to on-premises sources and AWS targets
- Skills and training required for the team
- Audit and compliance requirements
You can select from three migration approaches: big bang, phased, and lift and shift. The following table compares these three approaches.
Approach | Description | Use case | Advantages and disadvantages |
---|---|---|---|
Big bang | Migrate all SSIS packages within a specific time period. | | |
Phased | Identify one SSIS package for each distinct pattern and complexity. Migrate the package to AWS, test, and compare results with existing architecture. | | |
Lift and shift | Migrate the current architecture as is to AWS. | | |
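The first step of the phased approach, choosing one SSIS package for each distinct pattern and complexity, can be sketched in a few lines of Python. This is only an illustration: the `Package` record, the pattern and complexity labels, and the package names are hypothetical and would come from your own analysis of existing patterns.

```python
from typing import NamedTuple


class Package(NamedTuple):
    """Hypothetical inventory record produced by the analysis phase."""
    name: str
    pattern: str      # for example "full load" or "incremental load"
    complexity: str   # for example "low", "medium", or "high"


def pick_pilot_packages(packages):
    """Return the first package seen for each (pattern, complexity) pair."""
    pilots = {}
    for pkg in packages:
        # setdefault keeps the first package recorded for a given pair.
        pilots.setdefault((pkg.pattern, pkg.complexity), pkg)
    return list(pilots.values())


# Illustrative inventory; names and labels are made up.
inventory = [
    Package("load_customers.dtsx", "full load", "low"),
    Package("load_products.dtsx", "full load", "low"),
    Package("load_orders.dtsx", "incremental load", "high"),
    Package("load_invoices.dtsx", "incremental load", "medium"),
]

pilots = pick_pilot_packages(inventory)
for pkg in pilots:
    print(f"{pkg.pattern} / {pkg.complexity}: {pkg.name}")
```

Each selected package is then migrated, tested, and compared with the existing architecture before the remaining packages that share its pattern are migrated.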
A comparison of data on the source and target systems is fundamental for a successful migration. Because the existing production system receives regular updates from source systems, the data can change between comparisons, which makes differences hard to interpret. For this reason, when you’re determining your migration approach, we recommend that you also decide on your data validation strategy. For example, a validation strategy can include the following steps:
- Take backups of all applicable databases and files from the production environment on the source system at a specific date and time.
- Take backups of all databases from the production environment on the target system after all jobs have successfully loaded data from backed up source data.
- Restore the source data in a testing environment, and run the new jobs.
- Agree on a percentage of valid differences between the source and target (old and new) databases. For example, you might decide that a difference of less than 1% is acceptable.
- List all the validation rules to be covered.
- Automate the comparison as much as possible, and cover all the rules.
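As a minimal sketch of automating the comparison, the following Python function compares rows from the old and new databases and checks the result against an agreed tolerance. It assumes that each table has already been read into a dictionary that maps a primary key to a tuple of column values. The function name `compare_tables`, the `tolerance_pct` parameter, and the sample data are illustrative assumptions, not part of any AWS service or library.

```python
def compare_tables(source_rows, target_rows, tolerance_pct=1.0):
    """Compare two tables given as {primary_key: tuple_of_column_values}.

    A row counts as a difference if it is missing from the target,
    exists only in the target, or has different column values.
    """
    all_keys = set(source_rows) | set(target_rows)
    missing_in_target = sorted(k for k in source_rows if k not in target_rows)
    extra_in_target = sorted(k for k in target_rows if k not in source_rows)
    mismatched = sorted(
        k for k in source_rows
        if k in target_rows and source_rows[k] != target_rows[k]
    )
    diff_count = len(missing_in_target) + len(extra_in_target) + len(mismatched)
    diff_pct = 100.0 * diff_count / len(all_keys) if all_keys else 0.0
    return {
        "missing_in_target": missing_in_target,
        "extra_in_target": extra_in_target,
        "mismatched": mismatched,
        "diff_pct": diff_pct,
        # "Less than" matches the example rule of a difference under 1%.
        "within_tolerance": diff_pct < tolerance_pct,
    }


# Illustrative data: 200 rows, with one value changed on the target side.
source = {i: (i, f"name{i}") for i in range(200)}
target = dict(source)
target[7] = (7, "changed")

result = compare_tables(source, target)
print(result["mismatched"], result["diff_pct"], result["within_tolerance"])
```

In practice, each validation rule that you listed (row counts, aggregates, null checks, and so on) would become a check of this kind, and all checks would run against the same source and target backups.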