Determining the migration approach - AWS Prescriptive Guidance

Determining the migration approach

To decide on a migration approach, use the analysis of existing patterns that you performed in the previous phase. Your organization’s future data and analytics needs are equally important considerations. Traditional on-premises ETL tools are designed for relational data models and structured data. If you also need to process semi-structured and unstructured data, you can use AWS services such as AWS Glue or Amazon EMR for the migration. Other factors that can influence the migration approach include:

  • Whether you want to use a graphical interface (such as AWS Glue Studio) or a custom framework (such as Spark/Python libraries)

  • Whether you have secure access to on-premises sources and AWS targets

  • Skills and training required for the team

  • Audit and compliance requirements

You can select from three migration approaches: big bang, phased, and lift and shift. The following table compares these three approaches.

Approach: Big bang
Description: Migrate all SSIS packages within a specific time period.
Use case:

  • Complexity, scope, and target architecture are clear.

  • Team has the required skills, or the learning curve is shallow.

Advantages and disadvantages:

  • High risk.

  • Takes less time than the phased approach.

  • You can use AWS Glue, Amazon EMR, or custom frameworks.

Approach: Phased
Description: Identify one SSIS package for each distinct pattern and complexity. Migrate the package to AWS, test, and compare results with existing architecture.
Use case:

  • Time is not a constraint.

  • You want different designs for different ETL patterns.

Advantages and disadvantages:

  • Less risky than the big bang approach but takes more time and effort.

  • You can use AWS Glue, Amazon EMR, or custom frameworks.

Approach: Lift and shift
Description: Migrate the current architecture as is to AWS.
Use case:

  • Your on-premises hardware is no longer supported.

  • You don’t have the resources to plan a migration immediately.

Advantages and disadvantages:

  • Least amount of migration effort and time required.

  • The problems with the existing solution remain on AWS.

  • SSIS packages are run as is. No other ETL tools or frameworks are needed.

Comparing data on the source and target systems is fundamental to a successful migration. Because the existing production system receives regular updates from source systems, the data can change while you compare it, which makes the results hard to interpret. For this reason, when you’re determining your migration approach, we recommend that you also decide on your data validation strategy. A validation strategy can include the following steps:

  • Take backups of all applicable databases and files from the production environment on the source system at a specific date and time.

  • Take backups of all databases from the production environment on the target system after all jobs have successfully loaded data from the backed-up source data.

  • Restore the source data in a testing environment, and run the new jobs.

  • Agree on an acceptable percentage of differences between the source and target (old and new) databases. For example, you might decide that a difference of less than 1% is acceptable.

  • List all the validation rules to be covered.

  • Automate the comparison as much as possible, and cover all the rules.
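
The automated comparison in the last steps could look something like the following minimal Python sketch. It assumes that rows from the old and new databases have already been loaded into dictionaries keyed by primary key. The function names `mismatch_percentage` and `within_tolerance`, and the row layout, are hypothetical and chosen only for illustration. The 1% threshold is the example value mentioned above, not a recommendation.

```python
from typing import Dict, Hashable, Tuple

Row = Tuple  # one row, represented as a tuple of column values (assumption)


def mismatch_percentage(source: Dict[Hashable, Row],
                        target: Dict[Hashable, Row]) -> float:
    """Return the percentage of keys whose rows differ or exist on one side only."""
    all_keys = source.keys() | target.keys()
    if not all_keys:
        return 0.0
    # A key missing on one side yields None from .get(), which never equals a tuple.
    mismatched = sum(1 for key in all_keys if source.get(key) != target.get(key))
    return 100.0 * mismatched / len(all_keys)


def within_tolerance(source: Dict[Hashable, Row],
                     target: Dict[Hashable, Row],
                     threshold_pct: float = 1.0) -> bool:
    """True if the mismatch percentage is strictly below the agreed threshold."""
    return mismatch_percentage(source, target) < threshold_pct


if __name__ == "__main__":
    # Hypothetical data: 200 rows from the old system, one row differs in the new one.
    old_rows = {i: (i, "active") for i in range(200)}
    new_rows = dict(old_rows)
    new_rows[5] = (5, "inactive")
    print(mismatch_percentage(old_rows, new_rows))
    print(within_tolerance(old_rows, new_rows))
```

A real validation suite would likely run one such check per validation rule (row counts, key coverage, column-level values, aggregates) against the restored backups, so that ongoing production updates don't affect the result.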