Automate cross-Region failover and failback by using DR Orchestrator Framework

Created by Jitendra Kumar (AWS), Oliver Francis (AWS), and Pavithra Balasubramanian (AWS)

Code repository: aws-cross-region-dr-databases	Environment: Production	Technologies: Databases; Infrastructure; Migration; Modernization
AWS services: Amazon Aurora; AWS CloudFormation; Amazon ElastiCache; Amazon RDS; AWS Step Functions

Summary

This pattern describes how to use DR Orchestrator Framework to orchestrate and automate the manual, error-prone steps to perform disaster recovery across Amazon Web Services (AWS) Regions. The pattern covers the following databases:

Amazon Relational Database Service (Amazon RDS) for MySQL, Amazon RDS for PostgreSQL, or Amazon RDS for MariaDB
Amazon Aurora MySQL-Compatible Edition or Amazon Aurora PostgreSQL-Compatible Edition (using a centralized file)
Amazon ElastiCache (Redis OSS)

To demonstrate the functionality of DR Orchestrator Framework, you create two DB instances or clusters. The primary is in the AWS Region us-east-1, and the secondary is in us-west-2. To create these resources, you use the AWS CloudFormation templates in the App-Stack folder of the aws-cross-region-dr-databases GitHub repository.

Prerequisites and limitations

General prerequisites

DR Orchestrator Framework deployed in both primary and secondary AWS Regions
Two Amazon Simple Storage Service (Amazon S3) buckets, one in each Region
A virtual private cloud (VPC) with two subnets and an AWS security group

Engine-specific prerequisites

Amazon Aurora – At least one Aurora global database must be available in two AWS Regions. You can use us-east-1 as the primary Region, and use us-west-2 as the secondary Region.
Amazon ElastiCache (Redis OSS) – An ElastiCache global datastore must be available in two AWS Regions. You can use us-east-1 as the primary Region, and use us-west-2 as the secondary Region.

Amazon RDS limitations

DR Orchestrator Framework doesn't check the replication lag before doing a failover or failback. Replication lag must be checked manually.
This solution has been tested using a primary database instance with one read replica. If you want to use more than one read replica, test the solution thoroughly before implementing it in a production environment.
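
Because the framework does not check replication lag, a manual check before promotion could be scripted along the following lines. This is a minimal sketch, not part of DR Orchestrator Framework: it assumes the read replica publishes the `ReplicaLag` metric to Amazon CloudWatch in the `AWS/RDS` namespace, and the 5-second threshold and the replica identifier in the usage comment are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def latest_replica_lag_seconds(cloudwatch, replica_id, window_minutes=5):
    """Return the most recent ReplicaLag datapoint in seconds, or None if there is no data.

    `cloudwatch` is a Boto3 CloudWatch client for the Region that hosts the replica.
    """
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="ReplicaLag",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": replica_id}],
        StartTime=end - timedelta(minutes=window_minutes),
        EndTime=end,
        Period=60,
        Statistics=["Maximum"],
    )
    # Datapoints are not guaranteed to be ordered, so sort by timestamp first.
    datapoints = sorted(response["Datapoints"], key=lambda d: d["Timestamp"])
    return datapoints[-1]["Maximum"] if datapoints else None


def safe_to_promote(lag_seconds, max_lag_seconds=5.0):
    """Treat missing data as unsafe so that a silent replica blocks promotion.

    The default threshold is an assumption; choose one that matches your RPO.
    """
    return lag_seconds is not None and lag_seconds <= max_lag_seconds


# Example usage (requires boto3 and AWS credentials; the identifier is hypothetical):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")
#   lag = latest_replica_lag_seconds(cloudwatch, "my-read-replica-id")
#   print(lag, safe_to_promote(lag))
```

Passing the client in as an argument keeps the check easy to run against either Region and easy to exercise with a stub.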

Aurora limitations

Feature availability and support vary across specific versions of each database engine and across AWS Regions. For more information on feature and Region availability for cross-Region replication, see Cross-Region read replicas.
Aurora global databases have specific configuration requirements for supported Aurora DB instance classes and the maximum number of AWS Regions. For more information, see Configuration requirements of an Amazon Aurora global database.
This solution has been tested using a primary database instance with one read replica. If you want to use more than one read replica, test the solution thoroughly before implementing it in a production environment.

ElastiCache limitations

For information about Region availability for Global Datastore and ElastiCache configuration requirements, see Prerequisites and limitations in the ElastiCache documentation.

Amazon RDS product versions

Amazon RDS supports the following engine versions:

MySQL – Amazon RDS supports DB instances running the following versions of MySQL: MySQL 8.0 and MySQL 5.7
PostgreSQL – For information about supported versions of Amazon RDS for PostgreSQL, see Available PostgreSQL database versions.
MariaDB – Amazon RDS supports DB instances running the following versions of MariaDB:
- MariaDB 10.11
- MariaDB 10.6
- MariaDB 10.5

Aurora product versions

Amazon Aurora global database switchover requires Aurora MySQL-Compatible with MySQL 5.7 compatibility, version 2.09.1 or higher.
For more information, see Limitations of Amazon Aurora global databases.

ElastiCache (Redis OSS) product versions

Amazon ElastiCache (Redis OSS) supports the following Redis versions:

Redis 7.1 (enhanced)
Redis 7.0 (enhanced)
Redis 6.2 (enhanced)
Redis 6.0 (enhanced)
Redis 5.0.6 (enhanced)

For more information, see Supported ElastiCache (Redis OSS) versions.

Architecture

Amazon RDS architecture

The Amazon RDS architecture includes the following resources:

The primary Amazon RDS DB instance created in the primary Region (us-east-1) with read/write access for clients
An Amazon RDS read replica created in the secondary Region (us-west-2) with read-only access for clients
DR Orchestrator Framework deployed in both the primary and secondary Regions

Diagram of two-Region RDS architecture in a single AWS account.

The diagram shows the following:

Asynchronous replication between the primary instance and the secondary instance
Read/write access for clients in the primary Region
Read-only access for clients in the secondary Region

Aurora architecture

The Amazon Aurora architecture includes the following resources:

The primary Aurora DB cluster created in the primary Region (us-east-1) with an active-writer endpoint
An Aurora DB cluster created in the secondary Region (us-west-2) with an inactive-writer endpoint
DR Orchestrator Framework deployed in both the primary and secondary Regions

Diagram of two-Region Aurora deployment in a single AWS account.

The diagram shows the following:

Asynchronous replication between the primary cluster and the secondary cluster
The primary DB cluster with an active-writer endpoint
The secondary DB cluster with an inactive-writer endpoint
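
To confirm which Region currently holds the active-writer endpoint before or after a switchover, you could inspect the global cluster membership. The following is a hedged sketch rather than the framework's own code: it relies on the `IsWriter` flag and `DBClusterArn` fields that the Amazon RDS `DescribeGlobalClusters` API returns for each member, and it parses the Region out of the cluster ARN.

```python
def writer_region(global_cluster):
    """Return the Region of the member flagged as writer, or None if none is flagged.

    `global_cluster` is one entry from the "GlobalClusters" list in a
    DescribeGlobalClusters response.
    """
    for member in global_cluster.get("GlobalClusterMembers", []):
        if member.get("IsWriter"):
            # ARN layout: arn:aws:rds:<region>:<account-id>:cluster:<cluster-name>
            return member["DBClusterArn"].split(":")[3]
    return None


def describe_writer_region(rds, global_cluster_id):
    """Look up the global cluster with a Boto3 RDS client and report the writer Region."""
    response = rds.describe_global_clusters(GlobalClusterIdentifier=global_cluster_id)
    return writer_region(response["GlobalClusters"][0])


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   rds = boto3.client("rds", region_name="us-east-1")
#   print(describe_writer_region(rds, "dr-globaldb-cluster-mysql"))
```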

ElastiCache (Redis OSS) architecture

The Amazon ElastiCache (Redis OSS) architecture includes the following resources:

An ElastiCache (Redis OSS) global datastore created with two clusters:
1. The primary cluster in the primary Region (us-east-1)
2. The secondary cluster in the secondary Region (us-west-2)
An Amazon cross-Region link with TLS 1.2 encryption between the two clusters
DR Orchestrator Framework deployed in both primary and secondary Regions

Diagram of a two-Region ElastiCache deployment with Amazon cross-Region link.

Automation and scale

DR Orchestrator Framework is scalable and supports the failover or failback of more than one AWS database in parallel.

You can use the following payload code to fail over multiple AWS databases in your account. In this example, three AWS databases (two global databases such as Aurora MySQL-Compatible or Aurora PostgreSQL-Compatible, and one Amazon RDS for MySQL instance) fail over to the DR Region:


{
  "StatePayload": [
    {
      "layer": 1,
      "resources": [
        {
          "resourceType": "PlannedFailoverAurora",
          "resourceName": "Switchover (planned failover) of Amazon Aurora global databases (MySQL)",
          "parameters": {
            "GlobalClusterIdentifier": "!Import dr-globaldb-cluster-mysql-global-identifier",
            "DBClusterIdentifier": "!Import dr-globaldb-cluster-mysql-cluster-identifier" 
          }
        },
        {
          "resourceType": "PlannedFailoverAurora",
          "resourceName": "Switchover (planned failover) of Amazon Aurora global databases (PostgreSQL)",
          "parameters": {
            "GlobalClusterIdentifier": "!Import dr-globaldb-cluster-postgres-global-identifier",
            "DBClusterIdentifier": "!Import dr-globaldb-cluster-postgres-cluster-identifier" 
          }
        },
        {
          "resourceType": "PromoteRDSReadReplica",
          "resourceName": "Promote RDS for MySQL Read Replica",
          "parameters": {
            "RDSInstanceIdentifier": "!Import rds-mysql-instance-identifier",
            "TargetClusterIdentifier": "!Import rds-mysql-instance-global-arn"
          }
        }         
      ]
    }
  ]
}
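
A payload like the one above can also be assembled and submitted programmatically. The sketch below is illustrative only: it assumes that every resource entry carries the three keys shown in the example (`resourceType`, `resourceName`, `parameters`), and the state machine ARN that you pass to `start_failover` is a placeholder that you would look up in your own deployment.

```python
import json

# Keys that every resource entry has in the example payload.
REQUIRED_KEYS = {"resourceType", "resourceName", "parameters"}


def build_state_payload(resources, layer=1):
    """Wrap resource entries in the StatePayload structure, validating each entry."""
    for resource in resources:
        missing = REQUIRED_KEYS - resource.keys()
        if missing:
            raise ValueError(f"resource is missing keys: {sorted(missing)}")
    return {"StatePayload": [{"layer": layer, "resources": list(resources)}]}


def start_failover(stepfunctions, state_machine_arn, payload):
    """Start a Step Functions execution with the payload as JSON input.

    `stepfunctions` is a Boto3 Step Functions client for the DR Region.
    """
    return stepfunctions.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),
    )


payload = build_state_payload([
    {
        "resourceType": "PromoteRDSReadReplica",
        "resourceName": "Promote RDS for MySQL Read Replica",
        "parameters": {
            "RDSInstanceIdentifier": "!Import rds-mysql-instance-identifier",
            "TargetClusterIdentifier": "!Import rds-mysql-instance-global-arn",
        },
    },
])
print(json.dumps(payload, indent=2))
```

Validating the entries before submission surfaces a malformed resource locally instead of partway through a state machine run.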

Tools

AWS services

Amazon Aurora is a fully managed relational database engine that's built for the cloud and compatible with MySQL and PostgreSQL.
Amazon ElastiCache helps you set up, manage, and scale distributed in-memory cache environments in the AWS Cloud. This pattern uses Amazon ElastiCache (Redis OSS).
AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use. In this pattern, Lambda functions are used by AWS Step Functions to perform the steps.
Amazon Relational Database Service (Amazon RDS) helps you set up, operate, and scale a relational database in the AWS Cloud. This pattern supports Amazon RDS for MySQL, Amazon RDS for PostgreSQL, and Amazon RDS for MariaDB.
AWS SDK for Python (Boto3) helps you integrate your Python application, library, or script with AWS services. In this pattern, Boto3 APIs are used to communicate with the database instances or global databases.
AWS Step Functions is a serverless orchestration service that helps you combine AWS Lambda functions and other AWS services to build business-critical applications. In this pattern, Step Functions state machines are used to orchestrate and run the cross-Region failover and failback of the database instances or global databases.

Code repository

The code for this pattern is available in the aws-cross-region-dr-databases repository on GitHub.

Epics

Task	Description	Skills required
Clone the GitHub repository.	To clone the repository, run the following command: `git clone https://github.com/aws-samples/aws-cross-region-dr-databases.git`	AWS DevOps, AWS administrator
Package the Lambda function code in a .zip file archive.	Create the archive files for the Lambda functions so that they include the DR Orchestrator Framework dependencies. Run the following commands: `cd <YOUR-LOCAL-GIT-FOLDER>/DR-Orchestration-artifacts` `bash scripts/deploy-orchestrator-sh.sh`	AWS administrator
Create S3 buckets.	S3 buckets are needed to store DR Orchestrator Framework along with your latest configuration. Create two S3 buckets, one in the primary Region (`us-east-1`), and one in the secondary Region (`us-west-2`): `dr-orchestrator-xxxxxx-us-east-1` `dr-orchestrator-xxxxxx-us-west-2` Replace `xxxxxx` with a random value to make the bucket names unique.	AWS administrator
Create subnets and security groups.	In both the primary Region (`us-east-1`) and the secondary Region (`us-west-2`), create two subnets and one security group for Lambda function deployment in your VPC: `subnet-XXXXXXX` `subnet-YYYYYYY` `sg-XXXXXXXXXXXX`	AWS administrator
Update the DR Orchestrator parameter files.	In the `<YOUR-LOCAL-GIT-FOLDER>/DR-Orchestration-artifacts/cloudformation` folder, update the following DR Orchestrator parameter files: `Orchestrator-Deployer-parameters-us-east-1.json` `Orchestrator-Deployer-parameters-us-west-2.json` Use the following parameter values, replacing `x` and `y` with the names of your resources: `[ { "ParameterKey": "TemplateStoreS3BucketName", "ParameterValue": "dr-orchestrator-xxxxxx-us-east-1" }, { "ParameterKey": "TemplateVPCId", "ParameterValue": "vpc-xxxxxx" }, { "ParameterKey": "TemplateLambdaSubnetID1", "ParameterValue": "subnet-xxxxxx" }, { "ParameterKey": "TemplateLambdaSubnetID2", "ParameterValue": "subnet-yyyyyy" }, { "ParameterKey": "TemplateLambdaSecurityGroupID", "ParameterValue": "sg-xxxxxxxxxx" } ]`	AWS administrator
Upload the DR Orchestrator Framework code to the S3 buckets.	The code will be safer in an S3 bucket than in the local directory. Upload the `DR-Orchestration-artifacts` directory, including all files and subfolders, to both S3 buckets. To upload the code, do the following: 1. Sign in to the AWS Management Console. 2. Navigate to the Amazon S3 console. 3. Select the `dr-orchestrator-xxxxxx-us-east-1` bucket. 4. Choose Upload, and then choose Add folder. 5. Select the `DR-Orchestration-artifacts` folder. 6. Choose Upload. 7. Select the `dr-orchestrator-xxxxxx-us-west-2` bucket. 8. Repeat steps 4–6.	AWS administrator
Deploy DR Orchestrator Framework in the primary Region.	To deploy DR Orchestrator Framework in the primary Region (`us-east-1`), run the following commands: `cd <YOUR-LOCAL-GIT-FOLDER>/DR-Orchestration-artifacts/cloudformation` `aws cloudformation deploy --region us-east-1 --stack-name dr-orchestrator --template-file Orchestrator-Deployer.yaml --parameter-overrides file://Orchestrator-Deployer-parameters-us-east-1.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback`	AWS administrator
Deploy DR Orchestrator Framework in the secondary Region.	To deploy DR Orchestrator Framework in the secondary Region (`us-west-2`), run the following commands: `cd <YOUR-LOCAL-GIT-FOLDER>/DR-Orchestration-artifacts/cloudformation` `aws cloudformation deploy --region us-west-2 --stack-name dr-orchestrator --template-file Orchestrator-Deployer.yaml --parameter-overrides file://Orchestrator-Deployer-parameters-us-west-2.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback`	AWS administrator
Verify the deployment.	If the AWS CloudFormation command runs successfully, it returns the following output: `Successfully created/updated stack - dr-orchestrator` Alternatively, you can navigate to the AWS CloudFormation console and verify the status of the `dr-orchestrator` stack.	AWS administrator
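
The verification step can also be scripted for both Regions at once. The following sketch assumes the stack name `dr-orchestrator` used in the deploy commands and treats `CREATE_COMPLETE` and `UPDATE_COMPLETE` as healthy. Because the commands use `--disable-rollback`, a failed deployment is expected to remain in a failed state such as `CREATE_FAILED` rather than roll back, which this check reports as unhealthy.

```python
# Stack statuses that indicate a successful create or update.
HEALTHY_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def stack_status(cloudformation, stack_name="dr-orchestrator"):
    """Return the StackStatus string for one stack, using a Boto3 CloudFormation client."""
    response = cloudformation.describe_stacks(StackName=stack_name)
    return response["Stacks"][0]["StackStatus"]


def deployment_report(clients_by_region, stack_name="dr-orchestrator"):
    """Map each Region name to a (status, is_healthy) tuple."""
    report = {}
    for region, client in clients_by_region.items():
        status = stack_status(client, stack_name)
        report[region] = (status, status in HEALTHY_STATUSES)
    return report


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   clients = {r: boto3.client("cloudformation", region_name=r)
#              for r in ("us-east-1", "us-west-2")}
#   for region, (status, healthy) in deployment_report(clients).items():
#       print(region, status, "OK" if healthy else "CHECK STACK EVENTS")
```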

Task	Description	Skills required
Create the database subnets and security groups.	In your VPC, create two subnets and one security group for the DB instance or global database in both the primary (`us-east-1`) and the secondary (`us-west-2`) Regions: `subnet-XXXXXX` `subnet-XXXXXX` `sg-XXXXXXXXXX`	AWS administrator
Update the parameter file for the primary DB instance or cluster.	In the `<YOUR-LOCAL-GIT-FOLDER>/App-Stack` folder, update the parameter file for the primary Region. Amazon RDS: In the `RDS-MySQL-parameter-us-east-1.json` file, update `SubnetIds` and `DBSecurityGroup` with the names of resources that you created: `{ "Parameters": { "SubnetIds": "subnet-xxxxxx,subnet-xxxxxx", "DBSecurityGroup": "sg-xxxxxxxxxx", "MySqlGlobalIdentifier": "rds-mysql-instance", "InitialDatabaseName": "mysqldb", "DBPortNumber": "3789", "PrimaryRegion": "us-east-1", "SecondaryRegion": "us-west-2", "KMSKeyAliasName": "rds/rds-mysql-instance-KmsKeyId" } }` Amazon Aurora: In the `Aurora-MySQL-parameter-us-east-1.json` file, update `SubnetIds` and `DBSecurityGroup` with the names of resources that you created: `{ "Parameters": { "SubnetIds": "subnet1-xxxxxx,subnet2-xxxxxx", "DBSecurityGroup": "sg-xxxxxxxxxx", "GlobalClusterIdentifier": "dr-globaldb-cluster-mysql", "DBClusterName": "dbcluster-01", "SourceDBClusterName": "dbcluster-02", "DBPortNumber": "3787", "DBInstanceClass": "db.r5.large", "InitialDatabaseName": "sampledb", "PrimaryRegion": "us-east-1", "SecondaryRegion": "us-west-2", "KMSKeyAliasName": "rds/dr-globaldb-cluster-mysql-KmsKeyId" } }` Amazon ElastiCache (Redis OSS): In the `ElastiCache-parameter-us-east-1.json` file, update `SubnetIds` and `DBSecurityGroup` with the names of resources that you created: `{ "Parameters": { "CacheNodeType": "cache.m5.large", "DBSecurityGroup": "sg-xxxxxxxxxx", "SubnetIds": "subnet-xxxxxx,subnet-xxxxxx", "EngineVersion": "5.0.6", "GlobalReplicationGroupIdSuffix": "demo-redis-global-datastore", "NumReplicas": "1", "NumShards": "1", "ReplicationGroupId": "demo-redis-cluster", "DBPortNumber": "3788", "TransitEncryption": "true", "KMSKeyAliasName": "elasticache/demo-redis-global-datastore-KmsKeyId", "PrimaryRegion": "us-east-1", "SecondaryRegion": "us-west-2" } }`	AWS administrator
Deploy your DB instance or cluster in the primary Region.	To deploy your instance or cluster in the primary Region (`us-east-1`), run the following commands based on your database engine. Amazon RDS: `cd <YOUR-LOCAL-GIT-FOLDER>/App-Stack` `aws cloudformation deploy --region us-east-1 --stack-name rds-mysql-app-stack --template-file RDS-MySQL-Primary.yaml --parameter-overrides file://RDS-MySQL-parameter-us-east-1.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback` Amazon Aurora: `cd <YOUR-LOCAL-GIT-FOLDER>/App-Stack` `aws cloudformation deploy --region us-east-1 --stack-name aurora-mysql-app-stack --template-file Aurora-MySQL-Primary.yaml --parameter-overrides file://Aurora-MySQL-parameter-us-east-1.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback` Amazon ElastiCache (Redis OSS): `cd <YOUR-LOCAL-GIT-FOLDER>/App-Stack` `aws cloudformation deploy --region us-east-1 --stack-name elasticache-ds-app-stack --template-file ElastiCache-Primary.yaml --parameter-overrides file://ElastiCache-parameter-us-east-1.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback` Verify that the AWS CloudFormation resources deployed successfully.	AWS administrator
Update the parameter file for the secondary DB instance or cluster.	In the `<YOUR-LOCAL-GIT-FOLDER>/App-Stack` folder, update the parameter file for the secondary Region. Amazon RDS: In the `RDS-MySQL-parameter-us-west-2.json` file, update `SubnetIds` and `DBSecurityGroup` with the names of resources that you created. Update the `PrimaryRegionKMSKeyArn` with the value of `MySQLKmsKeyId` taken from the Outputs section of the AWS CloudFormation stack for the primary DB instance: `{ "Parameters": { "SubnetIds": "subnet-aaaaaaaaa,subnet-bbbbbbbbb", "DBSecurityGroup": "sg-cccccccccc", "MySqlGlobalIdentifier": "rds-mysql-instance", "InitialDatabaseName": "mysqldb", "DBPortNumber": "3789", "PrimaryRegion": "us-east-1", "SecondaryRegion": "us-west-2", "KMSKeyAliasName": "rds/rds-mysql-instance-KmsKeyId", "PrimaryRegionKMSKeyArn": "arn:aws:kms:us-east-1:xxxxxxxxx:key/mrk-xxxxxxxxxxxxxxxxxxxxx" } }` Amazon Aurora: In the `Aurora-MySQL-parameter-us-west-2.json` file, update `SubnetIds` and `DBSecurityGroup` with the names of resources that you created. Update the `PrimaryRegionKMSKeyArn` with the value of `AuroraKmsKeyId` taken from the Outputs section of the AWS CloudFormation stack for the primary DB instance: `{ "Parameters": { "SubnetIds": "subnet1-aaaaaaaaa,subnet2-bbbbbbbbb", "DBSecurityGroup": "sg-cccccccccc", "GlobalClusterIdentifier": "dr-globaldb-cluster-mysql", "DBClusterName": "dbcluster-01", "SourceDBClusterName": "dbcluster-02", "DBPortNumber": "3787", "DBInstanceClass": "db.r5.large", "InitialDatabaseName": "sampledb", "PrimaryRegion": "us-east-1", "SecondaryRegion": "us-west-2", "KMSKeyAliasName": "rds/dr-globaldb-cluster-mysql-KmsKeyId" } }` Amazon ElastiCache (Redis OSS): In the `ElastiCache-parameter-us-west-2.json` file, update `SubnetIds` and `DBSecurityGroup` with the names of resources that you created. Update the `PrimaryRegionKMSKeyArn` with the value of `ElastiCacheKmsKeyId` taken from the Outputs section of the AWS CloudFormation stack for the primary DB instance: `{ "Parameters": { "CacheNodeType": "cache.m5.large", "DBSecurityGroup": "sg-cccccccccc", "SubnetIds": "subnet-aaaaaaaaa,subnet-bbbbbbbbb", "EngineVersion": "5.0.6", "GlobalReplicationGroupIdSuffix": "demo-redis-global-datastore", "NumReplicas": "1", "NumShards": "1", "ReplicationGroupId": "demo-redis-cluster", "DBPortNumber": "3788", "TransitEncryption": "true", "KMSKeyAliasName": "elasticache/demo-redis-global-datastore-KmsKeyId", "PrimaryRegion": "us-east-1", "SecondaryRegion": "us-west-2" } }`	AWS administrator
Deploy your DB instance or cluster in the secondary Region.	To deploy your instance or cluster in the secondary Region (`us-west-2`), run the following commands based on your database engine. Amazon RDS: `cd <YOUR-LOCAL-GIT-FOLDER>/App-Stack` `aws cloudformation deploy --region us-west-2 --stack-name rds-mysql-app-stack --template-file RDS-MySQL-DR.yaml --parameter-overrides file://RDS-MySQL-parameter-us-west-2.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback` Amazon Aurora: `cd <YOUR-LOCAL-GIT-FOLDER>/App-Stack` `aws cloudformation deploy --region us-west-2 --stack-name aurora-mysql-app-stack --template-file Aurora-MySQL-DR.yaml --parameter-overrides file://Aurora-MySQL-parameter-us-west-2.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback` Amazon ElastiCache (Redis OSS): `cd <YOUR-LOCAL-GIT-FOLDER>/App-Stack` `aws cloudformation deploy --region us-west-2 --stack-name elasticache-ds-app-stack --template-file ElastiCache-DR.yaml --parameter-overrides file://ElastiCache-parameter-us-west-2.json --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM CAPABILITY_IAM --disable-rollback` Verify that the AWS CloudFormation resources deployed successfully.	AWS administrator
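
Copying the KMS key value from the primary stack's Outputs into the secondary parameter file by hand is easy to get wrong, so it could be scripted. The sketch below is an assumption-laden illustration: it takes one stack description in the shape that the CloudFormation `DescribeStacks` API returns (a list of `OutputKey` and `OutputValue` pairs under `Outputs`) and a parameter file in the `{ "Parameters": { ... } }` shape shown in the table, and it returns an updated copy without modifying the original. The ARN in the example is a made-up placeholder.

```python
import copy
import json


def outputs_to_dict(stack):
    """Flatten a stack description's Outputs list into an {OutputKey: OutputValue} dict."""
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def with_primary_kms_key(parameter_file, stack, output_key):
    """Return a copy of the parameter file with PrimaryRegionKMSKeyArn filled in.

    `output_key` is the output name for your engine, for example "MySQLKmsKeyId".
    Raises KeyError if the primary stack does not expose that output.
    """
    updated = copy.deepcopy(parameter_file)
    updated["Parameters"]["PrimaryRegionKMSKeyArn"] = outputs_to_dict(stack)[output_key]
    return updated


# Stand-in for describe_stacks(...)["Stacks"][0] from the primary Region (placeholder ARN).
primary_stack = {
    "Outputs": [
        {"OutputKey": "MySQLKmsKeyId",
         "OutputValue": "arn:aws:kms:us-east-1:111122223333:key/mrk-example"},
    ]
}
secondary_params = {"Parameters": {"PrimaryRegion": "us-east-1", "SecondaryRegion": "us-west-2"}}

updated_params = with_primary_kms_key(secondary_params, primary_stack, "MySQLKmsKeyId")
print(json.dumps(updated_params, indent=2))
```

Writing `updated_params` back to the secondary parameter file (for example, with `json.dump`) would complete the step before you run the secondary-Region deploy command.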

Related resources

Disaster recovery strategy for databases on AWS (AWS Prescriptive Guidance strategy)
Automate your DR solution for relational databases on AWS (AWS Prescriptive Guidance guide)
Using Amazon Aurora global databases
Replication across AWS Regions using global datastores
