Overview of Amazon RDS Blue/Green Deployments for Aurora

By using Amazon RDS Blue/Green Deployments, you can make and test database changes before implementing them in a production environment. A blue/green deployment creates a staging environment that copies the production environment. In a blue/green deployment, the blue environment is the current production environment. The green environment is the staging environment. The staging environment stays in sync with the current production environment using logical replication.

You can make changes to the Aurora DB cluster in the green environment without affecting production workloads. For example, you can upgrade the major or minor DB engine version or change database parameters in the staging environment. You can thoroughly test changes in the green environment. When ready, you can switch over the environments to promote the green environment to be the new production environment. The switchover typically takes under a minute with no data loss and no need for application changes.

Because the green environment is a copy of the topology of the production environment, the DB cluster and all of its DB instances are copied in the deployment. The green environment also includes the features used by the DB cluster, such as DB cluster snapshots, Performance Insights, Enhanced Monitoring, and Aurora Serverless v2.

Note

Blue/Green Deployments are supported for Aurora MySQL and Aurora PostgreSQL. For Amazon RDS availability, see Using Amazon RDS Blue/Green Deployments for database updates in the Amazon RDS User Guide.

Region and version availability

Feature availability and support vary across specific versions of each database engine, and across AWS Regions. For more information, see Supported Regions and Aurora DB engines for Blue/Green Deployments.

Benefits of using Amazon RDS Blue/Green Deployments

By using Amazon RDS Blue/Green Deployments, you can stay current on security patches, improve database performance, and adopt newer database features with short, predictable downtime. Blue/green deployments reduce the risks and downtime for database updates, such as major or minor engine version upgrades.

Blue/green deployments provide the following benefits:

  • Easily create a production-ready staging environment.

  • Automatically replicate database changes from the production environment to the staging environment.

  • Test database changes in a safe staging environment without affecting the production environment.

  • Stay current with database patches and system updates.

  • Implement and test newer database features.

  • Switch over your staging environment to be the new production environment without changes to your application.

  • Safely switch over through the use of built-in switchover guardrails.

  • Eliminate data loss during switchover.

  • Switch over quickly, typically under a minute depending on your workload.

Workflow of a blue/green deployment

Complete the following major steps when you use a blue/green deployment for Aurora DB cluster updates.

  1. Identify a production DB cluster that requires updates.

    The following image shows an example of a production DB cluster.

    [Figure: Production (blue) Aurora DB cluster in a blue/green deployment]

  2. Create the blue/green deployment. For instructions, see Creating a blue/green deployment.

    The following image shows an example of a blue/green deployment of the production environment from step 1. While creating the blue/green deployment, RDS copies the complete topology and configuration of the Aurora DB cluster to create the green environment. The names of the copied DB cluster and DB instances are appended with -green-random-characters. The staging environment in the image contains the DB cluster (auroradb-green-abc123). It also contains the three DB instances in the DB cluster (auroradb-instance1-green-abc123, auroradb-instance2-green-abc123, and auroradb-instance3-green-abc123).

    [Figure: Blue/green deployment for Amazon Aurora]

    When you create the blue/green deployment, you can specify a higher DB engine version and a different DB cluster parameter group for the DB cluster in the green environment. You can also specify a different DB parameter group for the DB instances in the DB cluster.

    RDS also configures replication from the primary DB instance in the blue environment to the primary DB instance in the green environment.

    Important

    For Aurora MySQL version 3, after you create the blue/green deployment, the DB cluster in the green environment allows write operations by default. We recommend that you make the DB cluster read-only by setting the read_only parameter to 1 and rebooting the cluster.

  3. Make changes to the staging environment.

    For example, you might make schema changes to your database or change the DB instance class used by one or more DB instances in the green environment.

    For information about modifying a DB cluster, see Modifying an Amazon Aurora DB cluster.

  4. Test your staging environment.

    During testing, we recommend that you keep your databases in the green environment read only. Enable write operations on the green environment with caution because they can result in replication conflicts. They can also result in unintended data in the production databases after switchover. To enable write operations for Aurora MySQL, set the read_only parameter to 0, then reboot the DB instance. For Aurora PostgreSQL, set the default_transaction_read_only parameter to off at the session level.

  5. When ready, switch over to promote the staging environment to be the new production environment. For instructions, see Switching a blue/green deployment.

    The switchover results in downtime. The downtime is usually under one minute, but it can be longer depending on your workload.

    The following image shows the DB clusters after the switchover.

    [Figure: DB cluster and DB instances after switching over an Amazon Aurora blue/green deployment]

    After the switchover, the Aurora DB cluster in the green environment becomes the new production DB cluster. The names and endpoints in the current production environment are assigned to the newly promoted production environment, requiring no changes to your application. As a result, your production traffic now flows to the new production environment. The DB cluster and DB instances in the blue environment are renamed by appending -oldn to the current name, where n is a number. For example, assume the name of the DB instance in the blue environment is auroradb-instance-1. After switchover, the DB instance name might be auroradb-instance-1-old1.

    In the example in the image, the following changes occur during switchover:

    • The green environment DB cluster auroradb-green-abc123 becomes the production DB cluster named auroradb.

    • The green environment DB instance named auroradb-instance1-green-abc123 becomes the production DB instance auroradb-instance1.

    • The green environment DB instance named auroradb-instance2-green-abc123 becomes the production DB instance auroradb-instance2.

    • The green environment DB instance named auroradb-instance3-green-abc123 becomes the production DB instance auroradb-instance3.

    • The blue environment DB cluster named auroradb becomes auroradb-old1.

    • The blue environment DB instance named auroradb-instance1 becomes auroradb-instance1-old1.

    • The blue environment DB instance named auroradb-instance2 becomes auroradb-instance2-old1.

    • The blue environment DB instance named auroradb-instance3 becomes auroradb-instance3-old1.

  6. If you no longer need a blue/green deployment, you can delete it. For instructions, see Deleting a blue/green deployment.

    After switchover, the previous production environment isn't deleted so that you can use it for regression testing, if necessary.
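
The create and switchover steps above can also be scripted. The following is a minimal sketch, not a definitive implementation. It assumes that rds is a boto3 RDS client (for example, boto3.client("rds")) and that the status strings match those listed in the RDS API reference. The function names, polling interval, and timeout values are illustrative choices that don't come from this guide.

```python
import time

# Statuses treated as unrecoverable while polling (assumed from the RDS API reference).
FAILURE_STATUSES = {"INVALID_CONFIGURATION", "SWITCHOVER_FAILED"}


def create_deployment(rds, name, source_cluster_arn,
                      target_engine_version=None,
                      target_cluster_parameter_group=None):
    """Create a blue/green deployment and return its identifier."""
    kwargs = {"BlueGreenDeploymentName": name, "Source": source_cluster_arn}
    # Optional settings for the green environment (step 2 in the workflow).
    if target_engine_version:
        kwargs["TargetEngineVersion"] = target_engine_version
    if target_cluster_parameter_group:
        kwargs["TargetDBClusterParameterGroupName"] = target_cluster_parameter_group
    response = rds.create_blue_green_deployment(**kwargs)
    return response["BlueGreenDeployment"]["BlueGreenDeploymentIdentifier"]


def wait_for_status(rds, deployment_id, wanted, poll_seconds=30, max_polls=120):
    """Poll the deployment until it reaches the wanted status."""
    for _ in range(max_polls):
        response = rds.describe_blue_green_deployments(
            BlueGreenDeploymentIdentifier=deployment_id
        )
        status = response["BlueGreenDeployments"][0]["Status"]
        if status == wanted:
            return status
        if status in FAILURE_STATUSES:
            raise RuntimeError(f"Deployment {deployment_id} entered status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Deployment {deployment_id} did not reach {wanted}")


def switch_over(rds, deployment_id, timeout_seconds=300, poll_seconds=30):
    """Start the switchover (step 5) and wait for it to complete."""
    rds.switchover_blue_green_deployment(
        BlueGreenDeploymentIdentifier=deployment_id,
        SwitchoverTimeout=timeout_seconds,
    )
    return wait_for_status(rds, deployment_id, "SWITCHOVER_COMPLETED",
                           poll_seconds=poll_seconds)
```

In this sketch, you would call create_deployment, wait for the AVAILABLE status, make and test your changes in the green environment, and only then call switch_over. The client is passed in as a parameter so that the functions can be exercised with a stub before you run them against a real account.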

Authorizing access to blue/green deployment operations

Users must have the required permissions to perform operations related to blue/green deployments. You can create IAM policies that grant users and roles permission to perform specific API operations on the specified resources they need. You can then attach those policies to the IAM permission sets or roles that require those permissions. For more information, see Identity and access management for Amazon Aurora.

The user who creates a blue/green deployment must have permissions to perform the following RDS operations:

  • rds:AddTagsToResource

  • rds:CreateDBCluster

  • rds:CreateDBInstance

  • rds:CreateDBClusterEndpoint

The user who switches over a blue/green deployment must have permissions to perform the following RDS operations:

  • rds:ModifyDBCluster

  • rds:PromoteReadReplicaDBCluster

The user who deletes a blue/green deployment must have permissions to perform the following RDS operations:

  • rds:DeleteDBCluster

  • rds:DeleteDBInstance

  • rds:DeleteDBClusterEndpoint

Aurora provisions and modifies resources in the staging environment on your behalf. These resources include DB instances that use an internally defined naming convention. Therefore, attached IAM policies can't contain partial resource name patterns such as my-db-prefix-*. Only wildcards (*) are supported. In general, we recommend using resource tags and other supported attributes to control access to these resources, rather than wildcards. For more information, see Actions, resources, and condition keys for Amazon RDS.
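
As an illustration, the following sketch assembles an IAM policy document from the three operation lists above. It uses the wildcard resource because partial name patterns aren't supported. The function name and statement IDs are hypothetical, and the policy covers only the operations named in this section. Your environment might need additional actions, or tag-based conditions in place of the bare wildcard.

```python
import json

# RDS operations listed in this section, grouped by task.
CREATE_ACTIONS = [
    "rds:AddTagsToResource",
    "rds:CreateDBCluster",
    "rds:CreateDBInstance",
    "rds:CreateDBClusterEndpoint",
]
SWITCHOVER_ACTIONS = [
    "rds:ModifyDBCluster",
    "rds:PromoteReadReplicaDBCluster",
]
DELETE_ACTIONS = [
    "rds:DeleteDBCluster",
    "rds:DeleteDBInstance",
    "rds:DeleteDBClusterEndpoint",
]


def build_blue_green_policy():
    """Return a policy document that allows the listed operations.

    Resource is "*" because partial name patterns such as my-db-prefix-*
    aren't supported for resources that Aurora names internally.
    """
    def statement(sid, actions):
        return {"Sid": sid, "Effect": "Allow", "Action": actions, "Resource": "*"}

    return {
        "Version": "2012-10-17",
        "Statement": [
            statement("BlueGreenCreate", CREATE_ACTIONS),
            statement("BlueGreenSwitchover", SWITCHOVER_ACTIONS),
            statement("BlueGreenDelete", DELETE_ACTIONS),
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_blue_green_policy(), indent=2))
```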

Considerations for blue/green deployments

Amazon RDS tracks resources in blue/green deployments with the DbiResourceId and DbClusterResourceId of each resource. This resource ID is an AWS Region-unique, immutable identifier for the resource.

The resource ID is separate from the DB cluster ID:

[Figure: Create blue/green deployment]

The name (cluster ID) of a resource changes when you switch over a blue/green deployment, but each resource keeps the same resource ID. For example, a DB cluster identifier might have been mycluster in the blue environment. After switchover, the same DB cluster might be renamed to mycluster-old1. However, the resource ID of the DB cluster doesn't change during switchover. So, when the green resources are promoted to be the new production resources, their resource IDs don't match the blue resource IDs that were previously in production.
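
The following sketch models this behavior in plain Python. It makes no AWS calls. The Cluster class, the simulate_switchover function, and the resource ID values are hypothetical and exist only to show that names move between environments while resource IDs stay with the underlying resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str         # DB cluster identifier; changes at switchover
    resource_id: str  # DbClusterResourceId; never changes


def simulate_switchover(blue, green, n=1):
    """Return (new_production, old_production) as they look after switchover."""
    # The green cluster takes over the production name but keeps its own resource ID.
    new_production = Cluster(name=blue.name, resource_id=green.resource_id)
    # The blue cluster is renamed with the -oldn suffix and keeps its resource ID.
    old_production = Cluster(name=f"{blue.name}-old{n}", resource_id=blue.resource_id)
    return new_production, old_production


blue = Cluster("mycluster", "cluster-BLUEEXAMPLE")
green = Cluster("mycluster-green-abc123", "cluster-GREENEXAMPLE")
new_production, old_production = simulate_switchover(blue, green)
```

After the simulated switchover, new_production has the name mycluster but the resource ID cluster-GREENEXAMPLE. Anything that filtered on cluster-BLUEEXAMPLE now points at mycluster-old1.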

After switching over a blue/green deployment, consider updating the resource IDs to those of the newly promoted production resources for integrated features and services that you used with the production resources. Specifically, consider the following updates:

  • If you perform filtering using the RDS API and resource IDs, adjust the resource IDs used in filtering after switchover.

  • If you use CloudTrail for auditing resources, adjust the consumers of the CloudTrail logs to track the new resource IDs after switchover. For more information, see Monitoring Amazon Aurora API calls in AWS CloudTrail.

  • If you use Database Activity Streams for resources in the blue environment, adjust your application to monitor database events for the new stream after switchover. For more information, see Supported Regions and Aurora DB engines for database activity streams.

  • If you use the Performance Insights API, adjust the resource IDs in calls to the API after switchover. For more information, see Monitoring DB load with Performance Insights on Amazon Aurora.

    You can monitor a database with the same name after switchover, but it doesn't contain the data from before the switchover.

  • If you use resource IDs in IAM policies, make sure you add the resource IDs of the newly promoted resources when necessary. For more information, see Identity and access management for Amazon Aurora.

  • If you have IAM roles associated with your DB cluster, make sure to reassociate them after switchover. Attached roles aren't automatically copied to the green environment.

  • If you authenticate to your DB cluster using IAM database authentication, make sure that the IAM policy used for database access has both the blue and the green databases listed under the Resource element of the policy. This is required in order to connect to the green database after switchover. For more information, see Creating and using an IAM policy for IAM database access.

  • If you want to restore a manual DB cluster snapshot for a DB cluster that was part of a blue/green deployment, make sure you restore the correct DB cluster snapshot by examining the time when the snapshot was taken. For more information, see Restoring from a DB cluster snapshot.

  • Amazon Aurora creates the green environment by cloning the underlying Aurora storage volume in the blue environment. The green cluster volume only stores incremental changes made to the green environment. If you delete the DB cluster in the blue environment, the size of the underlying Aurora storage volume in the green environment grows to the full size. For more information, see Cloning a volume for an Amazon Aurora DB cluster.

  • When you add a DB instance to the DB cluster in the green environment of a blue/green deployment, the new DB instance won't replace a DB instance in the blue environment when you switch over. However, the new DB instance is retained in the DB cluster and becomes a DB instance in the new production environment.

  • When you delete a DB instance in the DB cluster in the green environment of a blue/green deployment, you can't create a new DB instance to replace it in the blue/green deployment.

    If you create a new DB instance with the same name and ARN as the deleted DB instance, it has a different DbiResourceId, so it isn't part of the green environment.

    The following behavior results if you delete a DB instance in the DB cluster in the green environment:

    • If the DB instance in the blue environment with the same name exists, it won't be switched over to the DB instance in the green environment. This DB instance won't be renamed by appending -oldn to the DB instance name.

    • Any application that points to the DB instance in the blue environment continues to use the same DB instance after switchover.
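
For the IAM database authentication item in the preceding list, the following sketch builds a policy that lists both the blue and the green DB cluster resource IDs under the Resource element. The function name and the example values are hypothetical. The ARN layout follows the rds-db:connect format described in the IAM database access documentation.

```python
def build_db_auth_policy(region, account_id, db_user, cluster_resource_ids):
    """Return an IAM policy that allows rds-db:connect to every listed cluster."""
    resources = [
        f"arn:aws:rds-db:{region}:{account_id}:dbuser:{resource_id}/{db_user}"
        for resource_id in cluster_resource_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "rds-db:connect", "Resource": resources}
        ],
    }


# Hypothetical values: list the blue and the green resource IDs together so that
# connections keep working after the green cluster is promoted.
db_auth_policy = build_db_auth_policy(
    region="us-east-1",
    account_id="123456789012",
    db_user="app_user",
    cluster_resource_ids=["cluster-BLUEEXAMPLE", "cluster-GREENEXAMPLE"],
)
```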

Best practices for blue/green deployments

The following are best practices for blue/green deployments:

General best practices

  • Thoroughly test the Aurora DB cluster in the green environment before switching over.

  • Keep your databases in the green environment read only. We recommend that you enable write operations on the green environment with caution because they can result in replication conflicts. They can also result in unintended data in the production databases after switchover.

  • When using a blue/green deployment to implement schema changes, make only replication-compatible changes.

    For example, you can add new columns at the end of a table without disrupting replication from the blue deployment to the green deployment. However, schema changes, such as renaming columns or renaming tables, break replication to the green deployment.

    For more information about replication-compatible changes, see Replication with Differing Table Definitions on Source and Replica in the MySQL documentation and Restrictions in the PostgreSQL logical replication documentation.

  • Use the cluster endpoint, reader endpoint, or custom endpoint for all connections in both environments. Don't use instance endpoints or custom endpoints with static or exclusion lists.

  • When you switch over a blue/green deployment, follow the switchover best practices. For more information, see Switchover best practices.

Aurora PostgreSQL best practices

  • Monitor the Aurora PostgreSQL logical replication write-through cache and make adjustments to the cache buffer if necessary. For more information, see Managing the Aurora PostgreSQL logical replication write-through cache.

  • If your database has sufficient freeable memory, increase the value of the logical_decoding_work_mem DB parameter in the blue environment. Doing so allows for less decoding on disk and instead uses memory. You can monitor freeable memory with the FreeableMemory CloudWatch metric. For more information, see Amazon CloudWatch metrics for Amazon Aurora.

  • Update all of your PostgreSQL extensions to the latest version before you create a blue/green deployment. For more information, see Upgrading PostgreSQL extensions.

  • If you’re using the aws_s3 extension, make sure to give the green DB cluster access to Amazon S3 through an IAM role after the green environment is created. This allows the import and export commands to continue functioning after switchover. For instructions, see Setting up access to an Amazon S3 bucket.

  • If you specify a higher engine version for the green environment, run the ANALYZE operation on all databases to refresh the pg_statistic table. Optimizer statistics aren't transferred during a major version upgrade, so you must regenerate all statistics to avoid performance issues. For additional best practices during major version upgrades, see How to perform a major version upgrade.

  • Avoid configuring triggers as ENABLE REPLICA or ENABLE ALWAYS if the trigger is used on the source to manipulate data. Otherwise, the replication system propagates changes and executes the trigger, which leads to duplication.

  • Long-running transactions can cause significant replica lag. To reduce replica lag, consider doing the following:

    • Reduce long-running transactions that can be delayed until after the green environment catches up to the blue environment.

    • Initiate a manual vacuum freeze operation on busy tables prior to creating the blue/green deployment.

    • For PostgreSQL version 12 and higher, disable the index_cleanup parameter on large or busy tables to increase the rate of normal maintenance on blue databases.

  • Slow replication can cause senders and receivers to restart often, which delays synchronization. To ensure that they remain active, disable timeouts by setting the wal_sender_timeout parameter to 0 in the blue environment, and the wal_receiver_timeout parameter to 0 in the green environment.

Limitations for blue/green deployments

The following limitations apply to blue/green deployments.

General limitations for blue/green deployments

The following general limitations apply to blue/green deployments:

  • Aurora MySQL versions 2.08 and 2.09 aren't supported as upgrade source or target versions.

  • You can't stop and start a cluster that is part of a blue/green deployment.

  • Blue/green deployments don't support managing master user passwords with AWS Secrets Manager.

  • If you create a blue/green deployment from an Aurora MySQL source DB cluster that has backtrack enabled, the green DB cluster is created without backtracking support. This is because backtracking doesn't work with binary log (binlog) replication, which is required for blue/green deployments. For more information, see Backtracking an Aurora DB cluster.

    If you attempt to force a backtrack on the blue DB cluster, the blue/green deployment breaks and switchover is blocked.

  • For Aurora MySQL, the source DB cluster can't contain any databases named tmp. Databases with this name will not be copied to the green environment.

  • For Aurora PostgreSQL, unlogged tables aren't replicated to the green environment unless the rds.logically_replicate_unlogged_tables parameter is set to 1 on the blue DB cluster. We recommend that you don't modify this parameter value after you create a blue/green deployment to avoid possible replication errors on unlogged tables.

  • For Aurora PostgreSQL, the blue environment DB cluster can't be a self-managed logical source (publisher) or replica (subscriber). For Aurora MySQL, the blue environment DB cluster can't be an external binlog replica.

  • During switchover, the blue and green environments can't have zero-ETL integrations with Amazon Redshift. You must delete the integration first and switch over, then recreate the integration.

  • The Event Scheduler (event_scheduler parameter) must be disabled on the green environment when you create a blue/green deployment. This prevents events from being generated in the green environment and causing inconsistencies.

  • Any Aurora Auto Scaling policies that are defined on the blue DB cluster aren't copied to the green environment.

  • Blue/green deployments don't support the AWS JDBC Driver for MySQL. For more information, see Known Limitations on GitHub.

  • Blue/green deployments aren't supported for the following features:

    • Amazon RDS Proxy

    • Cross-Region read replicas

    • Aurora Serverless v1 DB clusters

    • DB clusters that are part of an Aurora global database

    • Babelfish for Aurora PostgreSQL

    • AWS CloudFormation
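
You can check for some of these limitations before you create a deployment. The following sketch inspects a single DB cluster description, such as one entry from the DBClusters list that the boto3 describe_db_clusters call returns. It covers only four of the limitations above. The function name is hypothetical, and the sketch assumes that the EngineVersion, BacktrackWindow, EngineMode, and MasterUserSecret keys have the shapes shown in the comments.

```python
def preflight_issues(cluster):
    """Return a list of findings for one DB cluster description (a dict)."""
    issues = []

    # Assumption: Aurora MySQL version 2 engine versions look like
    # "5.7.mysql_aurora.2.09.2".
    version = cluster.get("EngineVersion", "")
    if "mysql_aurora.2.08" in version or "mysql_aurora.2.09" in version:
        issues.append("Aurora MySQL 2.08 and 2.09 aren't supported as source or target versions")

    # Assumption: BacktrackWindow is the backtrack window in seconds; 0 or absent means disabled.
    if cluster.get("BacktrackWindow", 0) > 0:
        issues.append("Backtrack is enabled; the green DB cluster is created without it")

    # Assumption: EngineMode is "serverless" for Aurora Serverless v1 DB clusters.
    if cluster.get("EngineMode") == "serverless":
        issues.append("Aurora Serverless v1 DB clusters aren't supported")

    # Assumption: MasterUserSecret is present only when Secrets Manager manages the password.
    if cluster.get("MasterUserSecret"):
        issues.append("Master user passwords managed with AWS Secrets Manager aren't supported")

    return issues
```

An empty list from this function doesn't mean that the DB cluster is eligible. It means only that none of the four checked conditions was found.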

PostgreSQL extension limitations for blue/green deployments

The following limitations apply to PostgreSQL extensions:

  • The pg_partman extension must be disabled on the blue environment when you create a blue/green deployment. The extension performs DDL operations such as CREATE TABLE, which break logical replication from the blue environment to the green environment.

  • The pg_cron extension must remain disabled on all green databases after the blue/green deployment is created. The extension has background workers that run as superuser and bypass the read-only setting of the green environment, which might cause replication conflicts.

  • The apg_plan_mgmt extension must have the apg_plan_mgmt.capture_plan_baselines parameter set to off on all green databases to avoid primary key conflicts if an identical plan is captured in the blue environment. For more information, see Overview of Aurora PostgreSQL query plan management.

    If you want to capture execution plans in Aurora Replicas, you must provide the blue DB cluster endpoint when calling the apg_plan_mgmt.create_replica_plan_capture function. This ensures that plan captures continue to work after switchover. For more information, see Capturing Aurora PostgreSQL execution plans in Replicas.

  • If the blue DB cluster is configured as the foreign server of a foreign data wrapper (FDW) extension, you must use the cluster endpoint name instead of IP addresses. This allows the configuration to remain functional after switchover.

  • The pglogical and pg_active extensions must be disabled on the blue environment when you create a blue/green deployment. After you promote the green environment to be the new production environment, you can enable the extensions again. In addition, the blue database can’t be a logical subscriber of an external instance.

  • If you're using the pgAudit extension, it must remain in the shared libraries (shared_preload_libraries) on the custom DB parameter groups for both the blue and the green DB instances. For more information, see Setting up the pgAudit extension.

Limitations for changes in blue/green deployments

The following are limitations for changes in a blue/green deployment:

  • You can't change an unencrypted DB cluster into an encrypted DB cluster.

  • You can't change an encrypted DB cluster into an unencrypted DB cluster.

  • You can't change a blue environment DB cluster to a higher engine version than its corresponding green environment DB cluster.

  • The resources in the blue environment and green environment must be in the same AWS account.

  • If the blue environment contains any Aurora Auto Scaling policies, these policies aren't copied over to the green environment. You must manually re-add the policies to the green environment.

PostgreSQL logical replication limitations for blue/green deployments

Blue/green deployments use logical replication to keep the staging environment in sync with the production environment. PostgreSQL has certain restrictions related to logical replication, which translate to limitations when creating blue/green deployments for Aurora PostgreSQL DB clusters.

The following table describes logical replication limitations that apply to blue/green deployments for Aurora PostgreSQL.

Limitation: Data definition language (DDL) statements, such as CREATE TABLE and CREATE SCHEMA, aren't replicated from the blue environment to the green environment.

Explanation: If Aurora detects a DDL change in the blue environment, your green databases enter a state of Replication degraded. You receive an event notifying you that DDL changes in the blue environment can't be replicated to the green environment. You must delete the blue/green deployment and all green databases, then recreate it. Otherwise, you won't be able to switch over the blue/green deployment.

Limitation: NEXTVAL operations on sequence objects aren't synchronized between the blue environment and the green environment.

Explanation: During switchover, Aurora increments sequence values in the green environment to match those in the blue environment. If you have thousands of sequences, this can delay switchover.

Limitation: Creation or modification of large objects in the blue environment isn't replicated to the green environment.

Explanation: If Aurora detects the creation or modification of large objects in the blue environment that are stored in the pg_largeobject system table, your green databases enter a state of Replication degraded. Aurora generates an event notifying you that large object changes in the blue environment can't be replicated to the green environment. You must delete the blue/green deployment and all green databases, then recreate it. Otherwise, you won't be able to switch over the blue/green deployment.

Limitation: Materialized views aren’t automatically refreshed on the green environment.

Explanation: Refreshing materialized views in the blue environment doesn't refresh them in the green environment. After switchover, you can schedule a refresh of materialized views.

Limitation: UPDATE and DELETE operations aren't permitted on tables that don't have a primary key.

Explanation: Before you create a blue/green deployment, make sure that all tables in the DB cluster have a primary key.

For more information, see Restrictions in the PostgreSQL logical replication documentation.
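
To check the primary key requirement before you create a blue/green deployment, you can query the PostgreSQL system catalogs. The following is a minimal sketch. It assumes that cursor is a DB-API cursor (for example, from psycopg2) connected to one database in the blue DB cluster. The function name is hypothetical. Run the check once for each database.

```python
# Lists ordinary and partitioned tables outside the system schemas
# that have no primary key constraint.
TABLES_WITHOUT_PRIMARY_KEY_SQL = """
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'p')
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
      SELECT 1
      FROM pg_constraint con
      WHERE con.conrelid = c.oid
        AND con.contype = 'p'
  )
ORDER BY n.nspname, c.relname
"""


def tables_without_primary_key(cursor):
    """Return schema-qualified names of tables that lack a primary key."""
    cursor.execute(TABLES_WITHOUT_PRIMARY_KEY_SQL)
    return [f"{schema}.{table}" for schema, table in cursor.fetchall()]
```

If the function returns any tables, add a primary key to each one before you create the deployment.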