Fast Recovery After Failover with Cluster Cache Management for Aurora PostgreSQL

To speed up recovery of the writer DB instance in your Aurora PostgreSQL clusters after a failover, use cluster cache management for Amazon Aurora PostgreSQL. Cluster cache management helps maintain application performance after a failover.

In a typical failover situation, you might see a temporary but large performance degradation after failover. This degradation occurs because when the failover DB instance starts, the buffer cache is empty. An empty cache is also known as a cold cache. A cold cache degrades performance because the DB instance has to read from the slower disk, instead of taking advantage of values stored in the buffer cache.

With cluster cache management, you set a specific reader DB instance as the failover target. Cluster cache management ensures that the data in the designated reader's cache is kept synchronized with the data in the writer DB instance's cache. The designated reader's cache with prefilled values is known as a warm cache. If a failover occurs, the designated reader uses values in its warm cache immediately when it's promoted to the new writer DB instance. This approach provides your application much better recovery performance.

Configuring Cluster Cache Management

Note

Cluster cache management is supported for Aurora PostgreSQL DB clusters of versions 9.6.11 and above, and versions 10.5 and above.

To configure cluster cache management, take the following steps.

Note

Allow at least 1 minute after completing these steps for cluster cache management to be fully operational.

Enable Cluster Cache Management

To enable cluster cache management for a DB cluster, modify its parameter group by setting the apg_ccm_enabled parameter to 1, as described in the following procedure.

To enable cluster cache management

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.

  2. In the navigation pane, choose Parameter groups.

  3. In the list, choose the parameter group for your Aurora PostgreSQL DB cluster.

    The DB cluster must use a parameter group other than the default, because you can't change values in a default parameter group.

  4. For Parameter group actions, choose Edit.

  5. Set the value of the apg_ccm_enabled cluster parameter to 1.

  6. Choose Save changes.

To enable cluster cache management for an Aurora PostgreSQL DB cluster, use the AWS CLI modify-db-cluster-parameter-group command with the following required parameters:

  • --db-cluster-parameter-group-name

  • --parameters

Example

For Linux, macOS, or Unix:

aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name my-db-cluster-parameter-group \
    --parameters "ParameterName=apg_ccm_enabled,ParameterValue=1,ApplyMethod=immediate"

For Windows:

aws rds modify-db-cluster-parameter-group ^
    --db-cluster-parameter-group-name my-db-cluster-parameter-group ^
    --parameters "ParameterName=apg_ccm_enabled,ParameterValue=1,ApplyMethod=immediate"
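After changing the parameter, it can be useful to confirm the value that the parameter group now holds. The following is a minimal sketch, not part of the original procedure: it wraps the AWS CLI describe-db-cluster-parameters command in a helper function. The function name ccm_enabled_value is hypothetical, and the parameter group name is the placeholder from the example above. JSON output is used so that the --query filter is applied to the full, paginated result.

```shell
# Print the current value of apg_ccm_enabled for a DB cluster parameter group.
# Argument 1: the DB cluster parameter group name (a placeholder here).
# ccm_enabled_value is a hypothetical helper name, not an AWS command.
ccm_enabled_value() {
  aws rds describe-db-cluster-parameters \
    --db-cluster-parameter-group-name "$1" \
    --query "Parameters[?ParameterName=='apg_ccm_enabled'].ParameterValue" \
    --output json
}

# Usage (requires configured AWS credentials):
#   ccm_enabled_value my-db-cluster-parameter-group
```

If the change was applied, the returned JSON array should contain the value 1.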

Set the Promotion Tier Priority for the Writer DB Instance

Make sure that the promotion priority is tier-0 for the writer DB instance of the Aurora PostgreSQL DB cluster. The promotion tier priority is a value that specifies the order in which an Aurora reader is promoted to the writer DB instance after a failure. Valid values are 0–15, where 0 is the first priority and 15 is the last priority. For more information about the promotion tier, see Fault Tolerance for an Aurora DB Cluster.

To set the promotion priority for the writer DB instance to tier-0

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.

  2. In the navigation pane, choose Databases.

  3. Choose the Writer DB instance of the Aurora PostgreSQL DB cluster.

  4. Choose Modify. The Modify DB Instance page appears.

  5. On the Failover panel, choose tier-0 for the Priority.

  6. Choose Continue and check the summary of modifications.

  7. To apply the changes immediately after you save them, choose Apply immediately.

  8. Choose Modify DB Instance to save your changes.

To set the promotion tier priority to 0 for the writer DB instance using the AWS CLI, call the modify-db-instance command with the following required parameters:

  • --db-instance-identifier

  • --promotion-tier

  • --apply-immediately

Example

For Linux, macOS, or Unix:

aws rds modify-db-instance \
    --db-instance-identifier writer-db-instance \
    --promotion-tier 0 \
    --apply-immediately

For Windows:

aws rds modify-db-instance ^
    --db-instance-identifier writer-db-instance ^
    --promotion-tier 0 ^
    --apply-immediately

Set the Promotion Tier Priority for a Reader DB Instance

You set one reader DB instance for cluster cache management. To do so, choose a reader from the Aurora PostgreSQL cluster that is the same instance class as the writer DB instance. Then set its promotion tier priority to 0.

The promotion tier priority is a value that specifies the order in which an Aurora reader is promoted to the writer DB instance after a failure. Valid values are 0–15, where 0 is the first priority and 15 is the last priority.
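To pick a reader with the same instance class as the writer, it helps to list every instance in the cluster with its class and current promotion tier. The following sketch assumes a cluster identifier that you supply. The function name list_cluster_instances is hypothetical, and the output doesn't show which instance is currently the writer.

```shell
# List identifier, instance class, and promotion tier for each DB instance
# in one Aurora DB cluster, one instance per line (tab-separated).
# Argument 1: the DB cluster identifier (a placeholder here).
# list_cluster_instances is a hypothetical helper name, not an AWS command.
list_cluster_instances() {
  aws rds describe-db-instances \
    --filters "Name=db-cluster-id,Values=$1" \
    --query "DBInstances[].[DBInstanceIdentifier,DBInstanceClass,PromotionTier]" \
    --output text
}

# Usage (requires configured AWS credentials):
#   list_cluster_instances my-aurora-cluster
```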

To set the promotion priority of the reader DB instance to tier-0

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.

  2. In the navigation pane, choose Databases.

  3. Choose a Reader DB instance of the Aurora PostgreSQL DB cluster that is the same instance class as the writer DB instance.

  4. Choose Modify. The Modify DB Instance page appears.

  5. On the Failover panel, choose tier-0 for the Priority.

  6. Choose Continue and check the summary of modifications.

  7. To apply the changes immediately after you save them, choose Apply immediately.

  8. Choose Modify DB Instance to save your changes.

To set the promotion tier priority to 0 for the reader DB instance using the AWS CLI, call the modify-db-instance command with the following required parameters:

  • --db-instance-identifier

  • --promotion-tier

  • --apply-immediately

Example

For Linux, macOS, or Unix:

aws rds modify-db-instance \
    --db-instance-identifier reader-db-instance \
    --promotion-tier 0 \
    --apply-immediately

For Windows:

aws rds modify-db-instance ^
    --db-instance-identifier reader-db-instance ^
    --promotion-tier 0 ^
    --apply-immediately
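Once both instances have been modified, you might want to check that the configuration meets the requirements described earlier: the writer and the designated reader are both in tier 0 and use the same instance class. The following is a sketch under those assumptions. The function names are hypothetical, and the instance identifiers are the placeholders from the examples above.

```shell
# Print "<instance class><TAB><promotion tier>" for one DB instance.
instance_class_and_tier() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query "DBInstances[0].[DBInstanceClass,PromotionTier]" \
    --output text
}

# Succeed only when both instances are in tier 0 and share an instance class.
# Arguments: writer identifier, reader identifier.
# Returns 2 if either describe call fails.
ccm_pair_ok() {
  writer_info="$(instance_class_and_tier "$1")" || return 2
  reader_info="$(instance_class_and_tier "$2")" || return 2
  # Text output is tab-separated; read splits it into class and tier.
  read -r writer_class writer_tier <<EOF
$writer_info
EOF
  read -r reader_class reader_tier <<EOF
$reader_info
EOF
  [ "$writer_tier" = "0" ] && [ "$reader_tier" = "0" ] &&
    [ "$writer_class" = "$reader_class" ]
}

# Usage (requires configured AWS credentials):
#   ccm_pair_ok writer-db-instance reader-db-instance && echo "pair looks ready"
```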

Monitoring the Buffer Cache

After setting up cluster cache management, you can monitor the state of synchronization between the writer DB instance's buffer cache and the designated reader's warm buffer cache. To examine the buffer cache contents on both the writer DB instance and the designated reader DB instance, use the PostgreSQL pg_buffercache module. For more information, see the PostgreSQL pg_buffercache documentation.
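As an illustration, the following sketch runs a query adapted from the PostgreSQL pg_buffercache documentation through psql. It lists the ten relations in the current database that occupy the most buffers. It assumes that the pg_buffercache extension is already installed in the database (CREATE EXTENSION pg_buffercache) and that you pass a connection string for the instance you want to inspect. The function name buffercache_top_relations is hypothetical. Running it against the writer endpoint and against the designated reader's instance endpoint lets you compare the two caches.

```shell
# Show the relations that occupy the most shared buffers on one instance.
# Argument 1: a psql connection string or database name (supplied by you).
# Assumes the pg_buffercache extension is installed in that database.
# buffercache_top_relations is a hypothetical helper name.
buffercache_top_relations() {
  psql -X -At -d "$1" -c "
    SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
    WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                                WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;"
}

# Usage (host names are placeholders):
#   buffercache_top_relations "host=<writer-endpoint> dbname=mydb user=myuser"
#   buffercache_top_relations "host=<reader-instance-endpoint> dbname=mydb user=myuser"
```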

Using the aurora_ccm_status Function

Cluster cache management also provides the aurora_ccm_status function. Use the aurora_ccm_status function on the writer DB instance to get the following information about the progress of cache warming on the designated reader:

  • buffers_sent_last_minute – How many buffers have been sent to the designated reader in the last minute.

  • buffers_sent_last_scan – How many buffers have been sent to the designated reader during the last complete scan of the buffer cache.

  • buffers_found_last_scan – How many buffers have been identified as frequently accessed and needed to be sent during the last complete scan of the buffer cache. Buffers already cached on the designated reader aren't sent.

  • buffers_sent_current_scan – How many buffers have been sent so far during the current scan.

  • buffers_found_current_scan – How many buffers have been identified as frequently accessed in the current scan.

  • current_scan_progress – How many buffers have been visited so far during the current scan.

The following example shows how to use the aurora_ccm_status function to convert some of its output into a warm rate and warm percentage.

SELECT buffers_sent_last_minute * 8 / 60 AS warm_rate_kbps,
       100 * (1.0 - buffers_sent_last_scan::float / NULLIF(buffers_found_last_scan, 0)) AS warm_percent
FROM aurora_ccm_status();
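To make the arithmetic in the query above concrete, the following sketch reproduces it outside the database. It assumes that each buffer is 8 KB, which is the PostgreSQL default block size and is what the factor of 8 in the query implies, so the warm rate is in kilobytes per second. The function name ccm_warm_stats and the sample values are hypothetical.

```shell
# Compute the warm rate (KB per second) and warm percentage from three
# aurora_ccm_status values. Assumes 8 KB buffers.
# Arguments: buffers_sent_last_minute buffers_sent_last_scan buffers_found_last_scan
# ccm_warm_stats is a hypothetical helper name.
ccm_warm_stats() {
  awk -v sent_min="$1" -v sent_scan="$2" -v found_scan="$3" 'BEGIN {
    rate = sent_min * 8 / 60   # buffers per minute -> KB per second
    if (found_scan + 0 > 0)
      printf "%.1f %.1f\n", rate, 100 * (1 - sent_scan / found_scan)
    else
      printf "%.1f n/a\n", rate   # no buffers found: percentage undefined
  }'
}

ccm_warm_stats 7500 250 1000   # prints "1000.0 75.0"
```

In this sample, 7,500 buffers sent in the last minute correspond to 1,000 KB per second. Because only 250 of the 1,000 frequently accessed buffers had to be sent during the last scan, the other 75 percent were already cached on the designated reader.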