What is Amazon Route 53 Application Recovery Controller?

Amazon Route 53 Application Recovery Controller (Route 53 ARC) provides two distinct capabilities: readiness checks and routing controls. These features give you insight into whether your applications and resources are prepared for recovery, and help you manage and coordinate failover.

Route 53 ARC provides continual readiness checks to help make sure that your applications are scaled to handle failover traffic and configured so that you can route around failures. Route 53 ARC helps you centrally coordinate failovers within an AWS Region or across multiple Regions. It provides extremely reliable routing control so you can recover applications by rerouting traffic, for example, across Availability Zones or Regions. To do this, you partition your applications into redundant failure-containment units, or replicas, called cells. The boundary of a cell can be an Availability Zone or a Region, or even a smaller unit within an Availability Zone.

The AWS Global Cloud Infrastructure provides high fault tolerance, with each AWS Region composed of multiple, fully isolated Availability Zones. Route 53 ARC works within the AWS ecosystem to help your applications be resilient. You can support highly available applications on AWS by running two redundant replicas across Availability Zones and Regions. Then you can use Amazon Route 53 to route traffic to the appropriate replica.

Typically, one application replica is active and serves application traffic, while another is a standby replica. When your active replica has failures, you can scale up the standby replica (if needed), and then reroute user traffic there to restore availability to your application. Readiness checks can help you determine if you need to add capacity to a standby when you've scaled up a primary, for example. However, you should decide whether to fail away from or to a replica based on your monitoring and health check systems, and consider readiness checks as a complementary service to those systems.

If you want to enable faster recoveries, another option is an active-active implementation. With this approach, all of your replicas are active at the same time. This means that you can recover from a failure by rerouting traffic away from the impaired application replica to another active replica, without taking time to scale up first.
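The difference between the two approaches can be illustrated with a minimal sketch. It models routing controls as plain on/off switches in a dictionary; the function names `serving` and `fail_away` and the replica names are hypothetical and are not part of any Route 53 ARC API.

```python
from typing import Dict, List, Optional


def serving(controls: Dict[str, bool]) -> List[str]:
    """Return the replicas whose (modelled) routing control is On."""
    return sorted(name for name, is_on in controls.items() if is_on)


def fail_away(
    controls: Dict[str, bool],
    impaired: str,
    standby: Optional[str] = None,
) -> Dict[str, bool]:
    """Return a new control state with traffic shifted off `impaired`.

    Active-standby: pass `standby` so it is switched On before the
    impaired replica is switched Off.
    Active-active: omit `standby`; the remaining replicas already serve.
    """
    updated = dict(controls)  # never mutate the caller's state
    if standby is not None:
        updated[standby] = True
    updated[impaired] = False
    if not any(updated.values()):
        raise ValueError("refusing to leave no replica in service")
    return updated


# Active-standby: the standby must be brought into service.
active_standby = fail_away(
    {"primary": True, "standby": False}, "primary", standby="standby"
)

# Active-active: rerouting alone is enough.
active_active = fail_away({"cell-east": True, "cell-west": True}, "cell-east")
```

In the active-standby case the sketch skips the scale-up step that the real recovery might need before the standby takes traffic, which is exactly the time an active-active design avoids.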

Route 53 ARC also includes configurable safety rules that you can use to create guard rails for failover operations. Using these rules, you can make sure, for example, that only one of your endpoints (active or standby) is enabled and in service at a time. Or you might limit the number of application replicas in an active-active configuration that are taken offline at once, so that automation can't reduce capacity below a certain level. These rules can help you avoid unintended consequences when you're working with routing control updates.
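As a rough illustration of the second example, the following sketch rejects any routing control update that would leave fewer than a minimum number of replicas in service. It is a simplified model, not the Route 53 ARC implementation: the `AtLeastRule` class, the `apply_update` function, and the cell names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Mapping


@dataclass(frozen=True)
class AtLeastRule:
    """Hypothetical guard rail: at least `threshold` controls must stay On."""

    threshold: int

    def allows(self, proposed: Mapping[str, bool]) -> bool:
        # True counts as 1, so the sum is the number of controls that are On.
        return sum(proposed.values()) >= self.threshold


def apply_update(
    current: Mapping[str, bool],
    changes: Mapping[str, bool],
    rules: Iterable[AtLeastRule],
) -> Dict[str, bool]:
    """Return the new state, or raise if any rule rejects the proposed state."""
    unknown = set(changes) - set(current)
    if unknown:
        raise KeyError(f"unknown routing controls: {sorted(unknown)}")
    proposed = {**current, **changes}
    for rule in rules:
        if not rule.allows(proposed):
            raise ValueError(f"update blocked by {rule}")
    return proposed


cells = {"cell-a": True, "cell-b": True, "cell-c": True}
keep_two = AtLeastRule(threshold=2)

# Taking one of three cells offline still satisfies the rule.
cells = apply_update(cells, {"cell-a": False}, [keep_two])
```

A second call that also turned off `cell-b` would raise `ValueError`, because only one cell would remain in service. Evaluating rules against the proposed state before committing it is what keeps automation from reducing capacity step by step.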

The features in Route 53 ARC help you prepare for and accomplish faster recovery operations for highly available applications running on AWS. Routing controls enable you to rebalance traffic across application replicas during failures, to help keep your application available. Safety rules help protect you from poor outcomes by imposing guard rails that you define. Readiness checks continually monitor AWS resource quotas, capacity, and network routing policies, and can notify you about changes that would affect your ability to fail over to a replica and recover. For example, you can set up EventBridge notifications to let you know when a readiness status changes, or you can view information about the readiness of your application resources and routing configuration in the AWS Management Console. Readiness checks help you ensure that your standby environment is scaled and configured, so you're prepared for failure situations.
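The idea behind a capacity-style readiness check can be sketched as a comparison between the primary and standby replicas. This is only a conceptual model under stated assumptions: the function `check_readiness`, the metric names, and the comparison rule (the standby must match or exceed the primary on every metric) are hypothetical and do not describe how Route 53 ARC evaluates readiness.

```python
from typing import Dict, List, Tuple


def check_readiness(
    primary: Dict[str, int], standby: Dict[str, int]
) -> Tuple[str, List[str]]:
    """Compare standby capacity and quotas against the primary.

    Returns a status string and a list of human-readable findings.
    A metric missing from the standby is treated as 0.
    """
    findings = [
        f"{metric}: standby {standby.get(metric, 0)} < primary {needed}"
        for metric, needed in sorted(primary.items())
        if standby.get(metric, 0) < needed
    ]
    status = "NOT_READY" if findings else "READY"
    return status, findings


# Hypothetical figures: the primary was scaled up, the standby was not.
status, findings = check_readiness(
    primary={"instance_count": 10, "instance_quota": 20},
    standby={"instance_count": 4, "instance_quota": 20},
)
```

A result like this is a prompt to add standby capacity, not a signal to fail over. The decision to fail away from or to a replica still belongs to your monitoring and health check systems.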