Zonal autoshift in ARC
With zonal autoshift, you authorize AWS to shift away resource traffic for an application from an Availability Zone (AZ) during events, on your behalf, to help reduce time to recovery. AWS starts an autoshift when internal telemetry indicates that there is an Availability Zone impairment that could potentially impact customers. When AWS starts an autoshift, application traffic to resources that you've configured for zonal autoshift starts shifting away from the Availability Zone.
Be aware that ARC does not inspect the health of individual resources. Because an autoshift is triggered by Availability Zone-level telemetry rather than by the state of your own resources, traffic might in some cases be shifted away for resources that are not experiencing impact.
With zonal autoshift, you also authorize AWS to shift away resource traffic for an application from an Availability Zone, on your behalf, for regular practice runs. Practice runs are required for zonal autoshift. The zonal shifts that ARC starts for practice runs help you to ensure that shifting away traffic from an Availability Zone during an autoshift is safe for your application. Practice runs regularly test that your application can operate normally without one Availability Zone by starting zonal shifts that shift traffic for a resource away from an Availability Zone. Practice runs take place weekly, and provide an outcome, such as SUCCEEDED or FAILED, to help you understand if the application operates as expected.
Important
Before you configure practice runs or enable zonal autoshift, we strongly recommend that you pre-scale your application resource capacity in all Availability Zones in the Region where your application resources are deployed. You should not rely on scaling on demand when an autoshift or practice run starts. Zonal autoshift, including practice runs, works independently, and does not wait for auto scaling actions to complete. Relying on auto scaling instead of pre-scaling can lengthen the time that your application takes to recover.
If you use auto scaling to handle regular cycles of traffic, we strongly recommend that you configure the minimum capacity of your auto scaling to continue operating normally with the loss of an Availability Zone.
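The pre-scaling recommendation comes down to simple arithmetic: with one Availability Zone removed, the remaining zones must carry the full peak load. The following is a minimal sketch of that calculation, assuming capacity is counted in interchangeable units (for example, instances) and spread evenly across zones. The function name and inputs are illustrative and are not part of any AWS API.

```python
import math


def min_capacity_for_az_loss(peak_units: int, az_count: int) -> dict:
    """Return the capacity needed to serve peak load with one AZ removed.

    peak_units: capacity units needed to serve peak traffic (hypothetical input).
    az_count:   number of Availability Zones the application is deployed in.
    """
    if az_count < 2:
        raise ValueError("need at least two Availability Zones")
    if peak_units < 0:
        raise ValueError("peak_units must be non-negative")
    # After one AZ is shifted away, (az_count - 1) zones share the peak load.
    per_az = math.ceil(peak_units / (az_count - 1))
    return {"per_az": per_az, "total": per_az * az_count}


print(min_capacity_for_az_loss(12, 3))  # {'per_az': 6, 'total': 18}
```

In this example, an application that needs 12 units at peak in three zones would be pre-scaled to 6 units per zone (18 in total), rather than 4 per zone, so that any two zones can carry the load alone.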
If you plan to enable zonal autoshift or configure practice runs, after you pre-scale your application resource capacity, test that your application can operate normally without one Availability Zone. To test this, start a zonal shift to move traffic for a resource away from an Availability Zone.
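As a rough illustration of such a test, the sketch below starts a temporary zonal shift through the ARC zonal shift API. It assumes that `client` is a boto3 client created with `boto3.client("arc-zonal-shift")`. The resource ARN and Availability Zone ID are placeholders that you supply, and the 30-minute expiry is an arbitrary choice for the example.

```python
def start_test_shift(client, resource_arn, away_from_az_id, expires_in="30m"):
    """Start a temporary zonal shift to test operating without one AZ.

    client:          assumed to be boto3.client("arc-zonal-shift").
    resource_arn:    identifier of a resource that supports zonal shift (placeholder).
    away_from_az_id: AZ ID to shift traffic away from, for example "use1-az1".
    expires_in:      how long the shift lasts, for example "30m" or "2h".
    """
    return client.start_zonal_shift(
        resourceIdentifier=resource_arn,
        awayFrom=away_from_az_id,
        expiresIn=expires_in,
        comment="Capacity test before enabling zonal autoshift",
    )


# Example call (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# shift = start_test_shift(boto3.client("arc-zonal-shift"), "<resource-arn>", "use1-az1")
```

The client is passed in as a parameter so that the function can be exercised with a stub before it is pointed at a real account.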
To ensure your tests with zonal shift are effective, it's important to validate that traffic drains as expected from the AZ you shift away from. For example, both Application Load Balancers and Network Load Balancers provide per-AZ metrics in Amazon CloudWatch that you can use to monitor this. Depending on how long a service and its clients reuse connections, traffic might continue to flow to the AZ that you have shifted away from for longer than you expect. To learn more, see Limit the time that clients stay connected to your endpoints.
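One way to make the drain check concrete is to compare the request counts for the shifted-away AZ against its level before the shift. The sketch below works on a plain list of samples. It assumes that you have already retrieved a per-AZ series from CloudWatch, for example the Application Load Balancer `RequestCount` metric filtered by the `LoadBalancer` and `AvailabilityZone` dimensions. The 5 percent threshold and three-sample settle window are arbitrary values chosen for illustration.

```python
def has_drained(request_counts, baseline, threshold=0.05, settle_points=3):
    """Report whether traffic to the shifted-away AZ has drained.

    request_counts: per-period request counts for the AZ, oldest first.
    baseline:       typical per-period count for that AZ before the shift.
    threshold:      fraction of baseline still treated as "drained" (illustrative).
    settle_points:  number of trailing samples that must all be below the limit.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if len(request_counts) < settle_points:
        return False  # not enough data to decide yet
    limit = threshold * baseline
    return all(count <= limit for count in request_counts[-settle_points:])


print(has_drained([1000, 400, 120, 30, 10, 4], baseline=1000))  # True
print(has_drained([1000, 900, 800, 700], baseline=1000))        # False
```

If the second pattern persists well into the shift, long-lived client connections are a likely cause.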
After you verify, by starting and evaluating a zonal shift, that your application can continue operating normally with traffic shifted away from an Availability Zone, the regular practice runs that ARC performs help you to confirm, on an ongoing basis, that you have enough capacity for an autoshift.
As an alternative to enabling zonal autoshift for a supported resource in the ARC console, you can enable zonal autoshift for a specific load balancer in the Amazon EC2 console. To learn more about enabling zonal autoshift with Elastic Load Balancing, see Zonal shift in the Elastic Load Balancing User Guide.
Autoshifts and practice run zonal shifts are temporary. With autoshifts, when the affected Availability Zone recovers, AWS stops shifting traffic for resources away from the Availability Zone. Application traffic for customers returns to all Availability Zones in the Region. With a practice run, traffic is shifted away from an Availability Zone for a single resource for about 30 minutes, and then shifted back to all Availability Zones in the Region.
You can configure Amazon EventBridge notifications to alert you about autoshifts and practice runs. For more information, see Using zonal autoshift with Amazon EventBridge.
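As an illustration, an EventBridge rule needs an event pattern that selects the events to forward. The sketch below builds such a pattern as JSON. The source value `aws.arc-zonal-shift` and the example detail type are assumptions based on how AWS services usually name their events, so verify them against the EventBridge topic for zonal autoshift before you use them.

```python
import json


def autoshift_event_pattern(detail_types=None):
    """Build an EventBridge event pattern for zonal autoshift events.

    detail_types: optional list of detail-type strings to match. With no
    list, the pattern matches every event from the assumed source.
    """
    # Assumed event source name; confirm it in the EventBridge documentation.
    pattern = {"source": ["aws.arc-zonal-shift"]}
    if detail_types:
        pattern["detail-type"] = list(detail_types)
    return json.dumps(pattern, indent=2)


# The detail type below is an assumed example value.
print(autoshift_event_pattern(["Autoshift In Progress"]))
```

The resulting string can be supplied as the event pattern when you create a rule, with a target such as an Amazon SNS topic to deliver the alert.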
About zonal autoshift
Zonal autoshift is a capability where AWS shifts application resource traffic away from an Availability Zone, on your behalf. AWS starts an autoshift when internal telemetry indicates that there is an Availability Zone impairment that could potentially impact customers. The internal telemetry incorporates metrics from several sources, including the AWS network, and the Amazon EC2 and Elastic Load Balancing services.
You must manually enable zonal autoshift for supported AWS resources.
When you run applications behind load balancers in multiple (typically three) AZs in a Region, and you pre-scale to support static stability, AWS can quickly recover your application from an AZ impairment by shifting traffic away with an autoshift. By shifting resource traffic to other AZs in the Region, AWS can reduce the duration and severity of potential impact caused by power outages, hardware or software issues in an AZ, or other impairments.
ARC's supported resources provide integrations that mark the specified AZ as unhealthy, which results in traffic shifting away from the impaired AZ.
When you enable zonal autoshift for a resource, you must also configure a practice run for the resource. AWS performs practice runs about once a week, for about 30 minutes each, to help you make sure that you have enough capacity to run your application without one of the Availability Zones in the Region.
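The two-step setup, a practice run configuration followed by enabling autoshift, can be sketched with the ARC zonal shift API as follows. The sketch assumes that `client` is `boto3.client("arc-zonal-shift")` and that you already have a CloudWatch alarm to act as the outcome alarm for practice runs. Both ARNs are placeholders, and optional settings such as blocking alarms and blocked windows are omitted.

```python
def enable_autoshift(client, resource_arn, outcome_alarm_arn):
    """Configure a practice run, then enable zonal autoshift for a resource.

    client:            assumed to be boto3.client("arc-zonal-shift").
    resource_arn:      identifier of the supported resource (placeholder).
    outcome_alarm_arn: CloudWatch alarm monitored during practice runs (placeholder).
    """
    # Step 1: the practice run configuration must exist first.
    client.create_practice_run_configuration(
        resourceIdentifier=resource_arn,
        outcomeAlarms=[{"type": "CLOUDWATCH", "alarmIdentifier": outcome_alarm_arn}],
    )
    # Step 2: turn on autoshift for the same resource.
    return client.update_zonal_autoshift_configuration(
        resourceIdentifier=resource_arn,
        zonalAutoshiftStatus="ENABLED",
    )


# Example call (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# enable_autoshift(boto3.client("arc-zonal-shift"), "<resource-arn>", "<alarm-arn>")
```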
As with zonal shift, there are a few specific scenarios where zonal autoshift does not shift traffic away from the AZ. For example, if the load balancer target groups in the AZs don't have any instances, or if all of the instances are unhealthy, then the load balancer is in a fail open state and you can't shift away one of the AZs.
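The fail-open condition described above can be expressed as a simple check. The sketch below is a simplified model and does not describe how Elastic Load Balancing evaluates health internally. It assumes that you represent each AZ as a list of booleans, one per registered target, where `True` means healthy. The function name and data shape are hypothetical.

```python
def is_fail_open(target_health_by_az):
    """Return True when no AZ has a healthy target.

    target_health_by_az maps an AZ ID to a list of booleans, one per
    registered target (True = healthy). An empty list means that the AZ
    has no registered targets.
    """
    return not any(any(states) for states in target_health_by_az.values())


print(is_fail_open({"use1-az1": [], "use1-az2": []}))                  # True
print(is_fail_open({"use1-az1": [False], "use1-az2": [False, False]})) # True
print(is_fail_open({"use1-az1": [True], "use1-az2": []}))              # False
```

A check like this, run against your own health data before a test shift, can flag the cases where a shift would have no effect.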
To learn more about zonal autoshift, see Zonal autoshift in ARC.