Balancing an Amazon ECS service across Availability Zones - Amazon Elastic Container Service

To help your applications achieve high availability, we recommend configuring your multi-task services to run across multiple Availability Zones. For services that specify Availability Zone spread as their first placement strategy, AWS makes a best effort to distribute service tasks evenly across the available Availability Zones. However, the number of tasks running in one Availability Zone can differ from the number in other Availability Zones, for example after an Availability Zone disruption. To address this task imbalance, you can enable the Availability Zone rebalancing feature.

With Availability Zone rebalancing, Amazon ECS continuously monitors the distribution of tasks across Availability Zones for each of your services. When Amazon ECS detects an uneven task distribution, it automatically rebalances the workload by launching new tasks in the Availability Zones with the fewest tasks and terminating tasks in the overloaded Availability Zones. This redistribution helps ensure that no single Availability Zone becomes a point of failure, which helps maintain the overall availability of your containerized applications. Because the rebalancing process is automated, it needs no manual intervention and shortens the time to recovery after an event.

The following is an overview of the Availability Zone rebalancing process:

  1. Amazon ECS starts monitoring a service after it reaches the steady state, and looks at the number of tasks running in each Availability Zone.

  2. Amazon ECS performs the following operations when it detects an imbalance in the number of tasks running in each Availability Zone:

    • Sends a service event indicating that Availability Zone rebalancing is starting.

    • Starts tasks in the Availability Zones with the fewest running tasks.

    • Waits for the newly started tasks to be HEALTHY and RUNNING.

    • Stops tasks in the Availability Zones with the most running tasks.

    • Sends a service event with the Availability Zone rebalancing outcome.
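The steps above can be modeled with a short Python sketch. This is an illustrative model only, not the Amazon ECS scheduler: the function name plan_rebalance is hypothetical, and the rule that a service counts as balanced when per-zone task counts differ by at most one is an assumption made for the example. Each planned move pairs the zone where a task starts with the zone where a task stops afterward, mirroring the start-before-stop order described above.

```python
def plan_rebalance(tasks_per_az: dict[str, int]) -> list[tuple[str, str]]:
    """Return (start_in_az, stop_in_az) moves for an imbalanced service.

    Assumption for illustration: the service is "balanced" once the
    busiest and quietest zones differ by at most one running task.
    """
    counts = dict(tasks_per_az)  # work on a copy, leave the input untouched
    moves: list[tuple[str, str]] = []
    if not counts:
        return moves
    while True:
        # Break ties by zone name so the plan is deterministic.
        fewest = min(counts, key=lambda az: (counts[az], az))
        most = max(counts, key=lambda az: (counts[az], az))
        if counts[most] - counts[fewest] <= 1:
            break
        # Start a task in the quietest zone first, then stop one in the
        # busiest zone (the real scheduler waits for HEALTHY and RUNNING).
        counts[fewest] += 1
        counts[most] -= 1
        moves.append((fewest, most))
    return moves


if __name__ == "__main__":
    # Example: one zone holds most of the tasks after a disruption.
    for start_az, stop_az in plan_rebalance(
        {"us-east-1a": 4, "us-east-1b": 1, "us-east-1c": 1}
    ):
        print(f"start in {start_az}, then stop in {stop_az}")
```

The total number of tasks never changes in this model; only their distribution does, which matches the description of rebalancing as a redistribution rather than a scaling action.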

Availability Zone rebalancing supports the Fargate and EC2 launch types. For Fargate, Amazon ECS automatically redistributes tasks across available Availability Zones to maintain balance. For the EC2 launch type, Amazon ECS rebalances tasks across existing container instances on a best-effort basis, respecting your defined placement strategies and constraints. However, Amazon ECS cannot provision new instances in underutilized Availability Zones as part of the rebalancing process, so rebalancing is limited to existing container instances.

Availability Zone rebalancing works in the following configurations:

  • Services that use the Replica strategy

  • Services that specify Availability Zone spread as the first task placement strategy, or that don't specify a placement strategy

You can't use Availability Zone rebalancing with a service that does any of the following:

  • Uses the Daemon strategy

  • Uses the EXTERNAL launch type (ECS Anywhere)

  • Uses 100% for the maximumPercent value

  • Uses a Classic Load Balancer

  • Uses the attribute:ecs.availability-zone as a task placement constraint
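As a rough pre-flight check, the exclusions above can be tested against a service definition before you try to turn the feature on. The following Python sketch is hypothetical: the function name rebalancing_blockers is invented for illustration, and it assumes the service is described by a dictionary shaped like the Amazon ECS service definition (schedulingStrategy, launchType, deploymentConfiguration, loadBalancers, placementConstraints). It also assumes that a load balancer entry carrying loadBalancerName rather than a target group refers to a Classic Load Balancer.

```python
AZ_ATTRIBUTE = "attribute:ecs.availability-zone"


def rebalancing_blockers(service: dict) -> list[str]:
    """List the reasons a service definition can't use AZ rebalancing.

    Hypothetical helper: it only mirrors the exclusion list in this
    section and is not an official validation routine.
    """
    blockers = []
    if service.get("schedulingStrategy", "REPLICA") == "DAEMON":
        blockers.append("uses the Daemon strategy")
    if service.get("launchType") == "EXTERNAL":
        blockers.append("uses the EXTERNAL launch type (ECS Anywhere)")
    deployment = service.get("deploymentConfiguration", {})
    if deployment.get("maximumPercent") == 100:
        blockers.append("uses 100% for the maximumPercent value")
    # Assumption: Classic Load Balancers are referenced by name.
    if any("loadBalancerName" in lb for lb in service.get("loadBalancers", [])):
        blockers.append("uses a Classic Load Balancer")
    if any(
        AZ_ATTRIBUTE in constraint.get("expression", "")
        for constraint in service.get("placementConstraints", [])
    ):
        blockers.append(f"uses {AZ_ATTRIBUTE} as a task placement constraint")
    return blockers


if __name__ == "__main__":
    example = {
        "schedulingStrategy": "REPLICA",
        "launchType": "FARGATE",
        "deploymentConfiguration": {"maximumPercent": 100},
    }
    print(rebalancing_blockers(example))
```

An empty list means that none of the listed exclusions apply; it does not guarantee that the service accepts the setting.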

Placement strategies and placement constraints with Availability Zone rebalancing

Placement strategies determine how Amazon ECS selects container instances and Availability Zones for task placement and termination. Task placement constraints are rules that determine whether a task is allowed to run on a specific container instance. For the EC2 launch type, you can use placement strategies and placement constraints in conjunction with Availability Zone rebalancing.

Availability Zone rebalancing is compatible with various placement strategy combinations, provided that the Availability Zone spread strategy is the first entry in the placement strategy array. For example, you can create a strategy that first distributes tasks evenly across Availability Zones, and then bin packs tasks based on memory within each Availability Zone. Availability Zone rebalancing won't work if the first strategy in the array is anything other than Availability Zone spread. This requirement keeps the primary focus of task distribution on maintaining balance across Availability Zones, which is crucial for high availability.

For more information about task placement strategies and constraints, see How Amazon ECS places tasks on container instances.

The following example strategy distributes tasks evenly across Availability Zones, and then bin packs tasks based on memory within each Availability Zone. Availability Zone rebalancing is compatible with the service because the spread strategy is first.

"placementStrategy": [
    {
        "field": "attribute:ecs.availability-zone",
        "type": "spread"
    },
    {
        "field": "memory",
        "type": "binpack"
    }
]
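The ordering requirement can be checked mechanically. The following Python sketch uses a hypothetical helper named az_spread_is_first, and assumes the strategy list has the same shape as the placementStrategy array above. It treats an empty list as compatible, because services that don't specify a placement strategy are listed as supported, and it compares the field value exactly, which is a simplification.

```python
AZ_ATTRIBUTE = "attribute:ecs.availability-zone"


def az_spread_is_first(placement_strategy: list[dict]) -> bool:
    """Return True if the strategy array is compatible with AZ rebalancing.

    Hypothetical helper for illustration only.
    """
    if not placement_strategy:
        # No placement strategy specified: listed as a supported configuration.
        return True
    first = placement_strategy[0]
    return first.get("type") == "spread" and first.get("field") == AZ_ATTRIBUTE


if __name__ == "__main__":
    strategy = [
        {"field": "attribute:ecs.availability-zone", "type": "spread"},
        {"field": "memory", "type": "binpack"},
    ]
    print(az_spread_is_first(strategy))                   # spread first
    print(az_spread_is_first(list(reversed(strategy))))   # binpack first
```

Only the first entry matters for this check; any strategies that follow the Availability Zone spread entry are left to your own placement needs.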

Turn on Availability Zone rebalancing

You need to enable Availability Zone rebalancing on each service where you want it, whether the service is new or already exists.

You can enable and disable Availability Zone rebalancing using the console, APIs, or the AWS CLI.
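For the API route, the following is a minimal Python sketch that assumes the AWS SDK for Python (Boto3) and the availabilityZoneRebalancing parameter of the UpdateService operation, which accepts ENABLED or DISABLED. The function name set_az_rebalancing and the cluster and service names in the usage comment are hypothetical. The client is passed in as an argument so that the function can be exercised without AWS credentials.

```python
def set_az_rebalancing(ecs_client, cluster: str, service: str, enabled: bool = True) -> dict:
    """Turn Availability Zone rebalancing on or off for an existing service.

    ecs_client is expected to behave like boto3.client("ecs"); the helper
    itself is a hypothetical wrapper, not part of any AWS library.
    """
    return ecs_client.update_service(
        cluster=cluster,
        service=service,
        availabilityZoneRebalancing="ENABLED" if enabled else "DISABLED",
    )


# Usage sketch (requires boto3 and AWS credentials; names are placeholders):
#
#   import boto3
#   ecs = boto3.client("ecs")
#   set_az_rebalancing(ecs, "my-cluster", "my-service")          # enable
#   set_az_rebalancing(ecs, "my-cluster", "my-service", False)   # disable
```

With the AWS CLI, the equivalent setting is expected to be the --availability-zone-rebalancing option of aws ecs update-service; check the CLI reference for your installed version before relying on it.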