Target tracking scaling policies for Amazon EC2 Auto Scaling - Amazon EC2 Auto Scaling

Target tracking scaling policies for Amazon EC2 Auto Scaling

A target tracking scaling policy automatically scales the capacity of your Auto Scaling group based on a target metric value. This allows your application to maintain optimal performance and cost efficiency without manual intervention.

With target tracking, you select a metric and a target value to represent the ideal average utilization or throughput level for your application. Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms that trigger scaling events when the metric deviates from the target. This is similar to how a thermostat maintains a target temperature.

For example, let's say that you currently have an application that runs on two instances, and you want the CPU utilization of the Auto Scaling group to stay at around 50 percent when the load on the application changes. This gives you extra capacity to handle traffic spikes without maintaining an excessive number of idle resources.

You can meet this need by creating a target tracking scaling policy that targets an average CPU utilization of 50 percent. Then, your Auto Scaling group will scale out (increase capacity) when CPU exceeds 50 percent to handle increased load. It will scale in (decrease capacity) when CPU drops below 50 percent to optimize costs during periods of low utilization.

Multiple target tracking scaling policies

To help optimize scaling performance, you can use multiple target tracking scaling policies together, provided that each of them uses a different metric. For example, utilization and throughput can influence each other. Whenever one of these metrics changes, it usually implies that other metrics will also be impacted. The use of multiple metrics therefore provides additional information about the load that your Auto Scaling group is under and improves decision making when determining how much capacity to add to your group.

The intention of Amazon EC2 Auto Scaling is to always prioritize availability, so its behavior differs depending on whether the target tracking policies are ready for scale out or scale in. It will scale out the Auto Scaling group if any of the target tracking policies are ready for scale out, but will scale in only if all of the target tracking policies (with the scale-in portion enabled) are ready to scale in.

Choose metrics

You can create target tracking scaling policies with either predefined metrics or custom metrics.

When you create a target tracking scaling policy with a predefined metric type, you choose one metric from the following list of predefined metrics:

  • ASGAverageCPUUtilization—Average CPU utilization of the Auto Scaling group.

  • ASGAverageNetworkIn—Average number of bytes received by a single instance on all network interfaces.

  • ASGAverageNetworkOut—Average number of bytes sent out from a single instance on all network interfaces.

  • ALBRequestCountPerTarget—Average Application Load Balancer request count per target.

Important

Other valuable information about the metrics for CPU utilization, network I/O, and Application Load Balancer request count per target can be found in the List the available CloudWatch metrics for your instances topic in the Amazon EC2 User Guide for Linux Instances and the CloudWatch metrics for your Application Load Balancer topic in the User Guide for Application Load Balancers, respectively.

You can choose other available CloudWatch metrics or your own metrics in CloudWatch by specifying a custom metric. You must use the AWS CLI or an SDK to create a target tracking policy with a customized metric specification. For an example that specifies a customized metric specification for a target tracking scaling policy using the AWS CLI, see Example scaling policies for the AWS Command Line Interface (AWS CLI).

Keep the following in mind when choosing a metric:

  • We recommend that you only use metrics that are available at one-minute intervals to help you scale faster in response to utilization changes. Target tracking will evaluate metrics aggregated at a one-minute granularity for all predefined metrics and custom metrics, but the underlying metric might publish data less frequently. For example, all Amazon EC2 metrics are sent in five-minute intervals by default, but they are configurable to one minute (known as detailed monitoring). This choice is up to the individual services. Most try to use the smallest interval possible. For information about enabling detailed monitoring, see Configure monitoring for Auto Scaling instances.

  • Not all custom metrics work for target tracking. The metric must be a valid utilization metric and describe how busy an instance is. The metric value must increase or decrease proportionally to the number of instances in the Auto Scaling group. That's so the metric data can be used to proportionally scale out or in the number of instances. For example, the CPU utilization of an Auto Scaling group works (that is, the Amazon EC2 metric CPUUtilization with the metric dimension AutoScalingGroupName), if the load on the Auto Scaling group is distributed across the instances.

  • The following metrics do not work for target tracking:

    • The number of requests received by the load balancer fronting the Auto Scaling group (that is, the Elastic Load Balancing metric RequestCount). The number of requests received by the load balancer doesn't change based on the utilization of the Auto Scaling group.

    • Load balancer request latency (that is, the Elastic Load Balancing metric Latency). Request latency can increase based on increasing utilization, but doesn't necessarily change proportionally.

    • The CloudWatch Amazon SQS queue metric ApproximateNumberOfMessagesVisible. The number of messages in a queue might not change proportionally to the size of the Auto Scaling group that processes messages from the queue. However, a custom metric that measures the number of messages in the queue per EC2 instance in the Auto Scaling group can work. For more information, see Scaling based on Amazon SQS.

  • To use the ALBRequestCountPerTarget metric, you must specify the ResourceLabel parameter to identify the load balancer target group that is associated with the metric. For an example that specifies the ResourceLabel parameter for a target tracking scaling policy using the AWS CLI, see Example scaling policies for the AWS Command Line Interface (AWS CLI).

  • When a metric emits real 0 values to CloudWatch (for example, ALBRequestCountPerTarget), an Auto Scaling group can scale in to 0 when there is no traffic to your application for a sustained period of time. To have your Auto Scaling group scale in to 0 when no requests are routed it, the group's minimum capacity must be set to 0.

  • Instead of publishing new metrics to use in your scaling policy, you can use metric math to combine existing metrics. For more information, see Create a target tracking scaling policy for Amazon EC2 Auto Scaling using metric math.

Define target value

When you create a target tracking scaling policy, you must specify a target value. The target value represents the optimal average utilization or throughput for the Auto Scaling group. To use resources cost efficiently, set the target value as high as possible with a reasonable buffer for unexpected traffic increases. When your application is optimally scaled out for a normal traffic flow, the actual metric value should be at or just below the target value.

When a scaling policy is based on throughput, such as the request count per target for an Application Load Balancer, network I/O, or other count metrics, the target value represents the optimal average throughput from a single instance, for a one-minute period.

Define instance warm-up time

You can optionally specify the number of seconds that it takes for a newly launched instance to warm up. Until its specified warm-up time has expired, an instance is not counted toward the aggregated EC2 instance metrics of the Auto Scaling group.

While instances are in the warm-up period, your scaling policies only scale out if the metric value from instances that are not warming up is greater than the policy's target utilization.

If the group scales out again, the instances that are still warming up are counted as part of the desired capacity for the next scale-out activity. The intention is to continuously (but not excessively) scale out.

While the scale-out activity is in progress, all scale-in activities initiated by scaling policies are blocked until the instances finish warming up. When the instances finish warming up, if a scale-in event occurs, any instances currently in the process of terminating will be counted towards the current capacity of the group when calculating the new desired capacity. Therefore, we don't remove more instances from the Auto Scaling group than necessary.

Default value

If no value is set, then the scaling policy will use the default value, which is the value for the default instance warmup defined for the group. If the default instance warmup is null, then it falls back to the value of the default cooldown. We recommend using the default instance warmup to make it easier to update all scaling policies when the warm-up time changes.

Considerations

The following considerations apply when working with target tracking scaling policies:

  • Do not create, edit, or delete the CloudWatch alarms that are used with a target tracking scaling policy. Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms that are associated with your target tracking scaling policies and deletes them when no longer needed.

  • A target tracking scaling policy prioritizes availability during periods of fluctuating traffic levels by scaling in more gradually when traffic is decreasing. If you want your Auto Scaling group to scale in immediately when a workload finishes, you can disable the scale-in portion of the policy. This provides you with the flexibility to use the scale-in method that best meets your needs when utilization is low. To ensure that scale in happens as quickly as possible, we recommend not using a simple scaling policy to prevent a cooldown period from being added.

  • If the metric is missing data points, this causes the CloudWatch alarm state to change to INSUFFICIENT_DATA. When this happens, Amazon EC2 Auto Scaling cannot scale your group until new data points are found.

  • If the metric is sparsely reported by design, metric math can be helpful. For example, to use the most recent values, then use the FILL(m1,REPEAT) function where m1 is the metric.

  • You might see gaps between the target value and the actual metric data points. This is because we act conservatively by rounding up or down when determining how many instances to add or remove. This prevents us from adding an insufficient number of instances or removing too many instances. However, for smaller Auto Scaling groups with fewer instances, the utilization of the group might seem far from the target value. For example, let's say that you set a target value of 50 percent for CPU utilization and your Auto Scaling group then exceeds the target. We might determine that adding 1.5 instances will decrease the CPU utilization to close to 50 percent. Because it is not possible to add 1.5 instances, we round up and add two instances. This might decrease the CPU utilization to a value below 50 percent, but it ensures that your application has enough resources to support it. Similarly, if we determine that removing 1.5 instances increases your CPU utilization to above 50 percent, we remove just one instance.

    For larger Auto Scaling groups with more instances, the utilization is spread over a larger number of instances, in which case adding or removing instances causes less of a gap between the target value and the actual metric data points.

  • A target tracking scaling policy assumes that it should scale out your Auto Scaling group when the specified metric is above the target value. You can't use a target tracking scaling policy to scale out your Auto Scaling group when the specified metric is below the target value.

Create a target tracking scaling policy

To create a target tracking scaling policy for your Auto Scaling group, use one of the following methods.

Before you begin, confirm that your preferred metric is available at 1-minute intervals (compared to the default 5-minute interval of Amazon EC2 metrics).

Console
To create a target tracking scaling policy for a new Auto Scaling group
  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/, and choose Auto Scaling Groups from the navigation pane.

  2. Choose Create Auto Scaling group.

  3. In Steps 1, 2, and 3, choose the options as desired and proceed to Step 4: Configure group size and scaling policies.

  4. Under Scaling, specify the range that you want to scale between by updating the Min desired capacity and Max desired capacity. These two settings allow your Auto Scaling group to scale dynamically. For more information, see Set scaling limits for your Auto Scaling group.

  5. Under Automatic scaling, choose Target tracking scaling policy.

  6. To define a policy, do the following:

    1. Specify a name for the policy.

    2. For Metric type, choose a metric.

      If you chose Application Load Balancer request count per target, choose a target group in Target group.

    3. Specify a Target value for the metric.

    4. (Optional) For Instance warmup, update the instance warm-up value as needed.

    5. (Optional) Select Disable scale in to create only a scale-out policy. This allows you to create a separate scale-in policy of a different type if wanted.

  7. Proceed to create the Auto Scaling group. Your scaling policy will be created after the Auto Scaling group has been created.

To create a target tracking scaling policy for an existing Auto Scaling group
  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/, and choose Auto Scaling Groups from the navigation pane.

  2. Select the check box next to your Auto Scaling group.

    A split pane opens up in the bottom of the page.

  3. Verify that the scaling limits are appropriately set. For example, if your group's desired capacity is already at its maximum, you need to specify a new maximum in order to scale out. For more information, see Set scaling limits for your Auto Scaling group.

  4. On the Automatic scaling tab, in Dynamic scaling policies, choose Create dynamic scaling policy.

  5. To define a policy, do the following:

    1. For Policy type, keep the default of Target tracking scaling.

    2. Specify a name for the policy.

    3. For Metric type, choose a metric. You can choose only one metric type. To use more than one metric, create multiple policies.

      If you chose Application Load Balancer request count per target, choose a target group in Target group.

    4. Specify a Target value for the metric.

    5. (Optional) For Instance warmup, update the instance warm-up value as needed.

    6. (Optional) Select Disable scale in to create only a scale-out policy. This allows you to create a separate scale-in policy of a different type if wanted.

  6. Choose Create.

AWS CLI

To create a target tracking scaling policy, you can use the following example to help you get started. Replace each user input placeholder with your own information.

To create a target tracking scaling policy (AWS CLI)
  1. Use the following cat command to store a target value for your scaling policy and a predefined metric specification in a JSON file named config.json in your home directory. The following is an example target tracking configuration that keeps the average CPU utilization at 50 percent.

    $ cat ~/config.json { "TargetValue": 50.0, "PredefinedMetricSpecification": { "PredefinedMetricType": "ASGAverageCPUUtilization" } }

    For more information, see PredefinedMetricSpecification in the Amazon EC2 Auto Scaling API Reference.

  2. Use the put-scaling-policy command, along with the config.json file that you created in the previous step, to create your scaling policy.

    aws autoscaling put-scaling-policy --policy-name cpu50-target-tracking-scaling-policy \ --auto-scaling-group-name my-asg --policy-type TargetTrackingScaling \ --target-tracking-configuration file://config.json

    If successful, this command returns the ARNs and names of the two CloudWatch alarms created on your behalf.

    { "PolicyARN": "arn:aws:autoscaling:us-west-2:123456789012:scalingPolicy:228f02c2-c665-4bfd-aaac-8b04080bea3c:autoScalingGroupName/my-asg:policyName/cpu50-target-tracking-scaling-policy", "Alarms": [ { "AlarmARN": "arn:aws:cloudwatch:us-west-2:123456789012:alarm:TargetTracking-my-asg-AlarmHigh-fc0e4183-23ac-497e-9992-691c9980c38e", "AlarmName": "TargetTracking-my-asg-AlarmHigh-fc0e4183-23ac-497e-9992-691c9980c38e" }, { "AlarmARN": "arn:aws:cloudwatch:us-west-2:123456789012:alarm:TargetTracking-my-asg-AlarmLow-61a39305-ed0c-47af-bd9e-471a352ee1a2", "AlarmName": "TargetTracking-my-asg-AlarmLow-61a39305-ed0c-47af-bd9e-471a352ee1a2" } ] }