Automatic Scaling for Spot Fleet
Automatic scaling is the ability to increase or decrease the target capacity of your Spot fleet automatically based on demand. A Spot fleet can either launch instances (scale out) or terminate instances (scale in), within the range that you choose, in response to one or more scaling policies. We recommend that you create two policies, one for scaling out and one for scaling in.
A scaling policy uses CloudWatch alarms to trigger the scaling process.
For example, if you want to scale out when CPU utilization reaches a certain level, create an alarm using the CPUUtilization metric provided by Amazon EC2.
When you create a scaling policy, you must specify one of the following scaling adjustment types:
Add — Increase the target capacity of the fleet by a specified number of capacity units or a specified percentage of the current capacity.
Remove — Decrease the target capacity of the fleet by a specified number of capacity units or a specified percentage of the current capacity.
Set to — Set the target capacity of the fleet to the specified number of capacity units.
When an alarm is triggered, the auto scaling process calculates the new target capacity using the fulfilled capacity and the scaling policy, and then updates the target capacity accordingly. For example, suppose that the target capacity and fulfilled capacity are 10 and the scaling policy adds 1. When the alarm is triggered, the auto scaling process adds 1 to 10 to get 11, so Spot fleet launches 1 instance.
If you are using instance weighting, keep in mind that Spot fleet can exceed the target capacity as needed, and that fulfilled capacity can be a floating-point number but target capacity must be an integer, so Spot fleet rounds up to the next integer. You must take these behaviors into account when you look at the outcome of a scaling policy when an alarm is triggered. For example, suppose that the target capacity is 30, the fulfilled capacity is 30.1, and the scaling policy subtracts 1. When the alarm is triggered, the auto scaling process subtracts 1 from 30.1 to get 29.1 and then rounds it up to 30, so no scaling action is taken. As another example, suppose that you selected instance weights of 2, 4, and 8, and a target capacity of 10, but no instances of weight 2 were available, so Spot fleet provisioned instances of weights 4 and 8 for a fulfilled capacity of 12. If the scaling policy decreases target capacity by 20% and an alarm is triggered, the auto scaling process subtracts 12*0.20 (2.4) from 12 to get 9.6 and then rounds it up to 10, so no scaling action is taken.
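The arithmetic in the examples above can be sketched in a few lines of Python. This is only an illustration of the calculation as described here, not the service's implementation; the function name and its parameters are hypothetical.

```python
import math


def new_target_capacity(fulfilled, adjustment, amount, percent=False):
    """Return the integer target capacity after one scaling adjustment.

    adjustment: "add", "remove", or "set" (the three adjustment types).
    amount: capacity units, or a percentage of the fulfilled capacity
            when percent=True.
    """
    if adjustment not in ("add", "remove", "set"):
        raise ValueError(f"unknown adjustment type: {adjustment}")
    if adjustment == "set":
        raw = amount
    else:
        delta = fulfilled * amount / 100 if percent else amount
        raw = fulfilled + delta if adjustment == "add" else fulfilled - delta
    # Trim floating-point noise, then round up to the next integer,
    # because target capacity must be an integer.
    return math.ceil(round(raw, 9))


print(new_target_capacity(10, "add", 1))                      # 11
print(new_target_capacity(30.1, "remove", 1))                 # 30
print(new_target_capacity(12, "remove", 20, percent=True))    # 10
```

In the second and third calls the result equals the existing target capacity, which is why no scaling action is taken.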
You can also configure the cooldown period for a scaling policy. This is the number of seconds after a scaling activity completes during which previous trigger-related scaling activities can influence future scaling events. For scale-out policies, while the cooldown period is in effect, the capacity that was added by the previous scale-out event that initiated the cooldown is counted as part of the desired capacity for the next scale-out. The intention is to scale out continuously, but not excessively. For scale-in policies, the cooldown period blocks subsequent scale-in requests until it has expired. The intention is to scale in conservatively to protect your application's availability. However, if another alarm triggers a scale-out policy during the cooldown period after a scale-in, auto scaling scales out your scalable target immediately.
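One way to read the cooldown rules above is sketched below. This is an interpretation of the paragraph rather than a description of the service's internals; both function names and all parameters are hypothetical.

```python
def scale_out_units_to_add(requested, added_by_previous, in_cooldown):
    """Capacity units a scale-out adds, given a possible active cooldown.

    During the cooldown, capacity added by the scale-out that started the
    cooldown counts toward the new request, so only the difference is added.
    """
    if in_cooldown:
        return max(0, requested - added_by_previous)
    return requested


def scale_in_allowed(now, last_scale_in_end, cooldown_seconds):
    """Return True if a scale-in request is not blocked by the cooldown.

    now and last_scale_in_end are timestamps in seconds; last_scale_in_end
    is None if no scale-in has happened yet.
    """
    if last_scale_in_end is None:
        return True
    return now - last_scale_in_end >= cooldown_seconds


# A request to add 3 units, 2 of which were added by the previous scale-out.
print(scale_out_units_to_add(3, 2, in_cooldown=True))    # 1
# A scale-in 100 seconds after the last one, with a 300-second cooldown.
print(scale_in_allowed(100, 0, 300))                     # False
```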
Note that when a Spot fleet terminates an instance because the target capacity was decreased, the instance receives a Spot instance termination notice.
The Spot fleet request must have a request type of maintain. Automatic scaling is not supported for one-time requests or Spot blocks.
Consider which CloudWatch metrics are important to your application. You can create CloudWatch alarms based on metrics provided by AWS or your own custom metrics.
For the AWS metrics that you will use in your scaling policies, enable CloudWatch metrics collection if the service that provides the metrics does not enable it by default.
If you use the AWS Management Console to enable automatic scaling for your Spot fleet, it creates a role named aws-ec2-spot-fleet-autoscale-role that grants Auto Scaling permission to describe the alarms for your policies, monitor the current capacity of the fleet, and modify the capacity of the fleet. If you configure automatic scaling using the AWS CLI or an API, you can use this role if it exists, or manually create your own role for this purpose as follows.
Open the IAM console at https://console.aws.amazon.com/iam/.
In the navigation pane, choose Roles.
Choose Create New Role.
On the Set Role Name page, type a name for the role and then choose Next Step.
On the Select Role Type page, choose Select next to Amazon EC2.
On the Attach Policy page, select the AmazonEC2SpotFleetAutoscaleRole policy and then choose Next Step.
On the Review page, choose Create Role.
Select the role that you just created.
On the Trust Relationships tab, choose Edit Trust Relationship.
Replace ec2.amazonaws.com with application-autoscaling.amazonaws.com and then choose Update Trust Policy.
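After the trust relationship is edited, the role's trust policy should name the Application Auto Scaling service as its principal. The following sketch builds such a document in Python so that you can compare it with what the console shows. The function name is hypothetical, and the statement layout is the standard IAM trust policy form rather than text copied from this guide.

```python
import json


def autoscale_trust_policy():
    """Build a trust policy that lets Application Auto Scaling assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "application-autoscaling.amazonaws.com"
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Print the document as JSON for comparison with the Trust Relationships tab.
print(json.dumps(autoscale_trust_policy(), indent=2))
```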
To create a CloudWatch alarm
Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
In the navigation pane, choose Alarms.
Choose Create Alarm.
For CloudWatch Metrics by Category, choose a category. For example, choose EC2 Spot Metrics, Fleet Request Metrics.
Select a metric, and then choose Next.
For Alarm Threshold, type a name and description for the alarm, and set the threshold value and number of time periods for the alarm.
(Optional) To receive notification of a scaling event, for Actions, choose New list and type your email address. Otherwise, you can delete the notification now and add one later if needed.
Choose Create Alarm.
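If you prefer to script the alarm instead of using the console, the same settings can be collected as parameters for the CloudWatch PutMetricAlarm API. The sketch below only builds the parameter dictionary and does not call AWS. The function name, the alarm naming scheme, and the default values are hypothetical, and it assumes that fleet metrics are published in the AWS/EC2Spot namespace with a FleetRequestId dimension.

```python
def spot_fleet_cpu_alarm_params(fleet_request_id, threshold_percent,
                                periods=2, period_seconds=300):
    """Build PutMetricAlarm parameters for a fleet-wide CPU alarm."""
    return {
        "AlarmName": f"{fleet_request_id}-cpu-high",   # hypothetical naming scheme
        "Namespace": "AWS/EC2Spot",                    # assumed namespace
        "MetricName": "CPUUtilization",
        "Dimensions": [
            {"Name": "FleetRequestId", "Value": fleet_request_id}
        ],
        "Statistic": "Average",
        "Period": period_seconds,           # length of one time period, in seconds
        "EvaluationPeriods": periods,       # number of time periods for the alarm
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


# "sfr-EXAMPLE" is a placeholder, not a real fleet request ID.
params = spot_fleet_cpu_alarm_params("sfr-EXAMPLE", 70)
print(params["AlarmName"], params["Threshold"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as keyword arguments to the CloudWatch client's put_metric_alarm method.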
To configure automatic scaling for your Spot fleet using the console
Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
In the navigation pane, choose Spot Requests.
Select your Spot fleet request, and then choose the Auto Scaling tab.
If automatic scaling is not configured, choose Configure.
Use Scale capacity between to set the minimum and maximum capacity for your fleet. Automatic scaling will not scale your fleet below the minimum capacity or above the maximum capacity.
Initially, Scaling policies contains policies named ScaleUp and ScaleDown. You can complete these policies, or choose Remove policy to delete them. You can also choose Add policy to add a policy.
To define a policy, do the following:
For Policy name, type a name for the policy.
For Policy trigger, select an existing alarm or choose Create new alarm to open the Amazon CloudWatch console and create an alarm.
For Modify capacity, select a scaling adjustment type, select a number, and select a unit.
(Optional) To perform step scaling, choose Define steps. By default, an add policy has a lower bound of -infinity and an upper bound of the alarm threshold. By default, a remove policy has a lower bound of the alarm threshold and an upper bound of +infinity. To add another step, choose Add step.
(Optional) To modify the default value for the cooldown period, select a number from Cooldown period.
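The console settings above correspond roughly to two Application Auto Scaling API calls: RegisterScalableTarget (the capacity range) and PutScalingPolicy (the policy, its steps, and its cooldown). The sketch below builds the parameters for both without calling AWS. The function names, the default cooldown of 300 seconds, and the mapping of Add and Remove onto a single step are illustrative assumptions. Note that the API expresses step bounds as offsets from the alarm threshold, which differs from how the console displays them.

```python
def scalable_target_params(fleet_request_id, min_capacity, max_capacity, role_arn):
    """Parameters for RegisterScalableTarget: the fleet's capacity range."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "ServiceNamespace": "ec2",
        "ResourceId": f"spot-fleet-request/{fleet_request_id}",
        "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
        "RoleARN": role_arn,
    }


def step_policy_params(fleet_request_id, name, adjustment, amount,
                       percent=False, cooldown=300):
    """Parameters for PutScalingPolicy with a single step.

    adjustment: "add" or "remove". A "Set to" policy would use the
    ExactCapacity adjustment type instead and is not covered here.
    """
    if adjustment not in ("add", "remove"):
        raise ValueError("adjustment must be 'add' or 'remove'")
    if adjustment == "add":
        # Applies whenever the metric is at or above the alarm threshold.
        step = {"MetricIntervalLowerBound": 0, "ScalingAdjustment": amount}
    else:
        # Applies whenever the metric is at or below the alarm threshold.
        step = {"MetricIntervalUpperBound": 0, "ScalingAdjustment": -amount}
    return {
        "PolicyName": name,
        "ServiceNamespace": "ec2",
        "ResourceId": f"spot-fleet-request/{fleet_request_id}",
        "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
        "PolicyType": "StepScaling",
        "StepScalingPolicyConfiguration": {
            "AdjustmentType": "PercentChangeInCapacity" if percent else "ChangeInCapacity",
            "StepAdjustments": [step],
            "Cooldown": cooldown,
        },
    }


# "sfr-EXAMPLE" is a placeholder fleet request ID.
policy = step_policy_params("sfr-EXAMPLE", "ScaleDown", "remove", 20, percent=True)
print(policy["StepScalingPolicyConfiguration"]["StepAdjustments"])
```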
To configure automatic scaling for your Spot fleet using the AWS CLI