Spot Instances - Amazon Elastic Container Service

Spot Instances

Spot capacity can provide significant cost savings over on-demand instances. Spot capacity is excess capacity that's priced significantly lower than on-demand or reserved capacity. Spot capacity is suitable for batch processing and machine-learning workloads, and development and staging environments. More generally, it's suitable for any workload that tolerates temporary downtime.

Understand that the following consequences because Spot capacity might not be available all the time.

  • During periods of extremely high demand, Spot capacity might be unavailable. This can cause Amazon EC2 Spot instance launches to be delayed. In these events, Amazon ECS services retry launching tasks, and Amazon EC2 Auto Scaling groups also retry launching instances, until the required capacity becomes available. Amazon EC2 doesn't replace Spot capacity with on-demand capacity.

  • When the overall demand for capacity increases, Spot instances and tasks might be terminated with only a two-minute warning. After the warning is sent, tasks should begin an orderly shutdown if necessary before the instance is fully terminated. This helps minimize the possibility of errors. For more information about a graceful shutdown, see Graceful shutdowns with ECS.

To help minimize Spot capacity shortages, consider the following recommendations:

  • Use multiple Regions and Availability Zones - Spot capacity varies by Region and Availability Zone. You can improve Spot availability by running your workloads in multiple Regions and Availability Zones. If possible, specify subnets in all the Availability Zones in the Regions where you run your tasks and instances.

  • Use multiple Amazon EC2 instance types - When you use Mixed Instance Policies with Amazon EC2 Auto Scaling, multiple instance types are launched into your Auto Scaling Group. This ensures that a request for Spot capacity can be fulfilled when needed. To maximize reliability and minimize complexity, use instance types with roughly the same amount of CPU and memory in your Mixed Instances Policy. These instances can be from a different generation, or variants of the same base instance type. Note that they might come with additional features that you might not require. An example of such a list could include m4.large, m5.large, m5a.large, m5d.large, m5n.large, m5dn.large, and m5ad.large. For more information, see Auto Scaling groups with multiple instance types and purchase options in the Amazon EC2 Auto Scaling User Guide.

  • Use the capacity-optimized Spot allocation strategy - With Amazon EC2 Spot, you can choose between the capacity- and cost-optimized allocation strategies. If you choose the capacity-optimized strategy when launching a new instance, Amazon EC2 Spot selects the instance type with the greatest availability in the selected Availability Zone. This helps reduce the possibility that the instance is terminated soon after it launches.

Linux Spot Instance draining

Amazon EC2 terminates, stops, or hibernates your Spot Instance when the Spot price exceeds the maximum price for your request or capacity is no longer available. Amazon EC2 provides a Spot Instance two-minute interruption notice for terminate and stop actions. It does not provide the two-minute notice for the hibernate action. If Amazon ECS Spot Instance draining is enabled on the instance, ECS receives the Spot Instance interruption notice and places the instance in DRAINING status.

Important

Amazon ECS does not receive a notice from Amazon EC2 when instances are removed by Auto Scaling Capacity Rebalancing. For more information, see Amazon EC2 Auto Scaling Capacity Rebalancing.

When a container instance is set to DRAINING, Amazon ECS prevents new tasks from being scheduled for placement on the container instance. Service tasks on the draining container instance that are in the PENDING state are stopped immediately. If there are container instances in the cluster that are available, replacement service tasks are started on them.

Spot Instance draining is turned off by default and must be manually enabled. To enable Spot Instance draining for a new container instance, when launching the container instance add the following script into the User data field, replacing MyCluster with the name of the cluster to register the container instance to.

#!/bin/bash cat <<'EOF' >> /etc/ecs/ecs.config ECS_CLUSTER=MyCluster ECS_ENABLE_SPOT_INSTANCE_DRAINING=true EOF

For more information, see Launching an Amazon ECS Linux container instance.

To turn on Spot Instance draining for an existing container instance
  1. Connect to the Spot Instance over SSH.

  2. Edit the /etc/ecs/ecs.config file and add the following:

    ECS_ENABLE_SPOT_INSTANCE_DRAINING=true
  3. Restart the ecs service.

    • For the Amazon ECS-optimized Amazon Linux 2 AMI:

      sudo systemctl restart ecs
    • For the Amazon ECS-optimized Amazon Linux AMI:

      sudo stop ecs && sudo start ecs
  4. (Optional) You can verify that the agent is running and see some information about your new container instance by querying the agent introspection API operation. For more information, see Container introspection.

    curl http://localhost:51678/v1/metadata

Windows Spot Instance draining

Amazon EC2 terminates, stops, or hibernates your Spot Instance when the Spot price exceeds the maximum price for your request or capacity is no longer available. Amazon EC2 provides a Spot Instance interruption notice, which gives the instance a two-minute warning before it is interrupted. If Amazon ECS Spot Instance draining is enabled on the instance, ECS receives the Spot Instance interruption notice and places the instance in DRAINING status.

Important

Amazon ECS monitors for the Spot Instance interruption notices that have the terminate and stop instance-actions. If you specified either the hibernate instance interruption behavior when requesting your Spot Instances or Spot Fleet, then Amazon ECS Spot Instance draining is not supported for those instances.

When a container instance is set to DRAINING, Amazon ECS prevents new tasks from being scheduled for placement on the container instance. Service tasks on the draining container instance that are in the PENDING state are stopped immediately. If there are container instances in the cluster that are available, replacement service tasks are started on them.

You must set the ECS_ENABLE_SPOT_INSTANCE_DRAINING parameter before you start the container agent. Use the following commands to manually turn on Spot Instance draining. Substitute my-cluster with the name of your cluster.

[Environment]::SetEnvironmentVariable("ECS_ENABLE_SPOT_INSTANCE_DRAINING", "true", "Machine") # Initialize the agent Initialize-ECSAgent -Cluster my-cluster

For more information, see Launching an Amazon ECS Windows container instance.