Managing Spot Instance Interruptions - Overview of Amazon EC2 Spot Instances

Managing Spot Instance Interruptions

The best way to handle Spot Instance interruptions gracefully, and to minimize their impact on performance and availability, is to architect your application to be fault-tolerant. To accomplish this, you can take advantage of EC2 instance rebalance recommendations and Spot Instance interruption notices.

An EC2 instance rebalance recommendation is a signal that notifies you when a Spot Instance is at elevated risk of interruption. The signal gives you the opportunity to proactively manage the Spot Instance in advance of the two-minute Spot Instance interruption notice. You can decide to rebalance your workload to new or existing Spot Instances that are not at an elevated risk of interruption. We've made it easy for you to use this signal by providing the Capacity Rebalancing feature in EC2 Auto Scaling groups. For more information, see Amazon EC2 Auto Scaling Capacity Rebalancing.
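As a minimal sketch of how an instance might check for this signal itself, the following Python polls the instance metadata service (IMDSv2) for a rebalance recommendation. It assumes the code runs on the Spot Instance, that the metadata path is `events/recommendations/rebalance`, and that the response body is a small JSON document with a `noticeTime` field. The function names are illustrative and not part of any AWS SDK.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

# Link-local address of the instance metadata service; reachable only from the instance.
IMDS_BASE = "http://169.254.169.254/latest"
REBALANCE_PATH = "/meta-data/events/recommendations/rebalance"


def imds_token(ttl_seconds: int = 21600, timeout: float = 2.0) -> str:
    """Request an IMDSv2 session token."""
    req = urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode()


def fetch_metadata(path: str, token: str, timeout: float = 2.0) -> Optional[str]:
    """Return the metadata body, or None when the path is absent (HTTP 404)."""
    req = urllib.request.Request(
        f"{IMDS_BASE}{path}", headers={"X-aws-ec2-metadata-token": token}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no recommendation has been issued
            return None
        raise


def parse_rebalance(body: Optional[str]) -> Optional[str]:
    """Extract noticeTime from a rebalance recommendation body, if present."""
    if not body:
        return None
    return json.loads(body).get("noticeTime")


def rebalance_notice_time() -> Optional[str]:
    """One poll: returns the notice time string, or None if no signal is present."""
    return parse_rebalance(fetch_metadata(REBALANCE_PATH, imds_token()))
```

A workload would typically call `rebalance_notice_time()` every few seconds and start moving work elsewhere once it returns a value. With Capacity Rebalancing enabled on an EC2 Auto Scaling group, the group reacts to the signal for you and this kind of polling is not required.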

A Spot Instance interruption notice is a warning that is issued two minutes before Amazon EC2 interrupts a Spot Instance. If your workload is "time-flexible," you can configure your Spot Instances to be stopped or hibernated, instead of being terminated, when they are interrupted. Amazon EC2 then automatically stops or hibernates your Spot Instances on interruption, and automatically resumes them when capacity becomes available.
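To illustrate what a workload can do with the notice, the sketch below parses the notice body and computes how much time is left before the interruption. It assumes the body was read from the instance metadata path `spot/instance-action` and has the form `{"action": "...", "time": "..."}`, where `action` is `terminate`, `stop`, or `hibernate` and `time` is a UTC timestamp. The type and function names are hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class InstanceAction(NamedTuple):
    action: str     # "terminate", "stop", or "hibernate"
    time: datetime  # approximate UTC time at which the action takes place


def parse_instance_action(body: Optional[str]) -> Optional[InstanceAction]:
    """Parse an interruption notice body; None means no notice was present."""
    if not body:
        return None
    doc = json.loads(body)
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    return InstanceAction(doc["action"], when.replace(tzinfo=timezone.utc))


def seconds_remaining(notice: InstanceAction, now: Optional[datetime] = None) -> float:
    """Seconds left before the interruption, never negative."""
    if now is None:
        now = datetime.now(timezone.utc)
    return max(0.0, (notice.time - now).total_seconds())
```

The remaining time can serve as the budget for shutdown work, so that checkpointing and connection draining finish before the instance is interrupted.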

You can use the EC2 instance rebalance recommendation, the Spot Instance interruption notice, or both to architect your workload for fault tolerance. By capturing these notifications, you can take actions such as saving a job’s state to storage (for example, Amazon S3, Amazon EFS, or Amazon FSx), persisting log files from the instance (or streaming them continuously for a more fault-tolerant approach), and draining connections from a load balancer.
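The actions described above can be organized as an ordered list of shutdown steps that run within a fixed time budget once a notification arrives. The following is a sketch under stated assumptions, not a prescribed implementation: `run_shutdown_steps` and `save_checkpoint` are hypothetical helpers, and the checkpoint is written to a local path that stands in for a mounted shared file system or an upload to object storage.

```python
import json
import logging
import os
import time
from typing import Callable, Iterable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def save_checkpoint(state: dict, path: str) -> None:
    """Write job state atomically: write a temp file, then rename over the target."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def run_shutdown_steps(
    steps: Iterable[Step],
    budget_seconds: float = 120.0,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Run steps in order until the budget is spent; return the names that completed.

    A failing step is logged and skipped so that later, independent steps
    (for example, draining connections) still get a chance to run.
    """
    deadline = clock() + budget_seconds
    completed: List[str] = []
    for name, step in steps:
        if clock() >= deadline:
            break  # out of time; remaining steps are not attempted
        try:
            step()
        except Exception as exc:
            logging.warning("shutdown step %r failed: %s", name, exc)
            continue
        completed.append(name)
    return completed
```

Listing the most important step first (usually the checkpoint) means it is the one most likely to finish if the budget runs out. The default budget of 120 seconds mirrors the two-minute interruption notice; a handler triggered by a rebalance recommendation may have more time available.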

Some AWS and third-party services already handle Spot interruptions for you, which reduces the impact on your application. For example, Amazon EKS managed node groups that run Spot Instances automatically launch replacement Kubernetes nodes when a rebalance recommendation or an interruption notice is delivered for an existing node.