Configure simplified automatic recovery
Important
The following information applies to configuring recovery-related capabilities on healthy instances. If you are currently having trouble accessing your instance, see Troubleshoot EC2 instances.
For your workload to function properly after a successful instance recovery, your instance must boot and accept traffic without requiring manual intervention.
By default, simplified automatic recovery monitors all supported running instances. In the event that a system status check failure is detected, simplified automatic recovery attempts to remediate the instance to a healthy state. Simplified automatic recovery doesn't operate during service events in the AWS Health Dashboard. For more information, see Troubleshooting simplified automatic recovery failures.
When a simplified automatic recovery event occurs, you receive an AWS Health Dashboard event. To configure notifications for these events, see Getting Started with AWS User Notifications in the AWS User Notifications User Guide. You can also use Amazon EventBridge rules to monitor for simplified automatic recovery events using the following event codes:
- AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS: successful events
- AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE: failed events
For more information, see Amazon EventBridge rules.
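As an illustration, the two event codes above could be placed in an EventBridge event pattern. The following is a minimal Python sketch, not an official template. It assumes the events are delivered as AWS Health events (source aws.health, detail type "AWS Health Event") and carry the code in the detail.eventTypeCode field. The function name build_recovery_event_pattern is hypothetical.

```python
import json

# Event type codes quoted in the text above.
RECOVERY_EVENT_CODES = [
    "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS",
    "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE",
]


def build_recovery_event_pattern(codes=RECOVERY_EVENT_CODES):
    """Return an EventBridge event pattern matching the given event codes.

    Assumption: the events arrive as AWS Health events, with the code
    stored in detail.eventTypeCode and the service set to EC2.
    """
    return {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
        "detail": {
            "service": ["EC2"],
            "eventTypeCode": list(codes),
        },
    }


if __name__ == "__main__":
    # Print the pattern as JSON so it can be pasted into a rule definition.
    print(json.dumps(build_recovery_event_pattern(), indent=2))
```

To watch only for failed recoveries, pass a one-element list containing the failure code.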
Requirements and limitations for simplified automatic recovery
Simplified automatic recovery will attempt to recover an instance if it:
- Is in the running state. For more information, see Amazon EC2 instance state changes.
- Uses default (On-Demand) or dedicated tenancy. For more information, see Amazon EC2 billing and purchasing options.
- Is of an instance type for which Amazon EC2 has capacity available. In some situations, such as significant outages, not enough capacity will be available and some recovery attempts might fail.
- Doesn't use host tenancy. For Amazon EC2 Dedicated Hosts, you can use Dedicated Host Auto Recovery to automatically recover unhealthy instances.
- Doesn't use an Elastic Fabric Adapter.
- Isn't a metal instance size.
- Isn't a member of an Auto Scaling group.
- Isn't currently undergoing a scheduled maintenance event.
- Doesn't have instance store volumes.
- Uses one of the following instance types:
  - General purpose: A1 | M3 | M4 | M5 | M5a | M5n | M5zn | M6a | M6g | M6i | M6in | M7a | M7g | M7i | M7i-flex | M8g | T1 | T2 | T3 | T3a | T4g
  - Compute optimized: C3 | C4 | C5 | C5a | C5n | C6a | C6g | C6gn | C6i | C6in | C7a | C7g | C7gn | C7i | C7i-flex | C8g
  - Memory optimized: R3 | R4 | R5 | R5a | R5b | R5n | R6a | R6g | R6i | R6in | R7a | R7g | R7i | R7iz | R8g | u-3tb1 | u-6tb1 | u-9tb1 | u-12tb1 | u-18tb1 | u-24tb1 | u7i-12tb | u7in-16tb | u7in-24tb | u7in-32tb | X1 | X1e | X2iezn | X8g
  - Accelerated computing: G3 | G3s | G5g | Inf1 | P2 | P3 | VT1
  - High-performance computing: Hpc6a | Hpc7a | Hpc7g
Warning
- Data on instance store volumes will be lost if the instance is stopped. For more information about stopping an instance, see Stopped instances.
- In the event of a system status check failure, the instance store and block device mapped data might be lost. For these instance types, consider using termination protection. For more information, see Enable termination protection.
We recommend that you regularly create backups of valuable data. For information about backup and recovery best practices for Amazon EC2, see Best practices for Amazon EC2.
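The requirements listed in this section can be read as a checklist. The following Python sketch encodes them as a local pre-check. It is an illustration only: the InstanceInfo class, its field names, and the ineligibility_reasons function are hypothetical, the supported families are copied from the list above, and capacity availability cannot be checked locally, so it is left out.

```python
from dataclasses import dataclass

# Instance families copied from the list above, lowercased for comparison.
SUPPORTED_FAMILIES = set(
    """
    a1 m3 m4 m5 m5a m5n m5zn m6a m6g m6i m6in m7a m7g m7i m7i-flex m8g
    t1 t2 t3 t3a t4g
    c3 c4 c5 c5a c5n c6a c6g c6gn c6i c6in c7a c7g c7gn c7i c7i-flex c8g
    r3 r4 r5 r5a r5b r5n r6a r6g r6i r6in r7a r7g r7i r7iz r8g
    u-3tb1 u-6tb1 u-9tb1 u-12tb1 u-18tb1 u-24tb1
    u7i-12tb u7in-16tb u7in-24tb u7in-32tb x1 x1e x2iezn x8g
    g3 g3s g5g inf1 p2 p3 vt1
    hpc6a hpc7a hpc7g
    """.split()
)


@dataclass
class InstanceInfo:
    """Hypothetical summary of the instance attributes the checklist needs."""
    instance_type: str
    state: str = "running"
    tenancy: str = "default"
    uses_efa: bool = False
    in_auto_scaling_group: bool = False
    has_scheduled_maintenance: bool = False
    has_instance_store: bool = False


def ineligibility_reasons(info):
    """Return the requirements the instance fails; an empty list means none."""
    # Split "m5.large" into family "m5" and size "large".
    family, _, size = info.instance_type.lower().partition(".")
    reasons = []
    if info.state != "running":
        reasons.append("not in the running state")
    if info.tenancy not in ("default", "dedicated"):
        reasons.append(f"unsupported tenancy: {info.tenancy}")
    if info.uses_efa:
        reasons.append("uses an Elastic Fabric Adapter")
    if size.startswith("metal"):
        reasons.append("metal instance size")
    if info.in_auto_scaling_group:
        reasons.append("member of an Auto Scaling group")
    if info.has_scheduled_maintenance:
        reasons.append("undergoing a scheduled maintenance event")
    if info.has_instance_store:
        reasons.append("has instance store volumes")
    if family not in SUPPORTED_FAMILIES:
        reasons.append(f"unsupported instance family: {family}")
    return reasons
```

Returning every failed requirement, rather than a single true or false value, makes it easier to see why a particular instance is not covered.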
Configure simplified automatic recovery
Simplified automatic recovery is enabled by default when you launch a supported instance. You can set the automatic recovery behavior to disabled during or after launching the instance. The default configuration doesn't enable simplified automatic recovery for an unsupported instance type.
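For an instance that is already launched, one way to change this setting programmatically is the EC2 ModifyInstanceMaintenanceOptions API. The following Python sketch wraps the corresponding boto3 client method, modify_instance_maintenance_options. It assumes that boto3 is installed and that credentials and a Region are configured. The helper name set_auto_recovery and the instance ID in the usage comment are placeholders.

```python
VALID_SETTINGS = ("default", "disabled")


def set_auto_recovery(ec2_client, instance_id, setting):
    """Set the automatic recovery behavior of one instance.

    ec2_client is expected to be a boto3 EC2 client, for example
    boto3.client("ec2"). Any object that provides the same method works,
    which keeps this helper easy to test without calling AWS.
    """
    if setting not in VALID_SETTINGS:
        raise ValueError(f"setting must be one of {VALID_SETTINGS}, got {setting!r}")
    # AutoRecovery accepts "default" or "disabled".
    return ec2_client.modify_instance_maintenance_options(
        InstanceId=instance_id,
        AutoRecovery=setting,
    )


# Example usage (placeholder instance ID, requires boto3 and credentials):
#   import boto3
#   set_auto_recovery(boto3.client("ec2"), "i-0123456789abcdef0", "disabled")
```

To apply the setting at launch instead, the boto3 run_instances call accepts a MaintenanceOptions parameter with the same AutoRecovery values.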
Troubleshooting simplified automatic recovery failures
The following issues can cause the recovery of your instance with simplified automatic recovery to fail:
- Simplified automatic recovery does not operate during service events in the AWS Health Dashboard. You might not receive recovery failure notifications for such events. For the latest service availability information, see the Service health status page.
- Temporary, insufficient capacity of replacement hardware.
- The instance has reached the maximum daily allowance for recovery attempts. Your instance might subsequently be retired if automatic recovery fails and a hardware degradation is determined to be the root cause for the original system status check failure.
If the instance’s system status check failure persists despite multiple recovery attempts, see Troubleshoot instances with failed status checks for additional guidance.