Configure simplified automatic recovery - Amazon Elastic Compute Cloud

Configure simplified automatic recovery

Important
  • The following information applies to configuring recovery-related capabilities on healthy instances. If you currently can't access your instance, see Troubleshoot EC2 instances.

  • For your workload to function properly after a successful instance recovery, your instance must boot and accept traffic without requiring manual intervention.

By default, simplified automatic recovery monitors all supported running instances. If it detects a system status check failure, it attempts to restore the instance to a healthy state. Simplified automatic recovery doesn't operate during service events listed in the AWS Health Dashboard. For more information, see Troubleshooting simplified automatic recovery failures.

When a simplified automatic recovery event occurs, you receive an AWS Health Dashboard event. To configure notifications for these events, see Getting Started with AWS User Notifications in the AWS User Notifications User Guide. You can also use Amazon EventBridge rules to monitor for simplified automatic recovery events using the following event codes:

  • AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS — successful events

  • AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE — failed events

For more information, see Amazon EventBridge rules.
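
As a minimal sketch of how a rule could match these two event codes, the following Python builds an event pattern and classifies a delivered event. It assumes that AWS Health events reach EventBridge with the source `aws.health`, the detail type `AWS Health Event`, and the code in `detail.eventTypeCode`; the function names `build_event_pattern` and `classify` are hypothetical and not part of any AWS SDK.

```python
import json

# Event codes listed above.
SUCCESS_CODE = "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS"
FAILURE_CODE = "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE"


def build_event_pattern(include_success=True, include_failure=True):
    """Return an EventBridge event pattern (as a dict) for the chosen codes."""
    codes = []
    if include_success:
        codes.append(SUCCESS_CODE)
    if include_failure:
        codes.append(FAILURE_CODE)
    if not codes:
        raise ValueError("select at least one event code")
    return {
        # Assumption: AWS Health events use this source and detail type.
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
        "detail": {"eventTypeCode": codes},
    }


def classify(event):
    """Map a delivered event (a dict) to 'success', 'failure', or None."""
    code = event.get("detail", {}).get("eventTypeCode")
    return {SUCCESS_CODE: "success", FAILURE_CODE: "failure"}.get(code)


if __name__ == "__main__":
    # Print the pattern as JSON so it can be pasted into a rule definition.
    print(json.dumps(build_event_pattern(), indent=2))
```

The printed JSON could be supplied as the event pattern when you create the rule, for example through the `--event-pattern` option of `aws events put-rule`. Verify the field names against a real event from your account before relying on the pattern.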

Requirements and limitations for simplified automatic recovery

Simplified automatic recovery will attempt to recover an instance if it:

  • Is in the running state. For more information, see Amazon EC2 instance state changes.

  • Uses default (On-Demand) or dedicated tenancy. For more information, see Amazon EC2 billing and purchasing options.

  • Is of an instance type for which Amazon EC2 has capacity available. In some situations, such as significant outages, not enough capacity will be available and some recovery attempts might fail.

  • Doesn't use host tenancy. For Amazon EC2 Dedicated Hosts, you can use Dedicated Host Auto Recovery to automatically recover unhealthy instances.

  • Doesn't use an Elastic Fabric Adapter.

  • Isn't a metal instance size.

  • Isn't a member of an Auto Scaling group.

  • Isn't currently undergoing a scheduled maintenance event.

  • Doesn't have instance store volumes.

  • Uses one of the following instance types:

    • General purpose: A1 | M3 | M4 | M5 | M5a | M5n | M5zn | M6a | M6g | M6i | M6in | M7a | M7g | M7i | M7i-flex | M8g | T1 | T2 | T3 | T3a | T4g

    • Compute optimized: C3 | C4 | C5 | C5a | C5n | C6a | C6g | C6gn | C6i | C6in | C7a | C7g | C7gn | C7i | C7i-flex | C8g

    • Memory optimized: R3 | R4 | R5 | R5a | R5b | R5n | R6a | R6g | R6i | R6in | R7a | R7g | R7i | R7iz | R8g | u-3tb1 | u-6tb1 | u-9tb1 | u-12tb1 | u-18tb1 | u-24tb1 | u7i-12tb | u7in-16tb | u7in-24tb | u7in-32tb | X1 | X1e | X2iezn | X8g

    • Accelerated computing: G3 | G3s | G5g | Inf1 | P2 | P3 | VT1

    • High-performance computing: Hpc6a | Hpc7a | Hpc7g
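
The checklist above can be sketched as a pre-flight check. This is an illustration only: the `Instance` dataclass and its field names are hypothetical (they are not an AWS API shape), the family list is copied from the list above and will go stale, and the capacity requirement is left out because it can't be evaluated ahead of time.

```python
from dataclasses import dataclass

# Instance families copied from the supported list above, lowercased.
SUPPORTED_FAMILIES = frozenset("""
a1 m3 m4 m5 m5a m5n m5zn m6a m6g m6i m6in m7a m7g m7i m7i-flex m8g
t1 t2 t3 t3a t4g
c3 c4 c5 c5a c5n c6a c6g c6gn c6i c6in c7a c7g c7gn c7i c7i-flex c8g
r3 r4 r5 r5a r5b r5n r6a r6g r6i r6in r7a r7g r7i r7iz r8g
u-3tb1 u-6tb1 u-9tb1 u-12tb1 u-18tb1 u-24tb1
u7i-12tb u7in-16tb u7in-24tb u7in-32tb x1 x1e x2iezn x8g
g3 g3s g5g inf1 p2 p3 vt1
hpc6a hpc7a hpc7g
""".split())


@dataclass
class Instance:
    """Hypothetical summary of an instance; field names are illustrative."""
    instance_type: str
    state: str = "running"
    tenancy: str = "default"  # "default", "dedicated", or "host"
    uses_efa: bool = False
    in_auto_scaling_group: bool = False
    has_scheduled_maintenance: bool = False
    has_instance_store: bool = False


def ineligibility_reasons(inst):
    """Return the listed requirements that the instance doesn't meet."""
    # "m5.metal" -> family "m5", size "metal"
    family, _, size = inst.instance_type.lower().partition(".")
    reasons = []
    if inst.state != "running":
        reasons.append("is not in the running state")
    if inst.tenancy not in ("default", "dedicated"):
        reasons.append("uses host tenancy")
    if inst.uses_efa:
        reasons.append("uses an Elastic Fabric Adapter")
    if size.startswith("metal"):
        reasons.append("is a metal instance size")
    if inst.in_auto_scaling_group:
        reasons.append("is a member of an Auto Scaling group")
    if inst.has_scheduled_maintenance:
        reasons.append("is undergoing a scheduled maintenance event")
    if inst.has_instance_store:
        reasons.append("has instance store volumes")
    if family not in SUPPORTED_FAMILIES:
        reasons.append("instance type is not in the supported list")
    return reasons
```

An empty list means that none of the listed exclusions apply. It doesn't guarantee that a recovery attempt succeeds, because available capacity is only known at recovery time.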

Warning
  • Data on instance store volumes will be lost if the instance is stopped. For more information about stopping an instance, see Stopped instances.

  • If a system status check fails, instance store data and block device mapped data might be lost. For these instance types, consider enabling termination protection. For more information, see Enable termination protection.

We recommend that you regularly create backups of valuable data. For information about backup and recovery best practices for Amazon EC2, see Best practices for Amazon EC2.

Configure simplified automatic recovery

Simplified automatic recovery is enabled by default when you launch a supported instance. You can disable it when you launch the instance or at any time afterward. For an unsupported instance type, the default configuration doesn't enable simplified automatic recovery.

Console
To disable simplified automatic recovery during instance launch
  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

  2. In the navigation pane, choose Instances, and then choose Launch instance.

  3. In the Advanced details section, for Instance auto-recovery, select Disabled.

  4. Configure the remaining instance launch settings as needed and then launch the instance.

To disable simplified automatic recovery for a running or stopped instance
  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

  2. In the navigation pane, choose Instances.

  3. Select the instance, and then choose Actions, Instance settings, Change auto-recovery behavior.

  4. Choose Off, and then choose Save.

To set the automatic recovery behavior to default for a running or stopped instance
  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

  2. In the navigation pane, choose Instances.

  3. Select the instance, and then choose Actions, Instance settings, Change auto-recovery behavior.

  4. Choose Default (On), and then choose Save.

AWS CLI
To disable simplified automatic recovery at launch

Use the run-instances command.

aws ec2 run-instances \
    --image-id ami-1a2b3c4d \
    --instance-type t2.micro \
    --key-name MyKeyPair \
    --maintenance-options AutoRecovery=Disabled \
    [...]

To disable simplified automatic recovery for a running or stopped instance

Use the modify-instance-maintenance-options command.

aws ec2 modify-instance-maintenance-options \
    --instance-id i-0abcdef1234567890 \
    --auto-recovery disabled

To set the automatic recovery behavior to default for a running or stopped instance

Use the modify-instance-maintenance-options command.

aws ec2 modify-instance-maintenance-options \
    --instance-id i-0abcdef1234567890 \
    --auto-recovery default
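
If you script these changes, a small wrapper can validate the setting before calling the CLI and confirm the result afterward. The sketch below is one possible approach, not an official tool. The helper names are hypothetical, and it assumes that the JSON output of `aws ec2 describe-instances` reports the current setting under `MaintenanceOptions.AutoRecovery` for each instance.

```python
import json
import shlex

# Values accepted by --auto-recovery in the commands above.
VALID_BEHAVIORS = ("default", "disabled")


def modify_command(instance_id, behavior):
    """Build the modify-instance-maintenance-options argument list."""
    if behavior not in VALID_BEHAVIORS:
        raise ValueError(f"behavior must be one of {VALID_BEHAVIORS}")
    return [
        "aws", "ec2", "modify-instance-maintenance-options",
        "--instance-id", instance_id,
        "--auto-recovery", behavior,
    ]


def auto_recovery_settings(describe_instances_json):
    """Map instance ID to its auto-recovery setting.

    Assumption: the input is the JSON text printed by
    `aws ec2 describe-instances`, with the setting stored under
    MaintenanceOptions.AutoRecovery. Missing values map to None.
    """
    data = json.loads(describe_instances_json)
    settings = {}
    for reservation in data.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            options = inst.get("MaintenanceOptions", {})
            settings[inst["InstanceId"]] = options.get("AutoRecovery")
    return settings


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(modify_command("i-0abcdef1234567890", "disabled")))
```

To run the command for real, pass the returned list to `subprocess.run(cmd, check=True)`. Passing a list instead of a shell string avoids quoting problems with instance IDs that come from user input.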

Troubleshooting simplified automatic recovery failures

The following issues can cause simplified automatic recovery of your instance to fail:

  • Simplified automatic recovery does not operate during service events in the AWS Health Dashboard. You might not receive recovery failure notifications for such events. For the latest service availability information, see the Service health status page.

  • Temporarily insufficient capacity of replacement hardware.

  • The instance has reached the maximum daily allowance for recovery attempts. Your instance might subsequently be retired if automatic recovery fails and hardware degradation is determined to be the root cause of the original system status check failure.

If the instance’s system status check failure persists despite multiple recovery attempts, see Troubleshoot instances with failed status checks for additional guidance.
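
To see which instances still report a failed system status check after recovery attempts, you could filter the output of `aws ec2 describe-instance-status`. The following sketch assumes JSON output in which each entry of `InstanceStatuses` carries `InstanceId` and `SystemStatus.Status`, and that a failed check is reported as `impaired`. The function name is hypothetical.

```python
import json


def impaired_instances(describe_instance_status_json):
    """Return IDs of instances whose system status check is impaired.

    Assumption: the input is the JSON text printed by
    `aws ec2 describe-instance-status`.
    """
    data = json.loads(describe_instance_status_json)
    return [
        status["InstanceId"]
        for status in data.get("InstanceStatuses", [])
        if status.get("SystemStatus", {}).get("Status") == "impaired"
    ]


if __name__ == "__main__":
    # Hand-written sample in the assumed shape, not real command output.
    sample = json.dumps({"InstanceStatuses": [
        {"InstanceId": "i-0abcdef1234567890",
         "SystemStatus": {"Status": "impaired"}},
    ]})
    print(impaired_instances(sample))
```

The function only reads the status. Deciding what to do with an impaired instance is covered in the troubleshooting guidance referenced above.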