Configure CloudWatch action based recovery - Amazon Elastic Compute Cloud

Configure CloudWatch action based recovery

Important
  • The following information applies to configuring recovery-related capabilities on healthy instances. If you are currently encountering difficulties accessing your instance, see Troubleshoot EC2 instances.

  • For your workload to function properly after a successful instance recovery, your instance must boot and accept traffic without requiring manual intervention.

You can configure Amazon CloudWatch action based recovery by adding recovery actions to Amazon CloudWatch alarms. CloudWatch action based recovery works with the StatusCheckFailed_System metric. It provides recovery response times with one-minute granularity, and Amazon Simple Notification Service (Amazon SNS) notifications of recovery actions and outcomes. Compared to simplified automatic recovery, these configuration options allow faster recovery attempts and more granular control over the response to a system status check failure. For more information about available CloudWatch options, see Status checks for your instances.

Amazon CloudWatch action based recovery doesn't operate during service events that are listed in the AWS Health Dashboard. For more information, see Troubleshooting CloudWatch action based recovery failures.

Requirements and limitations for CloudWatch action based recovery

CloudWatch action based recovery can attempt to recover an instance if it:

  • Is in the running state. For more information, see Amazon EC2 instance state changes.

  • Uses default (On-Demand) or dedicated instance tenancy. For more information, see Amazon EC2 billing and purchasing options.

  • Is of an instance type for which Amazon EC2 has capacity available. In some situations, such as significant outages, not enough capacity will be available and some recovery attempts might fail.

  • Doesn't use host instance tenancy. For Amazon EC2 Dedicated Hosts, you can use Dedicated Host Auto Recovery to automatically recover unhealthy instances.

  • Doesn't use an Elastic Fabric Adapter.

  • Isn't a member of an Auto Scaling group.

  • Isn't currently undergoing a scheduled maintenance event.

  • Uses one of the following instance types:

    • General purpose: A1 | M3 | M4 | M5 | M5a | M5n | M5zn | M6a | M6g | M6i | M6in | M7a | M7g | M7i | M7i-flex | M8g | T1 | T2 | T3 | T3a | T4g

    • Compute optimized: C3 | C4 | C5 | C5a | C5n | C6a | C6g | C6gn | C6i | C6in | C7a | C7g | C7gn | C7i | C7i-flex | C8g

    • Memory optimized: R3 | R4 | R5 | R5a | R5b | R5n | R6a | R6g | R6i | R6in | R7a | R7g | R7i | R7iz | R8g | u-3tb1 | u-6tb1 | u-9tb1 | u-12tb1 | u-18tb1 | u-24tb1 | u7i-12tb | u7in-16tb | u7in-24tb | u7in-32tb | X1 | X1e | X2iezn | X8g

    • Accelerated computing: G3 | G3s | G5g | Inf1 | P2 | P3 | VT1

    • High-performance computing: Hpc6a | Hpc7a | Hpc7g

    • Metal instances: Any of the above types with the metal instance size.

  • Has instance store volumes and uses one of the following instance types: M3 | C3 | R3 | X1 | X1e | X2idn | X2iedn

Warning
  • Data on instance store volumes will be lost if the instance is stopped. For more information about stopping an instance, see Stopped instances.

  • In the event of a system status check failure, instance store data and block device mapped data might be lost. For these instance types, consider enabling termination protection. For more information, see Enable termination protection.

We recommend that you regularly create backups of valuable data. For information about backup and recovery best practices for Amazon EC2, see Best practices for Amazon EC2.
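As a rough illustration of the requirements above, the following Python sketch checks a single instance description against several of them. It assumes the input is shaped like one entry of `Reservations[].Instances[]` in an EC2 DescribeInstances response, and that Auto Scaling group membership can be inferred from the `aws:autoscaling:groupName` tag. The function name `recovery_blockers` is hypothetical. The sketch does not check capacity or scheduled maintenance events, which cannot be read from this data alone.

```python
from typing import Iterable, List

# Assumption: Auto Scaling tags the instances it launches with this key.
ASG_TAG_KEY = "aws:autoscaling:groupName"


def recovery_blockers(instance: dict, supported_types: Iterable[str]) -> List[str]:
    """Return the requirements the instance appears to violate (empty if none found).

    `instance` is assumed to look like one Reservations[].Instances[] entry
    from DescribeInstances. `supported_types` is the set of instance types
    that support recovery, supplied by the caller.
    """
    blockers = []

    # The instance must be in the running state.
    if instance.get("State", {}).get("Name") != "running":
        blockers.append("not in the running state")

    # Default and dedicated tenancy qualify; host tenancy does not.
    if instance.get("Placement", {}).get("Tenancy") == "host":
        blockers.append("uses host tenancy")

    # Any network interface whose type starts with "efa" counts as an EFA.
    if any(
        str(eni.get("InterfaceType", "")).startswith("efa")
        for eni in instance.get("NetworkInterfaces", [])
    ):
        blockers.append("uses an Elastic Fabric Adapter")

    # Membership in an Auto Scaling group is inferred from the tag (assumption).
    if any(tag.get("Key") == ASG_TAG_KEY for tag in instance.get("Tags", [])):
        blockers.append("member of an Auto Scaling group")

    if instance.get("InstanceType") not in set(supported_types):
        blockers.append("instance type not supported")

    return blockers
```

An empty list only means that none of these particular checks failed. It is not a guarantee that a recovery attempt will succeed.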

You can also use the AWS Management Console or the AWS CLI to view the instance types that support CloudWatch action based recovery.

Console
To view the instance types that support Amazon CloudWatch action based recovery
  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

  2. In the left navigation pane, choose Instance Types.

  3. In the filter bar, enter Auto Recovery support: true. Alternatively, as you enter the characters and the filter name appears, you can select it.

    The Instance types table displays all the instance types that support Amazon CloudWatch action based recovery.

AWS CLI
To view the instance types that support Amazon CloudWatch action based recovery

Use the describe-instance-types command.

aws ec2 describe-instance-types --filters Name=auto-recovery-supported,Values=true --query "InstanceTypes[*].[InstanceType]" --output text | sort
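For scripts, the same query can be sketched in Python. This example assumes the AWS SDK for Python (Boto3) is installed and that the caller passes in an EC2 client. The function name `list_recoverable_instance_types` is hypothetical. It uses the same `auto-recovery-supported` filter as the CLI command above.

```python
def list_recoverable_instance_types(ec2_client) -> list:
    """Return the sorted names of instance types that report recovery support.

    `ec2_client` is assumed to be a Boto3 EC2 client, for example
    boto3.client("ec2", region_name="us-east-1").
    """
    # DescribeInstanceTypes returns paged results, so use a paginator.
    paginator = ec2_client.get_paginator("describe_instance_types")
    pages = paginator.paginate(
        Filters=[{"Name": "auto-recovery-supported", "Values": ["true"]}]
    )
    names = [
        instance_type["InstanceType"]
        for page in pages
        for instance_type in page["InstanceTypes"]
    ]
    return sorted(names)


# Example usage (requires AWS credentials and network access):
#   import boto3
#   for name in list_recoverable_instance_types(boto3.client("ec2")):
#       print(name)
```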

Configure CloudWatch action based recovery

CloudWatch action based recovery works with the StatusCheckFailed_System metric and is configured through the CloudWatch console. To set it up, see Adding recover actions to CloudWatch alarms in the Amazon CloudWatch User Guide.
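To make the shape of such an alarm concrete, the following Python sketch builds the parameters for the CloudWatch PutMetricAlarm API, as an alternative to the console steps. The alarm name, period, evaluation periods, statistic, and threshold are illustrative assumptions, not recommended values, and `build_recovery_alarm` is a hypothetical helper. The recover action is expressed as the ARN `arn:aws:automate:<region>:ec2:recover`. An optional SNS topic ARN is added as a second alarm action for notifications.

```python
from typing import Optional


def build_recovery_alarm(
    instance_id: str,
    region: str,
    sns_topic_arn: Optional[str] = None,
    period_seconds: int = 60,      # illustrative: evaluate the metric every minute
    evaluation_periods: int = 2,   # illustrative: two consecutive failing periods
) -> dict:
    """Build keyword arguments for a CloudWatch put_metric_alarm call."""
    # The EC2 recover action, optionally followed by an SNS topic to notify.
    actions = [f"arn:aws:automate:{region}:ec2:recover"]
    if sns_topic_arn:
        actions.append(sns_topic_arn)

    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "AlarmDescription": "Recover the instance on a system status check failure",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": actions,
    }


# Example usage (requires Boto3, AWS credentials, and network access):
#   import boto3
#   params = build_recovery_alarm("i-0123456789abcdef0", "us-east-1")
#   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)
```

Keeping the parameter construction separate from the API call makes it possible to review or test the alarm definition without touching an AWS account.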

Troubleshooting CloudWatch action based recovery failures

The following issues can cause the recovery of your instance with CloudWatch action based recovery to fail:

  • CloudWatch action based recovery doesn't operate during service events that are listed in the AWS Health Dashboard. You might not receive recovery failure notifications for such events. For the latest service availability information, see the Service health status page.

  • Capacity on replacement hardware is temporarily insufficient.

  • The instance has reached the maximum daily allowance for recovery attempts. Your instance might subsequently be retired if automatic recovery fails and hardware degradation is determined to be the root cause of the original system status check failure.

If the instance’s system status check failure persists despite multiple recovery attempts, see Troubleshoot instances with failed status checks for additional guidance.
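When investigating a failed recovery, it can also help to review what the alarm itself recorded. The following Python sketch reads recent action entries from the alarm's history with the CloudWatch DescribeAlarmHistory API. It assumes a Boto3 CloudWatch client supplied by the caller, and `recent_alarm_actions` is a hypothetical helper name. The wording of each summary is produced by CloudWatch, so the sketch returns it unchanged and does not parse it.

```python
def recent_alarm_actions(cloudwatch_client, alarm_name: str, limit: int = 10) -> list:
    """Return (timestamp, summary) pairs for recent action entries of an alarm.

    `cloudwatch_client` is assumed to be a Boto3 CloudWatch client, for
    example boto3.client("cloudwatch").
    """
    response = cloudwatch_client.describe_alarm_history(
        AlarmName=alarm_name,
        HistoryItemType="Action",  # only entries about alarm actions
        MaxRecords=limit,
    )
    return [
        (item["Timestamp"], item["HistorySummary"])
        for item in response["AlarmHistoryItems"]
    ]


# Example usage (requires Boto3, AWS credentials, and network access):
#   import boto3
#   for when, summary in recent_alarm_actions(boto3.client("cloudwatch"), "my-alarm"):
#       print(when, summary)
```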