Amazon CloudWatch
User Guide

Creating Amazon CloudWatch Alarms

You can create a CloudWatch alarm that watches a single metric. The alarm performs one or more actions based on the value of the metric relative to a threshold over a number of time periods. The action can be an Amazon EC2 action, an Auto Scaling action, or a notification sent to an Amazon SNS topic.

Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions simply because they are in a particular state; the state must have changed and been maintained for a specified number of periods.

After an alarm invokes an action due to a change in state, its subsequent behavior depends on the type of action that you have associated with the alarm. For Amazon EC2 and Auto Scaling actions, the alarm continues to invoke the action for every period that the alarm remains in the new state. For Amazon SNS notifications, no additional actions are invoked.

Note

CloudWatch doesn't test or validate the actions that you specify, nor does it detect any Auto Scaling or Amazon SNS errors resulting from an attempt to invoke nonexistent actions. Make sure that your actions exist.

You can also add alarms to dashboards. When an alarm is on a dashboard, it turns red when it is in the ALARM state, making it easier for you to monitor its status proactively.

An alarm has three possible states:

  • OK—The metric is within the defined threshold

  • ALARM—The metric is outside of the defined threshold

  • INSUFFICIENT_DATA—The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state

In the following figure, the alarm threshold is set to 3 units and the alarm is evaluated over 3 periods. That is, the alarm goes to ALARM state if the oldest of the 3 periods being evaluated is breaching, and the 2 subsequent periods are either breaching or missing. In the figure, this happens with the third through fifth time periods, and the alarm's state is set to ALARM. At period six, the value dips below the threshold, and the state reverts to OK. Later, during the ninth time period, the threshold is breached again, but for only one period. Consequently, the alarm state remains OK.


[Figure: Alarm threshold trigger alarm]
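
The scenario described above can be sketched in a few lines of Python. This is a simplified model rather than CloudWatch's actual implementation: it assumes a "greater than" comparison and no missing data points, and it reports INSUFFICIENT_DATA until enough periods have been seen. The function name evaluate_alarm and the sample values are hypothetical, chosen only to mirror the narrative.

```python
def evaluate_alarm(values, threshold, evaluation_periods):
    """Return the alarm state after each period (simplified model).

    Assumptions: one data point per period, no missing data, and the
    alarm breaches when a value is strictly greater than the threshold.
    """
    states = []
    for i in range(len(values)):
        # The most recent `evaluation_periods` data points, oldest first.
        window = values[max(0, i - evaluation_periods + 1): i + 1]
        if len(window) < evaluation_periods:
            states.append("INSUFFICIENT_DATA")
        elif all(v > threshold for v in window):
            states.append("ALARM")
        else:
            states.append("OK")
    return states


# Periods 3 through 5 breach the threshold of 3, period 6 dips below it,
# and period 9 breaches for a single period only.
sample = [1, 2, 4, 5, 4, 2, 1, 2, 4, 2]
states = evaluate_alarm(sample, threshold=3, evaluation_periods=3)
for period, state in enumerate(states, start=1):
    print(period, state)
```

In this model the state becomes ALARM at period five, reverts to OK at period six, and stays OK through the single breach at period nine.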

Configuring How CloudWatch Alarms Treat Missing Data

Similar to how each alarm is always in one of three states, each specific data point reported to CloudWatch falls under one of three categories:

  • good (within the threshold)

  • bad (violating the threshold)

  • missing

You can specify how alarms handle missing data points. Choose whether to treat missing data points as:

  • Missing (the alarm looks back farther in time to find additional data points)

  • Good ("Not Breaching," treated as a data point that is within the threshold)

  • Bad ("Breaching," treated as a data point that is breaching the threshold)

  • Ignored (the current alarm state is maintained)

The best choice depends on the type of metric. For a metric that continually reports data, such as CPUUtilization of an instance, you might want to treat missing data points as bad, because they may indicate that something is wrong. But for a metric that generates data points only when an error occurs, such as ThrottledRequests in Amazon DynamoDB, you would want to treat missing data as good. The default behavior is to treat missing data points as missing.

Choosing the best option for your alarm prevents unnecessary and misleading alarm condition changes, and also more accurately indicates the health of your system.

Note

If you treat missing data as missing and some data points in the current window are missing, CloudWatch looks back extra periods to find other existing data points to assess whether the alarm should change state. When this happens, if the furthest back period that is now being considered is not breaching, the alarm state does not go to ALARM.
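
As an illustration of the four options, the following Python sketch shows which category a single data point falls into under each treatment. It is a minimal model under stated assumptions: the option strings ("missing", "notBreaching", "breaching", "ignore") are assumed to match the TreatMissingData values accepted by PutMetricAlarm, the comparison is "greater than", and classify_data_point is a hypothetical helper. It does not model the look-back behavior described in the note above.

```python
# Category assigned to a missing data point under each treatment option.
MISSING_TREATMENT = {
    "missing": "missing",      # alarm looks back farther in time
    "notBreaching": "good",    # treated as within the threshold
    "breaching": "bad",        # treated as breaching the threshold
    "ignore": "ignored",       # current alarm state is maintained
}


def classify_data_point(value, threshold, treat_missing="missing"):
    """Classify one data point; `None` stands for a missing data point."""
    if value is None:
        return MISSING_TREATMENT[treat_missing]
    # Assumption: the alarm breaches when the value exceeds the threshold.
    return "bad" if value > threshold else "good"


# A continually reported metric: a gap is treated as a problem.
print(classify_data_point(None, threshold=80, treat_missing="breaching"))
# An error-only metric: a gap means no errors occurred.
print(classify_data_point(None, threshold=0, treat_missing="notBreaching"))
```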

High-Resolution Alarms

If you set an alarm on a high-resolution metric, you can specify a high-resolution alarm with a period of 10 seconds or 30 seconds, or you can set a regular alarm with a period of any multiple of 60 seconds. There is a higher charge for high-resolution alarms. For more information about high-resolution metrics, see Publish Custom Metrics.
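
The period rule above can be expressed as a small check. This is a sketch only; the function names are hypothetical and the check reflects just the constraints stated in this section (10 or 30 seconds for a high-resolution alarm, otherwise any positive multiple of 60 seconds).

```python
def is_high_resolution_period(period_seconds):
    """True for the two high-resolution alarm periods named above."""
    return period_seconds in (10, 30)


def is_valid_alarm_period(period_seconds):
    """True if the period is 10, 30, or a positive multiple of 60 seconds."""
    if is_high_resolution_period(period_seconds):
        return True
    return period_seconds > 0 and period_seconds % 60 == 0


for candidate in (10, 30, 45, 60, 300):
    print(candidate, is_valid_alarm_period(candidate))
```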

Percentile-Based CloudWatch Alarms and Low Data Samples

When you set a percentile as the statistic for an alarm, you can specify what to do when there is not enough data for a good statistical assessment. You can choose to have the alarm evaluate the statistic anyway and possibly change the alarm state. Or, you can have the alarm ignore the metric while the sample size is low, and wait to evaluate it until there is enough data to be statistically significant.

For percentiles between 0.5 and 1.00, this setting is used when there are fewer than 10/(1-percentile) data points during the evaluation period. For example, for an alarm on the 99th percentile (p99), this setting would be used if there were fewer than 10/(1-0.99) = 1000 samples. For percentiles between 0 and 0.5, the setting is used when there are fewer than 10/percentile data points.
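
The two formulas above can be worked through in Python. The helper name low_sample_threshold is hypothetical, and the percentile is passed as a string so that fractions.Fraction can compute exact values (plain floating-point arithmetic would give a result slightly below 1000 for p99).

```python
from fractions import Fraction


def low_sample_threshold(percentile):
    """Sample count below which the low-sample setting applies.

    `percentile` is a fraction strictly between 0 and 1, given as a
    string such as "0.99" so the arithmetic stays exact.
    """
    p = Fraction(percentile)
    if not 0 < p < 1:
        raise ValueError("percentile must be strictly between 0 and 1")
    if p >= Fraction(1, 2):
        return 10 / (1 - p)   # upper half: 10/(1-percentile)
    return 10 / p             # lower half: 10/percentile


print(low_sample_threshold("0.99"))  # p99
print(low_sample_threshold("0.1"))   # p10
```

For p99 the threshold is 1000 samples, matching the example above; for p10 it is 100 samples.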

Common Features of CloudWatch Alarms

The following features apply to all CloudWatch alarms:

  • You can create up to 5000 alarms per region per AWS account. To create or update an alarm, you use the PutMetricAlarm API action (mon-put-metric-alarm command).

  • Alarm names must contain only ASCII characters.

  • You can list any or all of the currently configured alarms, and list any alarms in a particular state using DescribeAlarms (mon-describe-alarms). You can further filter the list by time range.

  • You can disable and enable the actions of an alarm by using DisableAlarmActions and EnableAlarmActions (mon-disable-alarm-actions and mon-enable-alarm-actions).

  • You can test an alarm by setting it to any state using SetAlarmState (mon-set-alarm-state). This temporary state change lasts only until the next alarm comparison occurs.

  • You can create an alarm using PutMetricAlarm (mon-put-metric-alarm) before you've created a custom metric. For the alarm to be valid, you must include all of the dimensions for the custom metric in addition to the metric namespace and metric name in the alarm definition.

  • You can view an alarm's history using DescribeAlarmHistory (mon-describe-alarm-history). CloudWatch preserves alarm history for two weeks. Each state transition is marked with a unique time stamp. In rare cases, your history might show more than one notification for a state change. The time stamp enables you to confirm unique state changes.

  • The number of evaluation periods for an alarm multiplied by the length of each evaluation period cannot exceed one day.
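
Two of the constraints in the list above (ASCII-only alarm names and the one-day limit on the total evaluation interval) lend themselves to a client-side check before calling PutMetricAlarm. The following is a minimal sketch; validate_alarm_definition and its parameter names are hypothetical, and CloudWatch performs its own validation regardless.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def validate_alarm_definition(alarm_name, period_seconds, evaluation_periods):
    """Return a list of problems found; an empty list means none found."""
    problems = []
    if not alarm_name.isascii():
        problems.append("alarm name must contain only ASCII characters")
    # Evaluation periods multiplied by the period length cannot exceed one day.
    if period_seconds * evaluation_periods > SECONDS_PER_DAY:
        problems.append("evaluation periods x period length exceeds one day")
    return problems


# Three 5-minute periods: well within one day.
print(validate_alarm_definition("high-cpu", 300, 3))
# Twenty-five 1-hour periods: 25 hours, which exceeds one day.
print(validate_alarm_definition("high-cpu", 3600, 25))
```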

Note

Some AWS resources do not send metric data to CloudWatch under certain conditions.

For example, Amazon EBS may not send metric data for an available volume that is not attached to an Amazon EC2 instance, because there is no metric activity to be monitored for that volume. If you have an alarm set for such a metric, you may notice its state change to INSUFFICIENT_DATA. This may simply indicate that your resource is inactive, and may not necessarily mean that there is a problem.