Amazon CloudWatch
User Guide

Amazon CloudWatch Concepts

This section describes the terminology and concepts that are central to understanding and using Amazon CloudWatch.

Metrics

A metric is the fundamental concept in CloudWatch. It represents a time-ordered set of data points that are published to CloudWatch. These data points can be either your custom metrics or metrics from other services in AWS. You can retrieve statistics about those data points as an ordered set of time-series data. Metrics exist only in the region in which they are created. Metrics cannot be deleted, but they automatically expire in 14 days if no new data is published to them.

Think of a metric as a variable to monitor, and of the data points as the values of that variable over time. For example, the CPU usage of a particular Amazon EC2 instance is one metric, and the latency of an Elastic Load Balancing load balancer is another.

The data points themselves can come from any application or business activity from which you collect data, not just Amazon Web Services products and applications. For example, a metric might be the CPU usage of a particular Amazon EC2 instance or the temperature in a refrigeration facility.

Metrics are uniquely defined by a name, a namespace, and one or more dimensions. Each data point has a time stamp, and (optionally) a unit of measure. When you request statistics, the returned data stream is identified by namespace, metric name, dimension, and (optionally) the unit.

You can use the PutMetricData API action (or the aws cloudwatch put-metric-data command) to create a custom metric and publish data points for it. You can add the data points in any order, and at any rate you choose. For more information, see Publish Custom Metrics.

CloudWatch stores your metric data for two weeks. You can publish metric data from multiple sources, such as incoming network traffic from dozens of different Amazon EC2 instances, or requested page views from several different web applications. You can request statistics on metric data points that occur within a specified time window.

Namespaces

CloudWatch namespaces are containers for metrics. Metrics in different namespaces are isolated from each other, so that metrics from different applications are not mistakenly aggregated into the same statistics.

Namespace names are strings that you define when you create a metric. The names must consist of valid XML characters, and typically contain the alphanumeric characters "0-9A-Za-z" plus "." (period), "-" (hyphen), "_" (underscore), "/" (slash), "#" (hash), and ":" (colon). AWS namespaces all follow the convention AWS/<service>, such as AWS/EC2 and AWS/ELB.

Note

Namespace names must be fewer than 256 characters in length.

There is no default namespace. You must specify a namespace for each data element you put into CloudWatch.
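The naming rules above can be sketched as a small pre-flight check. This is an illustration only: treating the "typical" character set as a strict whitelist is an assumption of this sketch, and the function name check_namespace is hypothetical, not part of any AWS SDK.

```python
import re

# Characters the text lists as typical for namespace names. Using them as a
# strict whitelist is an assumption of this sketch, not a CloudWatch rule.
_TYPICAL_CHARS = re.compile(r"[0-9A-Za-z.\-_/#:]+")


def check_namespace(name):
    """Return a list of problems found in a candidate namespace name."""
    problems = []
    if not name:
        # There is no default namespace, so an empty name is never acceptable.
        problems.append("a namespace is required; there is no default")
        return problems
    if len(name) >= 256:
        problems.append("name must be fewer than 256 characters")
    if not _TYPICAL_CHARS.fullmatch(name):
        problems.append("name contains characters outside the typical set")
    return problems


print(check_namespace("MyApp/Frontend"))  # no problems expected
print(check_namespace("x" * 256))         # too long
```

An empty list means the name passed every check in this sketch; CloudWatch itself remains the authority on what it accepts.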

Dimensions

A dimension is a name/value pair that helps you to uniquely identify a metric. Every metric has specific characteristics that describe it, and you can think of dimensions as categories for those characteristics. Dimensions help you design a structure for your statistics plan. Because dimensions are part of the unique identifier for a metric, whenever you add a unique name/value pair to one of your metrics, you are creating a new metric.

You specify dimensions when you create a metric with the PutMetricData action (or its command line equivalent put-metric-data). Services in AWS that feed data to CloudWatch also attach dimensions to each metric. You can use dimensions to filter result sets that CloudWatch queries return.

For example, you can get statistics for a specific Amazon EC2 instance by calling GetMetricStatistics with the InstanceId dimension set to a specific Amazon EC2 instance ID.
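As a rough sketch, the parameters for such a request might be assembled as follows. The parameter names follow the GetMetricStatistics action; the helper name, the instance ID, the one-hour window, and the choice of the CPUUtilization metric are all assumptions made for illustration. The sketch only builds the parameters and does not call AWS.

```python
from datetime import datetime, timedelta, timezone


def instance_cpu_request(instance_id, end=None):
    """Build GetMetricStatistics parameters for one EC2 instance (hypothetical helper)."""
    end = end or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        # The dimension filters the result to a single instance.
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=1),
        "EndTime": end,
        "Period": 300,  # seconds; must be a multiple of 60
        "Statistics": ["Average", "Maximum"],
    }


# Placeholder instance ID, not a real resource.
request = instance_cpu_request("i-1234567890abcdef0")
print(request["Dimensions"])
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as keyword arguments to the CloudWatch client's get_metric_statistics method.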

For metrics produced by certain services such as Amazon EC2, CloudWatch can aggregate data across dimensions. For example, if you call GetMetricStatistics for a metric in the AWS/EC2 namespace and do not specify any dimensions, CloudWatch aggregates all data for the specified metric to create the statistic that you requested. However, CloudWatch does not aggregate across dimensions for metrics that you create with PutMetricData or put-metric-data.

Note

You can assign up to ten dimensions to a metric.

In the figure at the end of this section, the four calls to put-metric-data create four distinct metrics. If you make only those four calls, you could retrieve statistics for these four dimension combinations:

  • Server=Prod,Domain=Frankfurt

  • Server=Prod,Domain=Rio

  • Server=Beta,Domain=Frankfurt

  • Server=Beta,Domain=Rio

You could not retrieve statistics using combinations of dimensions that you did not specifically create. For example, you could not retrieve statistics for any of the following combinations of dimensions unless you create new metrics that specify these combinations with additional calls to put-metric-data:

  • Server=Prod,Domain=<null>

  • Server=<null>,Domain=Frankfurt

  • Server=Beta,Domain=<null>

  • Server=<null>,Domain=Rio

  • Server=Prod

  • Server=Beta

Important

CloudWatch treats each unique combination of dimensions as a separate metric. For example, each call to put-metric-data in the following figure creates a separate metric because each call uses a different set of dimensions. This is true even though all four calls use the same metric name (ServerStats). For information on how this affects pricing, see the Amazon CloudWatch product information page.
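The following minimal sketch models this behavior with an in-memory dictionary standing in for CloudWatch. The namespace MyService and the values are hypothetical; only the metric name ServerStats and the Server and Domain dimensions come from the text above.

```python
from collections import defaultdict


def metric_key(namespace, name, dimensions):
    """Identity of a metric: namespace, name, and the exact set of dimensions."""
    return (namespace, name, frozenset(dimensions.items()))


# In-memory stand-in for CloudWatch, not the real service.
store = defaultdict(list)


def put(namespace, name, dimensions, value):
    store[metric_key(namespace, name, dimensions)].append(value)


def values_for(namespace, name, dimensions):
    # Exact match only: custom metrics are not aggregated across dimensions.
    return store.get(metric_key(namespace, name, dimensions), [])


put("MyService", "ServerStats", {"Server": "Prod", "Domain": "Frankfurt"}, 105)
put("MyService", "ServerStats", {"Server": "Prod", "Domain": "Rio"}, 115)
put("MyService", "ServerStats", {"Server": "Beta", "Domain": "Frankfurt"}, 95)
put("MyService", "ServerStats", {"Server": "Beta", "Domain": "Rio"}, 97)

print(len(store))  # four distinct metrics
print(values_for("MyService", "ServerStats", {"Server": "Prod"}))  # nothing published
```

Looking up Server=Prod alone returns nothing because no metric was ever published with exactly that dimension set.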

Time Stamps

With Amazon CloudWatch, each metric data point must be marked with a time stamp. The time stamp can be up to two weeks in the past and up to two hours into the future. If you do not provide a time stamp, CloudWatch creates a time stamp for you based on the time the data element was received.

The time stamp you use in the request must be a dateTime object, with the complete date plus hours, minutes, and seconds, for example: 2013-01-31T23:59:59Z. For more information, see http://www.w3.org/TR/xmlschema-2/#dateTime. Although it is not required, we recommend that you provide the time stamp in Coordinated Universal Time (UTC or Greenwich Mean Time). When you retrieve your statistics from CloudWatch, all times are in UTC.

Note

CloudWatch alarms check metrics based on the current time in UTC. Custom metrics sent to CloudWatch with time stamps other than the current UTC time may cause alarms to display the Insufficient Data state or result in delayed alarms.
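The time stamp rules in this section can be sketched as two small helpers: one that formats a time in UTC and one that checks the accepted window. The helper names are hypothetical, and treating both window boundaries as inclusive is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone

MAX_PAST = timedelta(weeks=2)    # up to two weeks in the past
MAX_FUTURE = timedelta(hours=2)  # up to two hours into the future


def to_utc_stamp(moment):
    """Format an aware datetime as a UTC dateTime string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def in_accepted_window(moment, now=None):
    """Check whether a time stamp falls inside the window described above."""
    now = now or datetime.now(timezone.utc)
    return now - MAX_PAST <= moment <= now + MAX_FUTURE


# A local time one hour ahead of UTC converts to the example in the text.
local = datetime(2013, 2, 1, 0, 59, 59, tzinfo=timezone(timedelta(hours=1)))
print(to_utc_stamp(local))  # 2013-01-31T23:59:59Z
```

Converting to UTC before publishing keeps the data points aligned with the times that CloudWatch returns and that alarms evaluate.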

Units

Units represent your statistic's unit of measure. For example, the units for the Amazon EC2 NetworkIn metric are Bytes because NetworkIn tracks the number of bytes that an instance receives on all network interfaces.

You can also specify a unit when you create a custom metric. Units help provide conceptual meaning to your data. Metric data points that specify a unit of measure, such as Percent, are aggregated separately. The following list provides some of the more common units that CloudWatch supports:

  • Seconds

  • Bytes

  • Bits

  • Percent

  • Count

  • Bytes/Second (bytes per second)

  • Bits/Second (bits per second)

  • Count/Second (counts per second)

  • None (default when no unit is specified)

For a complete list of the units that CloudWatch supports, see the MetricDatum data type in the Amazon CloudWatch API Reference.

Though CloudWatch attaches no significance to a unit internally, other applications can derive semantic information based on the unit you choose. When you publish data without specifying a unit, CloudWatch associates it with the None unit. When you get statistics without specifying a unit, CloudWatch aggregates all data points of the same unit together. If you have two otherwise identical metrics with different units, two separate data streams will be returned, one for each unit.
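To illustrate how units split otherwise identical data into separate streams, here is a minimal in-memory sketch. The function name and the sample values are hypothetical; the only behavior taken from the text is that a missing unit is recorded as None and that each unit is aggregated separately.

```python
from collections import defaultdict


def streams_by_unit(data_points):
    """Group (value, unit) pairs into one stream per unit."""
    streams = defaultdict(list)
    for value, unit in data_points:
        # Data published without a unit is associated with the None unit.
        streams[unit or "None"].append(value)
    return dict(streams)


points = [(12.0, "Percent"), (1500, "Bytes"), (14.5, "Percent"), (3, None)]
print(streams_by_unit(points))
```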

Statistics

Statistics are metric data aggregations over specified periods of time. CloudWatch provides statistics based on the metric data points provided by your custom data or provided by other services in AWS to CloudWatch. Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the time period you specify. The following table describes the available statistics.

Statistic: Description

  • Minimum: The lowest value observed during the specified period. You can use this value to determine low volumes of activity for your application.

  • Maximum: The highest value observed during the specified period. You can use this value to determine high volumes of activity for your application.

  • Sum: All values submitted for the matching metric added together. This statistic can be useful for determining the total volume of a metric.

  • Average: The value of Sum / SampleCount during the specified period. By comparing this statistic with the Minimum and Maximum, you can determine the full scope of a metric and how close the average use is to the Minimum and Maximum. This comparison helps you to know when to increase or decrease your resources as needed.

  • SampleCount: The count (number) of data points used for the statistical calculation.

You use the GetMetricStatistics API action or the get-metric-statistics command to retrieve statistics, specifying the same values that you used for the namespace, metric name, and dimension parameters when the metric values were created. You also specify the start and end times that CloudWatch will use for the aggregation. The starting and ending points can be as close together as 60 seconds, and as far apart as two weeks.

Amazon CloudWatch allows you to add pre-calculated statistics using the PutMetricData API action (or the put-metric-data command) with the StatisticValues (statistic-values) parameter. Instead of data point values, you specify values for SampleCount, Minimum, Maximum, and Sum (CloudWatch calculates the average for you). The values you add in this way are aggregated with any other values associated with the matching metric.
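A minimal sketch of how raw values and a pre-calculated statistic set could combine is shown below. The field names mirror SampleCount, Sum, Minimum, and Maximum; the helper names and sample numbers are hypothetical, and the merge logic is an illustration of the idea rather than CloudWatch's internal implementation.

```python
def summarize(values):
    """Build a statistic set from raw data point values."""
    return {
        "SampleCount": len(values),
        "Sum": sum(values),
        "Minimum": min(values),
        "Maximum": max(values),
    }


def merge(a, b):
    """Combine two statistic sets that belong to the same metric and period."""
    return {
        "SampleCount": a["SampleCount"] + b["SampleCount"],
        "Sum": a["Sum"] + b["Sum"],
        "Minimum": min(a["Minimum"], b["Minimum"]),
        "Maximum": max(a["Maximum"], b["Maximum"]),
    }


def average(stats):
    """Average is derived, which is why it is not supplied directly."""
    return stats["Sum"] / stats["SampleCount"]


raw = summarize([10, 20, 30])
precalculated = {"SampleCount": 2, "Sum": 100, "Minimum": 40, "Maximum": 60}
combined = merge(raw, precalculated)
print(combined, average(combined))
```

Because Average is always Sum divided by SampleCount, the four supplied values are enough to keep every statistic in the table consistent after merging.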

Periods

A period is the length of time associated with a specific Amazon CloudWatch statistic. Each statistic represents an aggregation of the metrics data collected for a specified period of time. Although periods are expressed in seconds, the minimum granularity for a period is one minute. Accordingly, you specify period values as multiples of 60. For example, to specify a period of six minutes, you would use the value 360. You can adjust how the data is aggregated by varying the length of the period. A period can be as short as one minute (60 seconds) or as long as one day (86,400 seconds).

When you call GetMetricStatistics, you can specify the period length with the Period parameter. Two related parameters, StartTime and EndTime, determine the overall length of time associated with the statistics. The default value for the Period parameter is 60 seconds, whereas the default values for StartTime and EndTime give you the last hour's worth of statistics.

The values you select for the StartTime and EndTime parameters determine how many periods GetMetricStatistics will return. For example, calling GetMetricStatistics with the default values for the Period, EndTime, and StartTime parameters returns an aggregated set of statistics for each minute of the previous hour. If you prefer statistics aggregated into ten-minute blocks, set Period to 600. For statistics aggregated over the entire hour, use a Period value of 3600.
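The arithmetic behind these examples can be sketched as follows. The helper names are hypothetical, and the count is best read as an upper bound: this sketch assumes every period in the range contains data.

```python
from datetime import datetime, timedelta, timezone


def validate_period(period):
    """Accept only multiples of 60 between one minute and one day."""
    if period % 60 != 0 or not 60 <= period <= 86400:
        raise ValueError(f"invalid period: {period}")
    return period


def period_count(start, end, period=60):
    """Number of periods needed to cover the span from start to end."""
    validate_period(period)
    seconds = int((end - start).total_seconds())
    return -(-seconds // period)  # ceiling division


end = datetime(2013, 1, 31, 12, 0, tzinfo=timezone.utc)
start = end - timedelta(hours=1)
for period in (60, 600, 3600):
    print(period, period_count(start, end, period))
```

For a one-hour range this gives 60 one-minute periods, 6 ten-minute periods, or a single one-hour period, matching the examples above.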

Periods are also an important part of the CloudWatch alarms feature. When you create an alarm to monitor a specific metric, you are asking CloudWatch to compare that metric to the threshold value that you supplied. You have extensive control over how CloudWatch makes that comparison. Not only can you specify the period over which the comparison is made, but you can also specify how many evaluation periods are used to arrive at a conclusion. For example, if you specify three evaluation periods, CloudWatch compares a window of three data points. CloudWatch only notifies you if the oldest data point is breaching and the others are breaching or missing. For metrics that are continuously emitted, CloudWatch won't notify you until three failures are found. For more information about alarms, see Alarms.
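The evaluation rule described above can be sketched as a small function. This is a simplified model under stated assumptions: the function name is hypothetical, a missing data point is represented as None, and "breaching" is taken to mean strictly greater than the threshold, although real alarms support other comparisons.

```python
def should_notify(window, threshold):
    """Apply the rule above to a window of data points, oldest first.

    None marks a missing data point. Breaching means value > threshold,
    which is an assumption of this sketch.
    """
    if not window:
        return False
    oldest = window[0]
    if oldest is None or oldest <= threshold:
        return False  # the oldest data point must be breaching
    # Every later data point must be breaching or missing.
    return all(v is None or v > threshold for v in window[1:])


print(should_notify([91, 95, 99], 90))    # all three breach
print(should_notify([91, None, 95], 90))  # middle data point missing
print(should_notify([80, 95, 99], 90))    # oldest does not breach
```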

Aggregation

Amazon CloudWatch aggregates statistics according to the period length that you specify in calls to GetMetricStatistics. You can publish as many data points as you want with the same or similar time stamps. CloudWatch aggregates them by period length when you get statistics about those data points with GetMetricStatistics. Aggregated statistics are only available when using detailed monitoring. In addition, Amazon CloudWatch does not aggregate data across regions.

You can publish data points for a metric that share not only the same time stamp, but also the same namespace and dimensions. Subsequent calls to GetMetricStatistics return aggregated statistics about those data points. You can even do this in one PutMetricData request. CloudWatch accepts multiple data points in the same PutMetricData call with the same time stamp. You can also publish multiple data points for the same or different metrics, with any time stamp. The size of a PutMetricData request, however, is limited to 8KB for HTTP GET requests and 40KB for HTTP POST requests. You can include a maximum of 20 data points in one PutMetricData request.
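One way to stay within the 20-data-point limit is to split a larger list into batches before sending, as in this minimal sketch. The metric name PageViews, the namespace MyApp, and the helper name are hypothetical, and the sketch does not check the 8KB or 40KB request size limits.

```python
def batches(data_points, size=20):
    """Yield consecutive slices of at most `size` data points."""
    for start in range(0, len(data_points), size):
        yield data_points[start:start + size]


# 45 hypothetical data points for one custom metric.
data = [{"MetricName": "PageViews", "Value": n, "Unit": "Count"} for n in range(45)]

# One payload per PutMetricData request; nothing is sent in this sketch.
payloads = [{"Namespace": "MyApp", "MetricData": batch} for batch in batches(data)]
print([len(p["MetricData"]) for p in payloads])
```

Each payload has the shape of a PutMetricData request, so it could be handed to an SDK client or serialized for the command line tool.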

For large data sets that would make the use of PutMetricData impractical, CloudWatch allows for the insertion of a pre-aggregated data set called a StatisticSet. With StatisticSets you give CloudWatch the Min, Max, Sum, and SampleCount of a number of data points. StatisticSets are commonly used when you need to collect data many times in a minute. For example, let’s say you have a metric for the request latency of a web page. It doesn’t make sense to do a PutMetricData request with every web page hit. We suggest you collect the latency of all hits to that web page, aggregate them together once a minute, and send that StatisticSet to CloudWatch.
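The once-a-minute aggregation suggested above might look like the following sketch. The metric name PageLatency, the helper name, and the sample hits are hypothetical; the StatisticValues field names follow the MetricDatum structure, and nothing is sent to CloudWatch.

```python
from collections import defaultdict
from datetime import datetime, timezone


def minute_statistic_sets(hits, metric_name="PageLatency"):
    """Turn (epoch_seconds, latency_ms) pairs into one data point per minute."""
    by_minute = defaultdict(list)
    for ts, latency in hits:
        by_minute[ts - ts % 60].append(latency)  # truncate to the minute
    return [
        {
            "MetricName": metric_name,
            "Timestamp": datetime.fromtimestamp(minute, tz=timezone.utc),
            "Unit": "Milliseconds",
            "StatisticValues": {
                "SampleCount": len(values),
                "Sum": sum(values),
                "Minimum": min(values),
                "Maximum": max(values),
            },
        }
        for minute, values in sorted(by_minute.items())
    ]


base = 1_700_000_040  # an arbitrary epoch second that starts a minute
hits = [(base, 120), (base + 5, 80), (base + 59, 100), (base + 60, 200)]
for datum in minute_statistic_sets(hits):
    print(datum["Timestamp"].isoformat(), datum["StatisticValues"])
```

Four page hits become two data points, one per minute, which is the reduction in request volume that the paragraph above recommends.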

Amazon CloudWatch doesn't differentiate the source of a metric. If you publish a metric with the same namespace and dimensions from different sources, CloudWatch treats this as a single metric. This can be useful for service metrics in a distributed, scaled system. For example, all the hosts in a web server application could publish identical metrics representing the latency of requests they are processing. CloudWatch treats these as a single metric, allowing you to get the statistics for minimum, maximum, average, and sum of all requests across your application.

Alarms

Alarms can automatically initiate actions on your behalf, based on parameters you specify. An alarm watches a single metric over a specified time period, and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods. The action is a notification sent to an Amazon Simple Notification Service (Amazon SNS) topic or Auto Scaling policy. Alarms invoke actions for sustained state changes only. CloudWatch alarms will not invoke actions simply because they are in a particular state; the state must have changed and been maintained for a specified number of periods. Alarm actions must reside in the same region as the alarm. For example, any Amazon SNS message or Auto Scaling policy invoked by an alarm must exist in the same region as the alarm and the resource being monitored.

When creating an alarm, select a period that is greater than or equal to the frequency of the metric to be monitored. For example, basic monitoring for Amazon EC2 instances provides metrics every 5 minutes. When setting an alarm on a basic monitoring metric, select a period of at least 300 seconds (5 minutes). Detailed monitoring for Amazon EC2 instances provides metrics every 1 minute; when setting an alarm on a detailed monitoring metric, select a period of at least 60 seconds (1 minute). Alarms exist only in the region in which they are created. Alarm history is available for the last 14 days.

For examples that show you how to set up CloudWatch alarms that invoke an Auto Scaling policy and an Amazon SNS topic, see Creating Amazon CloudWatch Alarms.

Regions

Amazon cloud computing resources are housed in highly available data center facilities. To provide additional scalability and reliability, each data center facility is located in a specific geographical area, known as a region. Regions are large and widely dispersed geographic locations.

Each Amazon region is designed to be completely isolated from the other Amazon regions. This achieves the greatest possible failure isolation and stability, and it makes the locality of each Amazon resource unambiguous. Amazon CloudWatch does not aggregate data across regions. Therefore, metrics are completely separate between regions.

For more information about the endpoints that represent each region, see Regions and Endpoints in the Amazon Web Services General Reference.