Using CloudWatch Metrics to Monitor Elastic Inference - Amazon Elastic Inference

Using CloudWatch Metrics to Monitor Elastic Inference

You can monitor your Elastic Inference accelerators using Amazon CloudWatch, which collects metrics about your usage and performance. Amazon CloudWatch records these statistics for a period of two weeks. You can access historical information and gain a better perspective on how your service is performing.

By default, Elastic Inference sends metric data to CloudWatch in 5-minute periods.

For more information, see the Amazon CloudWatch User Guide.

Note

Amazon CloudWatch metrics are only emitted when your Elastic Inference accelerator is attached to an Amazon EC2 instance.

Elastic Inference Metrics and Dimensions

The client instance connects to one or more Elastic Inference accelerators through a PrivateLink endpoint. The client instance then inspects the input model’s operators. If there are any operators that cannot run on the Elastic Inference accelerator, the client code partitions the execution graph. Only subgraphs with supported operators are loaded and run on the accelerator. The rest of the subgraphs run on the client instance. In the case of graph partitioning, each inference call on the client instance can result in multiple inference requests on an accelerator. This happens because evaluating each subgraph on the accelerator requires a separate inference call. Some CloudWatch metrics collected on the accelerator give you subgraph metrics and are called out accordingly.
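The relationship between client-side calls and accelerator-side counts can be sketched as follows. This is a hypothetical illustration (the function name and inputs are not part of any Elastic Inference API): with graph partitioning, each client inference call issues one request per accelerator-resident subgraph.

```python
def expected_accelerator_requests(client_calls_per_minute: int,
                                  accelerator_subgraphs: int) -> int:
    """Accelerator inference requests expected in one minute.

    client_calls_per_minute: inference calls made on the client instance.
    accelerator_subgraphs: subgraphs of the model that run on the
        accelerator (1 if the whole graph is supported; more than 1 if
        the execution graph was partitioned).
    """
    # Each subgraph evaluated on the accelerator is a separate inference
    # call, so the accelerator-side count is a multiple of the client-side
    # count.
    return client_calls_per_minute * accelerator_subgraphs

# A model split into 3 accelerator subgraphs, called 100 times per minute
# on the client, produces roughly 300 requests in the accelerator's
# per-minute inference counts.
```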

Metrics are grouped first by the service namespace, then by the various dimension combinations within each namespace. You can use the following procedures to view the metrics for Elastic Inference.

To view metrics using the CloudWatch console
  1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

  2. If necessary, change the Region. From the navigation bar, select the Region where your Elastic Inference accelerator resides. For more information, see Regions and Endpoints.

  3. In the navigation pane, choose Metrics.

  4. Under All metrics, select a metrics category, and then scroll down to view the full list of metrics.

To view metrics (AWS CLI)
  • At a command prompt, enter the following command:

    aws cloudwatch list-metrics --namespace "AWS/ElasticInference"
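The same query can be made from Python with boto3 (a sketch; it assumes boto3 is installed and AWS credentials are configured). The request parameters are built in a separate helper so they can be inspected without making a call:

```python
def list_metrics_params(namespace: str = "AWS/ElasticInference") -> dict:
    """Parameters for the CloudWatch ListMetrics API, scoped to the
    Elastic Inference namespace."""
    return {"Namespace": namespace}

def list_elastic_inference_metrics() -> list:
    """Call CloudWatch ListMetrics (requires boto3 and AWS credentials)."""
    import boto3
    cloudwatch = boto3.client("cloudwatch")
    return cloudwatch.list_metrics(**list_metrics_params())["Metrics"]
```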

CloudWatch displays the following metrics for Elastic Inference.


AcceleratorHealthCheckFailed

Reports whether the Elastic Inference accelerator has passed a status health check in the last minute. A value of zero (0) indicates that the status check passed. A value of one (1) indicates a status check failure.

Units: Count

ConnectivityCheckFailed

Reports whether connectivity to the Elastic Inference accelerator is active or has failed in the last minute. A value of zero (0) indicates that a connection from the client instance was received in the last minute. A value of one (1) indicates that no connection was received from the client instance in the last minute.

Units: Count

AcceleratorMemoryUsage

The memory of the Elastic Inference accelerator used in the last minute.

Units: Bytes

AcceleratorUtilization

The percentage of the Elastic Inference accelerator used for computation in the last minute.

Units: Percent

AcceleratorTotalInferenceCount

The number of inference requests reaching the Elastic Inference accelerator in the last minute. The requests represent the total number of separate calls on all subgraphs on the Elastic Inference accelerator.

Units: Count

AcceleratorSuccessfulInferenceCount

The number of successful inference requests reaching the Elastic Inference accelerator in the last minute. The requests represent the total number of separate calls on all subgraphs on the Elastic Inference accelerator.

Units: Count

AcceleratorInferenceWithClientErrorCount

The number of inference requests reaching the Elastic Inference accelerator in the last minute that resulted in a 4xx error. The requests represent the total number of separate calls on all subgraphs on the Elastic Inference accelerator.

Units: Count

AcceleratorInferenceWithServerErrorCount

The number of inference requests reaching the Elastic Inference accelerator in the last minute that resulted in a 5xx error. The requests represent the total number of separate calls on all subgraphs on the Elastic Inference accelerator.

Units: Count
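The four inference-count metrics fit together: assuming the total is the sum of the successful, 4xx, and 5xx counts (a reasonable reading of the descriptions above, not a documented guarantee), an error rate can be derived from one minute's datapoints. A minimal sketch:

```python
def inference_error_rate(successful: int, client_errors: int,
                         server_errors: int) -> float:
    """Fraction of accelerator inference requests that failed.

    The arguments correspond to AcceleratorSuccessfulInferenceCount,
    AcceleratorInferenceWithClientErrorCount, and
    AcceleratorInferenceWithServerErrorCount for the same minute.
    """
    # Assumption: these three counts partition AcceleratorTotalInferenceCount.
    total = successful + client_errors + server_errors
    if total == 0:
        return 0.0
    return (client_errors + server_errors) / total

# 95 successes, 3 client errors, 2 server errors -> 0.05 (5% error rate)
```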

You can filter the Elastic Inference data using the following dimensions.


ElasticInferenceAcceleratorId

This dimension filters the data by the Elastic Inference accelerator.

InstanceId

This dimension filters the data by the instance to which the Elastic Inference accelerator is attached.
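These dimensions can be used to scope a GetMetricStatistics request to a single accelerator. A sketch of the request parameters (the accelerator and instance IDs are placeholders; pass the result to boto3's `get_metric_statistics`):

```python
from datetime import datetime, timedelta, timezone

def utilization_stats_params(accelerator_id: str, instance_id: str) -> dict:
    """Parameters for average AcceleratorUtilization over the last hour,
    in 5-minute periods (the default reporting interval)."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ElasticInference",
        "MetricName": "AcceleratorUtilization",
        "Dimensions": [
            {"Name": "ElasticInferenceAcceleratorId", "Value": accelerator_id},
            {"Name": "InstanceId", "Value": instance_id},
        ],
        "StartTime": now - timedelta(hours=1),
        "EndTime": now,
        "Period": 300,          # 5-minute periods
        "Statistics": ["Average"],
        "Unit": "Percent",
    }

# Usage (requires boto3 and credentials):
#   boto3.client("cloudwatch").get_metric_statistics(
#       **utilization_stats_params("eia-0123456789abcdef0", "i-0123456789abcdef0"))
```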

Creating CloudWatch Alarms to Monitor Elastic Inference

You can create a CloudWatch alarm that sends an Amazon SNS message when the alarm changes state. An alarm watches a single metric over a time period that you specify and sends a notification to an SNS topic when the metric breaches a given threshold for a specified number of consecutive periods.

For example, you can create an alarm that monitors the health of an Elastic Inference accelerator. It sends a notification when the Elastic Inference accelerator fails a status health check for three consecutive 5-minute periods.

To create an alarm for Elastic Inference accelerator health status
  1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

  2. In the navigation pane, choose Alarms, Create Alarm.

  3. Choose Amazon EI Metrics.

  4. Select the Elastic Inference accelerator and the AcceleratorHealthCheckFailed metric, and then choose Next.

  5. Configure the alarm as follows, and then choose Create Alarm:

    • Under Alarm Threshold, enter a name and description. For Whenever, choose >= and enter 1. For the consecutive periods, enter 3.

    • Under Actions, select an existing notification list or choose New list.

    • Under Alarm Preview, select a period of 5 minutes.
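The console steps above can also be expressed as a PutMetricAlarm request. A sketch (the alarm name, accelerator ID, and SNS topic ARN are placeholders; pass the result to boto3's `put_metric_alarm`):

```python
def health_alarm_params(accelerator_id: str, sns_topic_arn: str) -> dict:
    """Alarm when AcceleratorHealthCheckFailed is >= 1 for three
    consecutive 5-minute periods, matching the console walkthrough."""
    return {
        "AlarmName": "ei-accelerator-health",  # placeholder name
        "AlarmDescription": "Elastic Inference accelerator failed health checks",
        "Namespace": "AWS/ElasticInference",
        "MetricName": "AcceleratorHealthCheckFailed",
        "Dimensions": [
            {"Name": "ElasticInferenceAcceleratorId", "Value": accelerator_id},
        ],
        "Statistic": "Maximum",
        "Period": 300,                  # 5-minute periods
        "EvaluationPeriods": 3,         # three consecutive periods
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }

# Usage (requires boto3 and credentials):
#   boto3.client("cloudwatch").put_metric_alarm(
#       **health_alarm_params("eia-0123456789abcdef0",
#                             "arn:aws:sns:us-east-1:123456789012:my-topic"))
```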