Amazon Kinesis Data Analytics
Developer Guide

Viewing Amazon Kinesis Data Analytics Metrics and Dimensions

When your Kinesis Data Analytics for Java application processes a data source, Kinesis Data Analytics reports the following metrics and dimensions to Amazon CloudWatch.

Application Metrics

| Metric | Unit | Description | Level | Usage Notes |
| downtime | Milliseconds | For jobs currently in a failing or recovering state, the time elapsed during the outage. | Application | You can use this metric to determine whether a job has failed to run. This metric returns 0 for running jobs and -1 for completed jobs. A value greater than 0 indicates an issue with the application. |
| lastCheckpointDuration | Milliseconds | The time it took to complete the last checkpoint. | Application | You can use this metric to determine whether the service is taking too long to checkpoint. In some cases, you can troubleshoot this issue by disabling checkpointing. |
| lastCheckpointSize | Bytes | The total size of the last checkpoint. | Application | You can use this metric to determine running application storage utilization. Determine the application's storage utilization as follows: (<lastCheckpointSize> + <application's disk usage>) / (<Number of KPUs> * 50) |
| numRecordsIn | Count | The total number of records this operator or task has received. | Task | |
| numRecordsInPerSecond | Count/Second | The total number of records this operator or task has received per second. | Task, Operator | |
| numRecordsOut | Count | The total number of records this operator or task has emitted. | Task, Operator | You can use this metric to determine the total data sent by the task over time. |
| numRecordsOutPerSecond | Count/Second | The total number of records this operator or task has emitted per second. | Task, Operator | |
| numLateRecordsDropped | Count | The number of records this operator or task has dropped due to arriving late. | Task, Operator | |
| currentInputWatermark | Milliseconds | The last watermark this operator or task has received. | Task, Operator | This metric is emitted only for operators or tasks with two inputs. This is the minimum value of the last received watermarks. |
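
The storage utilization formula given for lastCheckpointSize can be sketched in Python as follows. This is a minimal illustration, not service code. The function name and parameters are hypothetical, and the sketch assumes that the constant 50 in the formula is the running application storage per KPU in GB, so both sizes are converted from bytes to GB (taken here as 2^30 bytes) before dividing.

```python
BYTES_PER_GB = 1024 ** 3   # assumption: 1 GB is treated as 2^30 bytes
STORAGE_GB_PER_KPU = 50    # the constant 50 from the formula, assumed to be GB per KPU


def storage_utilization(last_checkpoint_size_bytes, disk_usage_bytes, num_kpus):
    """Return storage utilization as a fraction (1.0 means all storage is used)."""
    if num_kpus <= 0:
        raise ValueError("num_kpus must be a positive number")
    used_gb = (last_checkpoint_size_bytes + disk_usage_bytes) / BYTES_PER_GB
    return used_gb / (num_kpus * STORAGE_GB_PER_KPU)


# Example: a 10 GB checkpoint plus 15 GB of disk usage on 2 KPUs.
ratio = storage_utilization(10 * BYTES_PER_GB, 15 * BYTES_PER_GB, 2)
print(f"{ratio:.0%}")  # prints "25%"
```

If your disk usage figure is already in GB, skip the conversion for that term so the units stay consistent.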

Kinesis Data Streams Connector Metrics

AWS emits all metrics for Kinesis Data Streams in addition to the following:

| Metric | Unit | Description | Level | Usage Notes |
| millisBehindLatest | Milliseconds | The number of milliseconds the consumer is behind the head of the stream, indicating how far behind current time the consumer is. | Stream, ShardId | A value of 0 indicates that record processing is caught up, and there are no new records to process at this moment. A value of -1 indicates that the service has not yet reported a value for the metric. A particular shard's metric can be specified by stream name and shard ID. |
| bytesRequestedPerFetch | Bytes | The bytes requested in a single call to getRecords. | Stream, ShardId | |
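
The special values of millisBehindLatest are easy to misread in dashboards or alarms, so the following Python sketch shows one way to interpret a data point. The function name and the status strings are hypothetical. Only the meanings of 0 and -1 come from the table above.

```python
def describe_millis_behind_latest(value):
    """Map a millisBehindLatest data point to a short status string."""
    if value == -1:
        return "not reported yet"   # the service has not reported a value
    if value == 0:
        return "caught up"          # no new records to process right now
    if value > 0:
        return f"behind by {value} ms"
    raise ValueError(f"unexpected millisBehindLatest value: {value}")


for sample in (-1, 0, 1500):
    print(sample, "->", describe_millis_behind_latest(sample))
```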

Viewing CloudWatch Metrics

You can view CloudWatch metrics for your application using the Amazon CloudWatch console or the AWS CLI.

To view metrics using the CloudWatch console

  1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

  2. In the navigation pane, choose Metrics.

  3. In the CloudWatch Metrics by Category pane for Amazon Kinesis Data Analytics, choose a metrics category.

  4. In the upper pane, scroll to view the full list of metrics.

To view metrics using the AWS CLI

  • At a command prompt, run the following command, replacing region with your AWS Region.

    aws cloudwatch list-metrics --namespace "AWS/KinesisAnalytics" --region region
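
The same listing can be done from Python. The following sketch assumes that you use boto3 (the AWS SDK for Python) and have credentials configured. The function name is hypothetical, and the client is passed in as a parameter so that the function can be exercised without calling AWS.

```python
NAMESPACE = "AWS/KinesisAnalytics"  # namespace taken from the CLI command above


def list_metric_names(cloudwatch_client, namespace=NAMESPACE):
    """Return the sorted, de-duplicated metric names that CloudWatch lists for a namespace."""
    names = set()
    paginator = cloudwatch_client.get_paginator("list_metrics")
    for page in paginator.paginate(Namespace=namespace):
        for metric in page.get("Metrics", []):
            names.add(metric["MetricName"])
    return sorted(names)


# Usage (requires boto3 and AWS credentials; the Region below is a placeholder):
#   import boto3
#   client = boto3.client("cloudwatch", region_name="us-east-1")
#   print(list_metric_names(client))
```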

Setting CloudWatch Metrics Reporting Levels

You can control the level of application metrics that your application creates. Kinesis Data Analytics for Java Applications supports the following metrics levels:

  • Application: The application only reports the highest level of metrics for each application.

  • Task: The application reports task-specific throughput metrics, such as number of records in and out of the application per second.

  • Operator: The application reports operator-level metrics, such as metrics for each filter or map operation.

  • Parallelism: The application reports Task and Operator level metrics for each execution thread.

The default level is Application. The application reports metrics at the current level and all higher levels. For example, if the reporting level is set to Operator, the application reports Application, Task, and Operator metrics.
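
The cumulative behavior described above can be sketched in a few lines of Python. This is an illustration only. The list and function names are hypothetical, and the ordering assumes that the levels are listed above from highest (Application) to lowest (Parallelism).

```python
# Levels ordered from highest (least detailed) to lowest (most detailed).
METRICS_LEVELS = ["Application", "Task", "Operator", "Parallelism"]


def reported_levels(configured_level):
    """Return the configured level together with every higher level."""
    index = METRICS_LEVELS.index(configured_level)  # raises ValueError for unknown levels
    return METRICS_LEVELS[: index + 1]


print(reported_levels("Operator"))  # prints ['Application', 'Task', 'Operator']
```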

You set the CloudWatch metrics reporting level using the MonitoringConfiguration parameter of the CreateApplication action, or the MonitoringConfigurationUpdate parameter of the UpdateApplication action. The following example request for the UpdateApplication action sets the CloudWatch metrics reporting level to Task:

    {
        "ApplicationName": "MyApplication",
        "CurrentApplicationVersionId": 4,
        "ApplicationConfigurationUpdate": {
            "FlinkApplicationConfigurationUpdate": {
                "MonitoringConfigurationUpdate": {
                    "ConfigurationTypeUpdate": "CUSTOM",
                    "MetricsLevelUpdate": "TASK"
                }
            }
        }
    }
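
If you generate this request programmatically, a small helper can keep the nesting consistent. The following Python sketch builds the same structure as the example request above. The function name is hypothetical, and the set of accepted values assumes that the API uses the upper-case forms of the four levels listed earlier (only TASK appears in the example).

```python
import json

# Assumption: the API values are the upper-case forms of the four documented levels.
VALID_METRICS_LEVELS = {"APPLICATION", "TASK", "OPERATOR", "PARALLELISM"}


def build_metrics_level_update(application_name, current_version_id, metrics_level):
    """Build an UpdateApplication request body that changes only the metrics level."""
    level = metrics_level.upper()
    if level not in VALID_METRICS_LEVELS:
        raise ValueError(f"unsupported metrics level: {metrics_level!r}")
    return {
        "ApplicationName": application_name,
        "CurrentApplicationVersionId": current_version_id,
        "ApplicationConfigurationUpdate": {
            "FlinkApplicationConfigurationUpdate": {
                "MonitoringConfigurationUpdate": {
                    "ConfigurationTypeUpdate": "CUSTOM",
                    "MetricsLevelUpdate": level,
                }
            }
        },
    }


request = build_metrics_level_update("MyApplication", 4, "Task")
print(json.dumps(request, indent=4))
```

The printed JSON can be saved to a file or sent with the SDK of your choice. Check the SDK documentation for the exact call before relying on it.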

You can also configure the logging level using the LogLevel parameter of the CreateApplication action or the LogLevelUpdate parameter of the UpdateApplication action. You can use the following log levels:

  • ERROR: Logs potentially recoverable error events.

  • WARN: Logs warning events that might lead to an error.

  • INFO: Logs informational events.

  • DEBUG: Logs general debugging events.

For more information about Log4j logging levels, see Level in the Apache Log4j documentation.
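
In Log4j, a configured level acts as a threshold: events at that level and at every more severe level are logged. The following Python sketch illustrates that rule for the four levels listed above. It assumes that Kinesis Data Analytics applies the configured level the same way Log4j does, and the names used are hypothetical.

```python
# Levels ordered from most severe to least severe, following Log4j's ordering.
LOG_LEVELS = ["ERROR", "WARN", "INFO", "DEBUG"]


def is_logged(event_level, configured_level):
    """Return True if an event at event_level is written at the configured level."""
    return LOG_LEVELS.index(event_level) <= LOG_LEVELS.index(configured_level)


print(is_logged("WARN", "INFO"))   # prints True
print(is_logged("DEBUG", "INFO"))  # prints False
```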