Monitor the Amazon Kinesis Video Streams Edge Agent with CloudWatch
You can monitor the Amazon Kinesis Video Streams Edge Agent using Amazon CloudWatch, which collects and processes raw data into readable, near real-time metrics. These statistics are recorded for a period of 15 months. With this historical information, you can gain a better perspective on how your web application or Amazon Kinesis Video Streams Edge Agent service is performing.
To view the metrics, do the following:
Sign in to the AWS Management Console and open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
In the left navigation, under Metrics, select All Metrics.
Choose the Browse tab, then select the EdgeRuntimeAgent custom namespace.
Amazon Kinesis Video Streams Edge Agent publishes the following metrics under the EdgeRuntimeAgent namespace:
Dimensions | State | Description |
---|---|---|
Stream name, … | Running | Publishes continuously when the … Units: None. "1" is published for as long as … |
Stream name, … | FatalError | Publishes if a … Units: None. "1" is published once, when this event occurs. Note: See logs for additional information. |
Stream name, … | Completed | Publishes when a … Units: None. "1" is published once, when this event occurs. |
Stream name, … | Running | Publishes continuously when the … Units: None. "1" is published for as long as … |
Stream name, … | FatalError | Publishes if the … Units: None. "1" is published once, when this event occurs. Note: See logs for additional information. |
Stream name, … | Completed | Publishes when the … Units: None. "1" is published once, when this event occurs. |
Stream name | PercentageSpaceUsed | This is the percentage used out of the total space allocated in Amazon Kinesis Video Streams Edge Agent configurations for recording media. See LocalSizeConfig for more information. Units: Percentage (scale 0–1). |
Thing name | Alive | Publishes every minute from the Amazon Kinesis Video Streams Edge Agent, regardless of any configurations running on it. This can be used to understand if the Amazon Kinesis Video Streams Edge Agent is alive and ready to accept configurations. Units: None. "1" is published every minute. |
Thing name | RecordJobs.HealthyJobCount | Total count of running and scheduled record jobs on Amazon Kinesis Video Streams Edge Agent. Units: Count. |
Thing name | UploadJobs.HealthyJobCount | Total count of running and scheduled upload jobs on Amazon Kinesis Video Streams Edge Agent. Units: Count. |
Thing name | RecordJobs.UnhealthyJobCount | Total count of currently errored record jobs. Units: Count. |
Thing name | UploadJobs.UnhealthyJobCount | Total count of currently errored upload jobs. Units: Count. |
Thing name | RecordJobs.RunningJobCount | Total count of actively running record jobs. Units: Count. |
Thing name | UploadJobs.RunningJobCount | Total count of actively running upload jobs. Units: Count. |
Thing name | RecordJobs.EdgeConfigCount | Total count of record configurations in process on Amazon Kinesis Video Streams Edge Agent. Units: Count. |
Thing name | UploadJobs.EdgeConfigCount | Total count of upload configurations in process on Amazon Kinesis Video Streams Edge Agent. Units: Count. |
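These metrics can also be read programmatically instead of through the console. The following is a minimal sketch that builds the arguments for a CloudWatch GetMetricStatistics request against the EdgeRuntimeAgent namespace. The helper name `build_metric_query` and the dimension key `ThingName` are assumptions made for illustration; confirm the exact dimension names shown in the CloudWatch console before relying on them.

```python
from datetime import datetime, timedelta, timezone

NAMESPACE = "EdgeRuntimeAgent"  # custom namespace the agent publishes to


def build_metric_query(metric_name, dimensions, hours=1, period=60, now=None):
    """Build keyword arguments for a CloudWatch GetMetricStatistics request.

    `dimensions` maps dimension names to values. The names used in the
    example below are hypothetical; check the console for the real ones.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": NAMESPACE,
        "MetricName": metric_name,
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,        # seconds per datapoint
        "Statistics": ["Sum"],
    }


# Example: the last hour of the per-minute Alive metric for one device.
query = build_metric_query("Alive", {"ThingName": "my-edge-device"})
print(query["Namespace"], query["MetricName"], query["Period"])
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("cloudwatch").get_metric_statistics(**query)`.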
CloudWatch metrics guidance for Amazon Kinesis Video Streams Edge Agent
CloudWatch metrics can be useful for finding answers to the following questions:
Does the Amazon Kinesis Video Streams Edge Agent have enough space to record?
Relevant metrics:
PercentageSpaceUsed
Action: No action required.
Is the Amazon Kinesis Video Streams Edge Agent alive?
Relevant metrics:
Alive
Action: If you stop receiving this metric at any point, the Amazon Kinesis Video Streams Edge Agent encountered one or more of the following:
- An application runtime issue: memory or other resource constraint, bug, and so on
- The AWS IoT device that the agent is running on shut down, crashed, or was terminated
- The AWS IoT device doesn't have network connectivity
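Because Alive is published every minute, a missing heartbeat shows up as a gap between consecutive datapoint timestamps. The following sketch finds such gaps in a list of timestamps that you have already retrieved. The function name `find_heartbeat_gaps` and the three-minute tolerance are assumptions chosen for illustration, not values defined by the agent.

```python
from datetime import datetime, timedelta, timezone


def find_heartbeat_gaps(timestamps, max_gap=timedelta(minutes=3), now=None):
    """Return (start, end) pairs where Alive datapoints are missing.

    A gap is reported when two consecutive datapoints are more than
    `max_gap` apart. If `now` is given, a trailing gap between the last
    datapoint and `now` is reported as well.
    """
    ordered = sorted(timestamps)
    gaps = [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]
    if now is not None and ordered and now - ordered[-1] > max_gap:
        gaps.append((ordered[-1], now))
    return gaps


base = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
# Datapoints at minutes 0, 1, and 2, then nothing until minutes 10 and 11.
samples = [base + timedelta(minutes=m) for m in (0, 1, 2, 10, 11)]
for start, end in find_heartbeat_gaps(samples):
    print(f"No Alive datapoints between {start:%H:%M} and {end:%H:%M}")
```

A tolerance longer than one minute avoids flagging datapoints that are merely delayed; tune it to how quickly you need to detect an outage.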
Are there any unhealthy jobs?
Relevant metrics:
RecordJobs.UnhealthyJobCount
UploadJobs.UnhealthyJobCount
Action: Inspect the logs and look for the FatalError metric.
- If the FatalError metric is present, a fatal error was encountered and you need to manually restart the job. Inspect the logs and fix the issue before using StartEdgeConfigurationUpdate to manually restart the job.
- If the FatalError metric isn't present, a transient (non-fatal) error was encountered and Amazon Kinesis Video Streams Edge Agent is retrying the job.
Note: To have the agent reattempt a fatally-errored job, use StartEdgeConfigurationUpdate.
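The decision described above can be sketched as a small function. The name `triage_unhealthy_jobs` and the returned labels are hypothetical; the inputs are the UnhealthyJobCount value and whether a FatalError datapoint was observed for the job.

```python
def triage_unhealthy_jobs(unhealthy_count, fatal_error_seen):
    """Map the unhealthy-job metrics to the action this guidance suggests."""
    if unhealthy_count <= 0:
        return "healthy"          # nothing to do
    if fatal_error_seen:
        # Fatal error: inspect the logs, fix the cause, then resend the
        # configuration with StartEdgeConfigurationUpdate.
        return "fix-and-restart"
    # No FatalError published: the error is transient and the agent retries.
    return "wait-for-retry"


print(triage_unhealthy_jobs(unhealthy_count=2, fatal_error_seen=True))
print(triage_unhealthy_jobs(unhealthy_count=1, fatal_error_seen=False))
```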
Do any jobs need external intervention?
Relevant metrics:
- PercentageSpaceUsed – If this exceeds a certain value, the record job is paused and resumes only when space is available (when media goes out of retention). You can send an updated configuration with a higher MaxLocalMediaSizeInMB to update the job immediately.
- RecordJob.FatalError / UploadJob.FatalError – Investigate the agent's logs and send the configuration again for the job to resume.
Action: Make an API call with the configuration to restart jobs that encounter this problem.
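As an illustration of sending an updated configuration with a higher MaxLocalMediaSizeInMB, the sketch below copies an existing edge configuration and raises the local storage limit. It assumes the configuration follows the EdgeConfig layout of the StartEdgeConfigurationUpdate API, with the limit under DeletionConfig and LocalSizeConfig. The helper name `with_larger_local_storage` and the sample values are hypothetical.

```python
import copy


def with_larger_local_storage(edge_config, new_max_mb):
    """Return a copy of `edge_config` with a higher MaxLocalMediaSizeInMB.

    The input dictionary is left unchanged. Raises ValueError if the new
    limit is not larger than the current one.
    """
    updated = copy.deepcopy(edge_config)
    size_config = updated.setdefault("DeletionConfig", {}).setdefault(
        "LocalSizeConfig", {}
    )
    current = size_config.get("MaxLocalMediaSizeInMB")
    if current is not None and new_max_mb <= current:
        raise ValueError(f"new limit {new_max_mb} MB must exceed {current} MB")
    size_config["MaxLocalMediaSizeInMB"] = new_max_mb
    return updated


# Placeholder configuration; a real one also carries recorder settings.
existing = {
    "HubDeviceArn": "arn:aws:iot:region:account-id:thing/example-thing",
    "DeletionConfig": {"LocalSizeConfig": {"MaxLocalMediaSizeInMB": 2000}},
}
updated = with_larger_local_storage(existing, 4000)
print(updated["DeletionConfig"]["LocalSizeConfig"]["MaxLocalMediaSizeInMB"])
```

With boto3 installed, the updated dictionary could then be sent with `boto3.client("kinesisvideo").start_edge_configuration_update(StreamName="your-stream", EdgeConfig=updated)`, where the stream name is a placeholder.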