Troubleshoot alert manager with CloudWatch Logs - Amazon Managed Service for Prometheus

Using the logs described in Monitor Amazon Managed Service for Prometheus events with CloudWatch Logs, you can troubleshoot issues related to the alert manager and the ruler. This section covers troubleshooting topics for the alert manager.
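The log events shown in this section share a common shape: a workspaceId, a message object with log and level fields, and a component field. As a minimal sketch (the function name is hypothetical, and the event shape is assumed from the samples in this section only), events exported from CloudWatch Logs could be filtered like this:

```python
import json

def parse_alertmanager_event(raw):
    """Return (level, log text) for an alert manager event, else None."""
    event = json.loads(raw)
    if event.get("component") != "alertmanager":
        return None  # events from other components are skipped
    message = event.get("message", {})
    return message.get("level"), message.get("log")

# Sample event shaped like the ones in this section.
sample_event = json.dumps({
    "workspaceId": "ws-abcd1234-ef56-78ab-cd90-1234abcd0000",
    "message": {
        "log": "Message has been modified because the content was empty.",
        "level": "WARN",
    },
    "component": "alertmanager",
})

print(parse_alertmanager_event(sample_event))
```

The returned level and log text can then be matched against the warnings and errors described in the following topics.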

Empty content warning

When the log contains the following warning:

{ "workspaceId": "ws-abcd1234-ef56-78ab-cd90-1234abcd0000", "message": { "log": "Message has been modified because the content was empty.", "level": "WARN" }, "component": "alertmanager" }

This means that the alert manager template resolved the outbound alert to an empty message.

Action to take

Validate your alert manager template and ensure that every receiver pathway has a valid template.

Non-ASCII warning

When the log contains the following warning:

{ "workspaceId": "ws-abcd1234-ef56-78ab-cd90-1234abcd0000", "message": { "log": "Subject has been modified because it contains control or non-ASCII characters.", "level": "WARN" }, "component": "alertmanager" }

This means that the subject contained control or non-ASCII characters and was modified before the notification was sent.

Action to take

In the subject field of your template, remove references to labels that might contain control or non-ASCII characters.
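To find which label values are likely to trigger this warning, a check along the following lines may help. This is a rough sketch: the function name is hypothetical, and the rule (anything outside printable ASCII counts) only approximates what the service checks, which this section does not specify.

```python
def has_control_or_non_ascii(text):
    """True if text contains a control character or a non-ASCII character."""
    # Code points below 32 are ASCII control characters; 127 is DEL;
    # anything above 127 is outside the ASCII range.
    return any(ord(ch) < 32 or ord(ch) >= 127 for ch in text)

# Hypothetical label values that a subject template might reference.
labels = {"alertname": "HighCPU", "team": "Équipe-réseau"}
suspect = [name for name, value in labels.items()
           if has_control_or_non_ascii(value)]
print(suspect)
```

Labels reported by such a check are candidates to remove from the subject template.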

Invalid key/value warning

When the log contains the following warning:

{ "workspaceId": "ws-abcd1234-ef56-78ab-cd90-1234abcd0000", "message": { "log": "MessageAttributes has been removed because of invalid key/value, numberOfRemovedAttributes=1", "level": "WARN" }, "component": "alertmanager" }

This means that some of the message attributes were removed because their keys or values were invalid.

Action to take

Re-evaluate the templates that you use to populate the message attributes, and ensure that each one resolves to a valid SNS message attribute. For more information about validating a message to an Amazon SNS topic, see Validating SNS topic.
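As an illustration, resolved attribute keys and values can be pre-checked before they reach Amazon SNS. The sketch below encodes the attribute naming rules as commonly documented for Amazon SNS (letters, digits, underscore, hyphen, and period; no leading, trailing, or consecutive periods; no AWS. or Amazon. prefix; at most 256 characters) plus a non-empty value. The function name is hypothetical, and the rules should be confirmed against the current Amazon SNS documentation.

```python
import re

# Allowed characters and length for an attribute name (assumed rules).
_NAME_RE = re.compile(r"[A-Za-z0-9_.-]{1,256}")

def is_valid_attribute(name, value):
    """Approximate check for an SNS message attribute key/value pair."""
    if _NAME_RE.fullmatch(name) is None:
        return False
    if name.startswith(".") or name.endswith(".") or ".." in name:
        return False
    if name.lower().startswith(("aws.", "amazon.")):
        return False  # reserved prefixes, in any casing
    return bool(value)  # empty values are rejected

print(is_valid_attribute("severity", "critical"))
print(is_valid_attribute("bad key", "critical"))
```

The first call passes the check; the second fails because the key contains a space.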

Message limit warning

When the log contains the following warning:

{ "workspaceId": "ws-abcd1234-ef56-78ab-cd90-1234abcd0000", "message": { "log": "Message has been truncated because it exceeds size limit, originSize=266K, truncatedSize=12K", "level": "WARN" }, "component": "alertmanager" }

This means that the message exceeded the size limit and was truncated.

Action to take

Review the alert receiver message template and rework it so that the resolved message fits within the size limit.
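To judge how much a template needs to shrink, the original and truncated sizes can be pulled out of the log text. This sketch assumes the originSize=...K, truncatedSize=...K format seen in the sample above; the function name is hypothetical, and other unit suffixes are not handled.

```python
import re

# Pattern assumed from the single sample message in this section.
_SIZE_RE = re.compile(r"originSize=(\d+)K, truncatedSize=(\d+)K")

def parse_truncation(log_text):
    """Return (origin_kb, truncated_kb), or None if the text does not match."""
    match = _SIZE_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

log_text = ("Message has been truncated because it exceeds size limit, "
            "originSize=266K, truncatedSize=12K")
print(parse_truncation(log_text))  # (266, 12)
```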

No resource-based policy error

When the log contains the following error:

{ "workspaceId": "ws-abcd1234-ef56-78ab-cd90-1234abcd0000", "message": { "log": "Notify for alerts failed, AMP is not authorized to perform: SNS:Publish on resource: arn:aws:sns:us-west-2:12345:testSnsReceiver because no resource-based policy allows the SNS:Publish action", "level": "ERROR" }, "component": "alertmanager" }

This means that Amazon Managed Service for Prometheus does not have the permissions to submit the alert to the SNS topic specified.

Action to take

Validate that the access policy on your Amazon SNS topic allows Amazon Managed Service for Prometheus to send SNS messages to the topic. If it does not, create an SNS access policy that gives the service principal aps.amazonaws.com (Amazon Managed Service for Prometheus) access to your Amazon SNS topic. For more information about SNS access policies, see Using the Access Policy Language and Example cases for Amazon SNS access control in the Amazon Simple Notification Service Developer Guide.
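The following sketch builds one possible access policy document for the topic. The function name, the Sid, and all ARNs and account IDs are placeholders. The condition keys that restrict access to a single workspace and account, and the sns:GetTopicAttributes action, are assumptions that go beyond the SNS:Publish action named in the error, so check them against the Amazon SNS documentation before use.

```python
import json

def build_sns_access_policy(topic_arn, workspace_arn, account_id):
    """Build a topic policy allowing aps.amazonaws.com to publish (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowAmpPublish",  # illustrative name
            "Effect": "Allow",
            "Principal": {"Service": "aps.amazonaws.com"},
            # sns:Publish is the action named in the error message;
            # sns:GetTopicAttributes is an assumed companion permission.
            "Action": ["sns:Publish", "sns:GetTopicAttributes"],
            "Resource": topic_arn,
            # Assumed conditions limiting access to one workspace/account.
            "Condition": {
                "ArnEquals": {"aws:SourceArn": workspace_arn},
                "StringEquals": {"aws:SourceAccount": account_id},
            },
        }],
    }

policy = build_sns_access_policy(
    "arn:aws:sns:us-west-2:123456789012:testSnsReceiver",  # placeholder
    "arn:aws:aps:us-west-2:123456789012:workspace/"
    "ws-abcd1234-ef56-78ab-cd90-1234abcd0000",             # placeholder
    "123456789012",                                        # placeholder
)
print(json.dumps(policy, indent=2))
```

Setting a topic's access policy replaces the existing one, so in practice a statement like this is usually merged into the policy the topic already has.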

Not authorized to call KMS

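When the log contains the following AWS KMS error:

{ "workspaceId": "ws-abcd1234-ef56-78ab-cd90-1234abcd0000", "message": { "log": "Notify for alerts failed, AMP is not authorized to call KMS", "level": "ERROR" }, "component": "alertmanager" }

This means that Amazon Managed Service for Prometheus is not permitted to use the AWS KMS key that encrypts the Amazon SNS topic.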

Action to take

Validate that the key policy of the key used to encrypt the Amazon SNS topic allows the Amazon Managed Service for Prometheus service principal aps.amazonaws.com to perform the following actions: kms:GenerateDataKey* and kms:Decrypt. For more information, see AWS KMS Permissions for SNS Topic.
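A key policy statement granting these two actions might look like the sketch below. The function name and the Sid are illustrative; the principal and actions are the ones named above. In a key policy, a Resource of "*" refers to the key that the policy is attached to.

```python
import json

def build_kms_statement():
    """Build a key policy statement for aps.amazonaws.com (sketch)."""
    return {
        "Sid": "AllowAmpUseOfKey",  # illustrative name
        "Effect": "Allow",
        "Principal": {"Service": "aps.amazonaws.com"},
        "Action": ["kms:GenerateDataKey*", "kms:Decrypt"],
        "Resource": "*",  # the key this policy is attached to
    }

statement = build_kms_statement()
print(json.dumps(statement, indent=2))
```

A statement like this would be added to the existing key policy alongside the statements that already grant key administrators and users their access.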