Troubleshooting the CloudWatch agent

Use the following information to help troubleshoot problems with the CloudWatch agent.

CloudWatch agent command line parameters

To see the full list of parameters that the CloudWatch agent supports, enter the following command on a computer where the agent is installed:

amazon-cloudwatch-agent-ctl -help

Installing the CloudWatch agent using Run Command fails

To install the CloudWatch agent using Systems Manager Run Command, the SSM Agent on the target server must be version 2.2.93.0 or later. If your SSM Agent isn't the correct version, you might see errors that include the following messages:

no latest version found for package AmazonCloudWatchAgent on platform linux
failed to download installation package reliably

For information about updating your SSM Agent version, see Installing and Configuring SSM Agent in the AWS Systems Manager User Guide.

The CloudWatch agent won't start

If the CloudWatch agent fails to start, there might be an issue in your configuration. Configuration information is logged in the configuration-validation.log file. This file is located in /opt/aws/amazon-cloudwatch-agent/logs/configuration-validation.log on Linux servers and in $Env:ProgramData\Amazon\AmazonCloudWatchAgent\Logs\configuration-validation.log on servers running Windows Server.
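As a starting point for inspecting that log, the following minimal sketch prints its last lines. The two paths are the ones given above; the helper names and the fallback value C:\ProgramData (used only if the ProgramData environment variable is unset) are assumptions made for illustration.

```python
import os
import platform
from collections import deque


def validation_log_path(system=None):
    """Return the default configuration-validation.log path named in this section."""
    system = system or platform.system()
    if system == "Windows":
        # Assumption: fall back to C:\ProgramData if the variable is not set.
        program_data = os.environ.get("ProgramData", r"C:\ProgramData")
        return program_data + r"\Amazon\AmazonCloudWatchAgent\Logs\configuration-validation.log"
    return "/opt/aws/amazon-cloudwatch-agent/logs/configuration-validation.log"


def tail(path, count=20):
    """Return the last `count` lines of a text file."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    log_path = validation_log_path()
    if os.path.exists(log_path):
        print("\n".join(tail(log_path)))
    else:
        print("No validation log found at " + log_path)
```

Reading the log might require elevated permissions, depending on how the agent was installed.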

Verify that the CloudWatch agent is running

You can query the CloudWatch agent to find whether it's running or stopped. You can use AWS Systems Manager to do this remotely. You can also use the command line, but only to check the local server.

To query the status of the CloudWatch agent using Run Command
  1. Open the Systems Manager console at https://console.aws.amazon.com/systems-manager/.

  2. In the navigation pane, choose Run Command.

    -or-

    If the AWS Systems Manager home page opens, scroll down and choose Explore Run Command.

  3. Choose Run command.

  4. In the Command document list, choose the button next to AmazonCloudWatch-ManageAgent.

  5. In the Action list, choose status.

  6. For Optional Configuration Source, choose default, and keep Optional Configuration Location blank.

  7. In the Target area, choose the instance to check.

  8. Choose Run.

If the agent is running, the output resembles the following.

{ "status": "running", "starttime": "2017-12-12T18:41:18", "version": "1.73.4" }

If the agent is stopped, the "status" field displays "stopped".
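If you want to check this output in a script, a minimal sketch such as the following reads the "status" field. The function name is hypothetical, and the sample string is the example output shown above.

```python
import json


def agent_is_running(status_output):
    """Return True when the status JSON reports "running"."""
    report = json.loads(status_output)
    return report.get("status") == "running"


# Sample taken from the example output in this section.
sample = '{ "status": "running", "starttime": "2017-12-12T18:41:18", "version": "1.73.4" }'
print(agent_is_running(sample))
```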

To query the status of the CloudWatch agent locally using the command line
  • On a Linux server, enter the following:

    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status

  • On a server running Windows Server, enter the following in PowerShell as an administrator:

    & $Env:ProgramFiles\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1 -m ec2 -a status

The CloudWatch agent won't start, and the error mentions an Amazon EC2 Region

If the agent doesn't start and the error message mentions an Amazon EC2 Region endpoint, you might have configured the agent to need access to the Amazon EC2 endpoint without granting that access.

For example, if you specify a value for the append_dimensions parameter in the agent configuration file that depends on Amazon EC2 metadata and you use proxies, you must make sure that the server can access the endpoint for Amazon EC2. For more information about these endpoints, see Amazon Elastic Compute Cloud (Amazon EC2) in the Amazon Web Services General Reference.
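To find out whether a configuration is affected, you can look for append_dimensions values written in the ${aws:...} placeholder form used by the agent configuration file. The following is a minimal sketch, not an official check: the function name is hypothetical, and it only inspects the append_dimensions field of the metrics section.

```python
def ec2_dependent_dimensions(config):
    """List append_dimensions keys whose values use ${aws:...} placeholders."""
    dimensions = config.get("metrics", {}).get("append_dimensions", {})
    return sorted(
        name
        for name, value in dimensions.items()
        if isinstance(value, str) and value.startswith("${aws:")
    )


# Illustrative configuration fragment, already parsed from JSON.
sample_config = {
    "metrics": {
        "append_dimensions": {
            "InstanceId": "${aws:InstanceId}",
            "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
        }
    }
}
print(ec2_dependent_dimensions(sample_config))
```

If the list is not empty and the server reaches AWS through a proxy, confirm that the Amazon EC2 endpoint is reachable.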

The CloudWatch agent won't start on Windows Server

On Windows Server, you might see the following error:

Start-Service : Service 'Amazon CloudWatch Agent (AmazonCloudWatchAgent)' cannot be started due to the following error: Cannot start service AmazonCloudWatchAgent on computer '.'.
At C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1:113 char:12
+ $svc | Start-Service
+ ~~~~~~~~~~~~~
+ CategoryInfo : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service], ServiceCommandException
+ FullyQualifiedErrorId : CouldNotStartService,Microsoft.PowerShell.Commands.StartServiceCommand

To fix this, first make sure that the server service is running. This error can occur if the agent tries to start when the server service isn't running.

If the server service is already running, a startup timeout might be the cause. On some Windows Server installations, the CloudWatch agent takes more than 30 seconds to start. Because Windows Server allows only 30 seconds for services to start by default, the agent fails to start with an error similar to the one shown above.

To fix this issue, increase the service timeout value. For more information, see A service does not start, and events 7000 and 7011 are logged in the Windows event log.
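The following sketch builds a command for raising that timeout. It assumes the timeout in question is the ServicesPipeTimeout registry value under HKLM\SYSTEM\CurrentControlSet\Control, which is expressed in milliseconds; confirm this against the linked article before changing the registry. The function name and the 60-second value are illustrative only.

```python
def service_timeout_command(seconds):
    """Build a `reg add` command that sets ServicesPipeTimeout (in milliseconds)."""
    if seconds <= 30:
        raise ValueError("choose a value above the 30-second default")
    return [
        "reg", "add", r"HKLM\SYSTEM\CurrentControlSet\Control",
        "/v", "ServicesPipeTimeout",   # assumption: this is the relevant value
        "/t", "REG_DWORD",
        "/d", str(seconds * 1000),     # seconds converted to milliseconds
        "/f",
    ]


# Print the command instead of running it, so it can be reviewed first.
print(" ".join(service_timeout_command(60)))
```

The sketch only prints the command. Running it requires an administrator prompt, and a change to this value typically takes effect only after the server restarts.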

Where are the metrics?

If the CloudWatch agent has been running but you can't find metrics collected by it in the AWS Management Console or the AWS CLI, confirm that you're using the correct namespace. By default, the namespace for metrics collected by the agent is CWAgent. You can customize this namespace using the namespace field in the metrics section of the agent configuration file. If you don't see the metrics that you expect, check the configuration file to confirm the namespace being used.

When you first download the CloudWatch agent package, the agent configuration file is amazon-cloudwatch-agent.json. This file is in the directory where you ran the configuration wizard, or you might have moved it to a different directory. If you use the configuration wizard, the agent configuration file output from the wizard is named config.json. For more information about the configuration file, including the namespace field, see CloudWatch agent configuration file: Metrics section.
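To confirm which namespace a given configuration file uses, you can read the namespace field and fall back to CWAgent when it is absent. This is a minimal sketch with a hypothetical function name; it takes the file contents as a string so that you can point it at whichever configuration file applies.

```python
import json

DEFAULT_NAMESPACE = "CWAgent"


def metrics_namespace(config_text):
    """Return the metrics namespace set in the configuration, or the default."""
    config = json.loads(config_text)
    return config.get("metrics", {}).get("namespace", DEFAULT_NAMESPACE)


print(metrics_namespace('{"metrics": {"metrics_collected": {}}}'))
print(metrics_namespace('{"metrics": {"namespace": "MyApp/Agent"}}'))
```

The namespace MyApp/Agent in the second call is only an example value.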

The CloudWatch agent takes a long time to run in a container or logs a hop limit error

When you run the CloudWatch agent as a container service and want to add Amazon EC2 metric dimensions to all metrics collected by the agent, you might see the following errors in version 1.247354.0 of the agent:

2022-06-07T03:36:11Z E! [processors.ec2tagger] ec2tagger: Unable to retrieve Instance Metadata Tags. This plugin must only be used on an EC2 instance.
2022-06-07T03:36:11Z E! [processors.ec2tagger] ec2tagger: Please increase hop limit to 2 by following this document https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-options.html#configuring-IMDS-existing-instances.
2022-06-07T03:36:11Z E! [telegraf] Error running agent: could not initialize processor ec2tagger: EC2MetadataRequestError: failed to get EC2 instance identity document
caused by: EC2MetadataError: failed to make EC2Metadata request
status code: 401, request id:
caused by:

You might see these errors if the agent tries to get metadata from IMDSv2 inside a container without an appropriate hop limit. In versions of the agent earlier than 1.247354.0, you can experience this issue without seeing the log message.

To solve this, increase the hop limit to 2 by following the instructions in Configure the instance metadata options.
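One way to apply that change is with the AWS CLI command modify-instance-metadata-options. The sketch below only assembles the command so that you can review it before running it; the function name and the instance ID are placeholders, and it assumes that the AWS CLI is installed and has permission to modify the instance.

```python
def hop_limit_command(instance_id, hop_limit=2):
    """Build an AWS CLI command that sets the IMDS PUT response hop limit."""
    return [
        "aws", "ec2", "modify-instance-metadata-options",
        "--instance-id", instance_id,
        "--http-put-response-hop-limit", str(hop_limit),
        "--http-endpoint", "enabled",
    ]


# Placeholder instance ID; replace it with the ID of the container host.
print(" ".join(hop_limit_command("i-1234567890abcdef0")))
```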

I updated my agent configuration but don’t see the new metrics or logs in the CloudWatch console

If you update your CloudWatch agent configuration file, the next time that you start the agent, you need to use the fetch-config option. For example, if you stored the updated file on the local computer, enter the following command:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -s -m ec2 -c file:configuration-file-path
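Because a malformed file can keep the agent from starting, it can help to confirm that the updated file parses as JSON before you load it. The following minimal sketch does that and then assembles the same Linux command shown above; the function name is hypothetical, and the flags are copied unchanged from that command.

```python
import json

CTL = "/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl"


def fetch_config_command(config_path):
    """Check that the file parses as JSON, then build the fetch-config command."""
    with open(config_path, "r", encoding="utf-8") as handle:
        json.load(handle)  # raises json.JSONDecodeError on malformed JSON
    return ["sudo", CTL, "-a", "fetch-config", "-s", "-m", "ec2",
            "-c", "file:" + config_path]
```

A JSON syntax check is weaker than the validation that the agent itself performs, so still review configuration-validation.log if the agent doesn't start afterward.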

CloudWatch agent files and locations

The following table lists the files installed by and used with the CloudWatch agent, along with their locations on servers running Linux or Windows Server.

| File | Linux location | Windows Server location |
| --- | --- | --- |
| The control script that controls starting, stopping, and restarting the agent. | /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl or /usr/bin/amazon-cloudwatch-agent-ctl | $Env:ProgramFiles\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1 |
| The log file the agent writes to. You might need to attach this when contacting AWS Support. | /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log or /var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log | $Env:ProgramData\Amazon\AmazonCloudWatchAgent\Logs\amazon-cloudwatch-agent.log |
| Agent configuration validation file. | /opt/aws/amazon-cloudwatch-agent/logs/configuration-validation.log or /var/log/amazon/amazon-cloudwatch-agent/configuration-validation.log | $Env:ProgramData\Amazon\AmazonCloudWatchAgent\Logs\configuration-validation.log |
| The JSON file used to configure the agent immediately after the wizard creates it. For more information, see Create the CloudWatch agent configuration file. | /opt/aws/amazon-cloudwatch-agent/bin/config.json | $Env:ProgramFiles\Amazon\AmazonCloudWatchAgent\config.json |
| The JSON file used to configure the agent if this configuration file has been downloaded from Parameter Store. | /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json or /etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.json | $Env:ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.json |
| The TOML file used to specify Region and credential information to be used by the agent, overriding system defaults. | /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml or /etc/amazon/amazon-cloudwatch-agent/common-config.toml | $Env:ProgramData\Amazon\AmazonCloudWatchAgent\common-config.toml |
| The TOML file that contains the converted contents of the JSON configuration file. The amazon-cloudwatch-agent-ctl script generates this file. Users should not directly modify this file. It can be useful for verifying that JSON to TOML translation was successful. | /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml or /etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.toml | $Env:ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.toml |
| The YAML file that contains the converted contents of the JSON configuration file. The amazon-cloudwatch-agent-ctl script generates this file. You should not directly modify this file. This file can be useful for verifying that the JSON to YAML translation was successful. | /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.yaml or /etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.yaml | $Env:ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.yaml |

Finding information about CloudWatch agent versions

To find the version number of the CloudWatch agent on a Linux server, enter the following command:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a status

To find the version number of the CloudWatch agent on Windows Server, enter the following command:

& $Env:ProgramFiles\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1 -m ec2 -a status

Note

Using this command is the correct way to find the version of the CloudWatch agent. If you use Programs and Features in the Control Panel, you will see an incorrect version number.

You can also download a README file about the latest changes to the agent, and a file that indicates the version number that is currently available for download. These files are in the following locations:

  • https://amazoncloudwatch-agent.s3.amazonaws.com/info/latest/RELEASE_NOTES or https://amazoncloudwatch-agent-region.s3.region.amazonaws.com/info/latest/RELEASE_NOTES

  • https://amazoncloudwatch-agent.s3.amazonaws.com/info/latest/CWAGENT_VERSION or https://amazoncloudwatch-agent-region.s3.region.amazonaws.com/amazoncloudwatch-agent-region/info/latest/CWAGENT_VERSION
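To compare an installed version with the one currently available, you could download CWAGENT_VERSION and compare the dotted version numbers. The sketch below is illustrative: it assumes that the file contains only a version string, and it compares just the leading digits of each dot-separated part, which might not match how every release is numbered. The function names are hypothetical.

```python
import re
import urllib.request

VERSION_URL = "https://amazoncloudwatch-agent.s3.amazonaws.com/info/latest/CWAGENT_VERSION"


def version_key(version):
    """Turn a dotted version string into a tuple of integers for comparison."""
    parts = []
    for piece in version.strip().split("."):
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def latest_version(url=VERSION_URL, timeout=10):
    """Download the version file (assumed to hold a bare version string)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def is_outdated(installed, latest):
    """Return True when the installed version sorts before the latest one."""
    return version_key(installed) < version_key(latest)
```

For example, is_outdated("1.73.4", latest_version()) compares the version from a status query with the published one. The call to latest_version needs network access to the URL above.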

Logs generated by the CloudWatch agent

While it runs, the agent writes a log file, amazon-cloudwatch-agent.log, that includes troubleshooting information. This file is located at /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log on Linux servers and at $Env:ProgramData\Amazon\AmazonCloudWatchAgent\Logs\amazon-cloudwatch-agent.log on servers running Windows Server.

You can configure the agent to log additional details in the amazon-cloudwatch-agent.log file. In the agent configuration file, in the agent section, set the debug field to true, then reconfigure and restart the CloudWatch agent. To disable the logging of this extra information, set the debug field to false. Then, reconfigure and restart the agent. For more information, see Manually create or edit the CloudWatch agent configuration file.

In versions 1.247350.0 and later of the CloudWatch agent, you can optionally set the aws_sdk_log_level field in the agent section of the agent configuration file to one or more of the following options. Separate multiple options with the | character.

  • LogDebug

  • LogDebugWithSigning

  • LogDebugWithHTTPBody

  • LogDebugRequestRetries

  • LogDebugWithEventStreamBody

For more information about these options, see LogLevelType.
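The debug and aws_sdk_log_level fields described above both belong to the agent section of the configuration file. As a minimal sketch, the following hypothetical helper sets them on a configuration that has already been parsed from JSON, and rejects option names that aren't in the list above.

```python
import json

SDK_LOG_LEVELS = {
    "LogDebug",
    "LogDebugWithSigning",
    "LogDebugWithHTTPBody",
    "LogDebugRequestRetries",
    "LogDebugWithEventStreamBody",
}


def enable_verbose_logging(config, sdk_levels=()):
    """Return a copy of the configuration with extra logging turned on."""
    unknown = [level for level in sdk_levels if level not in SDK_LOG_LEVELS]
    if unknown:
        raise ValueError("unsupported aws_sdk_log_level option(s): " + ", ".join(unknown))
    updated = dict(config)
    agent = dict(updated.get("agent", {}))
    agent["debug"] = True
    if sdk_levels:
        # Multiple options are separated with the | character.
        agent["aws_sdk_log_level"] = "|".join(sdk_levels)
    updated["agent"] = agent
    return updated


print(json.dumps(enable_verbose_logging({}, ["LogDebug", "LogDebugRequestRetries"]), indent=2))
```

After you write the result back to the configuration file, reconfigure and restart the agent for the change to take effect.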

Stopping and restarting the CloudWatch agent

You can manually stop the CloudWatch agent using either AWS Systems Manager or the command line.

To stop the CloudWatch agent using Run Command
  1. Open the Systems Manager console at https://console.aws.amazon.com/systems-manager/.

  2. In the navigation pane, choose Run Command.

    -or-

    If the AWS Systems Manager home page opens, scroll down and choose Explore Run Command.

  3. Choose Run command.

  4. In the Command document list, choose AmazonCloudWatch-ManageAgent.

  5. In the Targets area, choose the instance where you installed the CloudWatch agent.

  6. In the Action list, choose stop.

  7. Keep Optional Configuration Source and Optional Configuration Location blank.

  8. Choose Run.

To stop the CloudWatch agent locally using the command line
  • On a Linux server, enter the following:

    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop

  • On a server running Windows Server, enter the following in PowerShell as an administrator:

    & $Env:ProgramFiles\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1 -m ec2 -a stop

To restart the agent, follow the instructions in Start the CloudWatch agent.