System-level CloudWatch configuration - AWS Prescriptive Guidance

System-level CloudWatch configuration

System-level metrics and logs are a central component of a monitoring and logging solution, and the CloudWatch agent has specific configuration options for Windows and Linux.

We recommend that you use the CloudWatch agent configuration file wizard or the configuration file schema to define a CloudWatch agent configuration file for each OS that you plan to support. You can define additional workload-specific, OS-level logs and metrics in separate CloudWatch configuration files and append them to the standard configuration. Store these configuration files separately in an S3 bucket where your EC2 instances can retrieve them. An example of an S3 bucket setup for this purpose is described in the Storing CloudWatch configuration files in an S3 bucket section of this guide. You can automatically retrieve and apply these configurations by using State Manager and Distributor.

Configuring system-level logs

System-level logs are essential for diagnosing and troubleshooting issues on premises or in the AWS Cloud. Your log capture approach should include any system and security logs generated by the OS. The OS-generated log files might differ depending on the OS version.

The CloudWatch agent can monitor Windows event logs that you identify by event log name. You choose which Windows event logs to monitor (for example, System, Application, or Security).
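As a rough illustration of selecting event logs by name, the following sketch builds the `windows_events` part of an agent configuration as a Python dictionary and renders it as JSON. The log group names (`/example/windows/...`) and the chosen event levels are assumptions made for this example, not values from this guide.

```python
import json


def windows_event_entry(event_name, levels, log_group):
    """Build one windows_events collect_list entry for a named event log."""
    return {
        "event_name": event_name,
        "event_levels": levels,
        "log_group_name": log_group,
        # {instance_id} is a placeholder that the agent resolves at run time.
        "log_stream_name": "{instance_id}",
    }


# Event logs named in the text; the log group prefix is hypothetical.
event_logs = ["System", "Application", "Security"]
levels = ["ERROR", "WARNING", "CRITICAL"]  # assumed selection for this sketch

windows_logs_config = {
    "logs": {
        "logs_collected": {
            "windows_events": {
                "collect_list": [
                    windows_event_entry(name, levels, f"/example/windows/{name.lower()}")
                    for name in event_logs
                ]
            }
        }
    }
}

windows_logs_json = json.dumps(windows_logs_config, indent=2)
print(windows_logs_json)
```

Adding another event log is then a matter of appending its name to `event_logs`.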

The system, application, and security logs for Linux systems are typically stored in the /var/log directory. The following table defines the common default log files that you should monitor, but you should check the /etc/rsyslog.conf or /etc/syslog.conf file to determine the specific setup for your system's log files.

Fedora distribution (Amazon Linux, CentOS, Red Hat Enterprise Linux)

/var/log/boot.log* – Bootup log

/var/log/dmesg – Kernel log

/var/log/secure – Security and authentication log

/var/log/messages – General system log

/var/log/cron* – Cron logs

/var/log/cloud-init-output.log – Output from user data startup scripts

Debian (Ubuntu)

/var/log/syslog – General system log, including bootup messages

/var/log/cloud-init-output.log – Output from user data startup scripts

/var/log/auth.log – Security and authentication log

/var/log/kern.log – Kernel log

Your organization might also have other agents or system components that generate logs you want to monitor. Identify the log files that these agents or applications generate, and include them in your configuration by specifying their file locations. For example, you should include the Systems Manager and CloudWatch agent logs in your configuration. The following table provides the location of these agent logs for Windows and Linux.

Windows

CloudWatch agent

$Env:ProgramData\Amazon\AmazonCloudWatchAgent\Logs\amazon-cloudwatch-agent.log

Systems Manager agent

%PROGRAMDATA%\Amazon\SSM\Logs\amazon-ssm-agent.log

%PROGRAMDATA%\Amazon\SSM\Logs\errors.log

%PROGRAMDATA%\Amazon\SSM\Logs\audits\amazon-ssm-agent-audit-YYYY-MM-DD

Linux

CloudWatch agent

/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log

Systems Manager agent

/var/log/amazon/ssm/amazon-ssm-agent.log

/var/log/amazon/ssm/errors.log

/var/log/amazon/ssm/audits/amazon-ssm-agent-audit-YYYY-MM-DD
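The agent log locations listed above could be assembled per OS with a small helper such as the one below. This is a sketch under two assumptions that are not stated in this guide: that `%PROGRAMDATA%` resolves to `C:\ProgramData` (the usual default), and that a trailing `*` wildcard is used to match the dated audit files.

```python
import ntpath
import posixpath


def agent_log_paths(os_family, program_data=r"C:\ProgramData"):
    """Return CloudWatch agent and Systems Manager agent log paths for an OS."""
    if os_family == "windows":
        join = ntpath.join
        cw_dir = join(program_data, "Amazon", "AmazonCloudWatchAgent", "Logs")
        ssm_dir = join(program_data, "Amazon", "SSM", "Logs")
    elif os_family == "linux":
        join = posixpath.join
        cw_dir = "/opt/aws/amazon-cloudwatch-agent/logs"
        ssm_dir = "/var/log/amazon/ssm"
    else:
        raise ValueError(f"unsupported OS family: {os_family}")
    return [
        join(cw_dir, "amazon-cloudwatch-agent.log"),
        join(ssm_dir, "amazon-ssm-agent.log"),
        join(ssm_dir, "errors.log"),
        # Assumed wildcard standing in for the YYYY-MM-DD suffix.
        join(ssm_dir, "audits", "amazon-ssm-agent-audit-*"),
    ]


for family in ("windows", "linux"):
    for path in agent_log_paths(family):
        print(family, path)
```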

The CloudWatch agent ignores a log file that is defined in the agent configuration but isn’t found on the system. This is useful when you want to maintain a single log configuration for Linux, instead of separate configurations for each distribution. It is also useful when a log file doesn’t exist until the agent or software application starts running.
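Because missing files are skipped, one Linux log configuration can list the default files for both distribution families. The following sketch merges the two lists from the table earlier in this section into a single `files` collect list. The log group prefix `/example/linux` and the naming scheme derived from each path are assumptions for illustration only.

```python
import json

FEDORA_FAMILY_LOGS = [
    "/var/log/boot.log*",
    "/var/log/dmesg",
    "/var/log/secure",
    "/var/log/messages",
    "/var/log/cron*",
    "/var/log/cloud-init-output.log",
]
DEBIAN_FAMILY_LOGS = [
    "/var/log/syslog",
    "/var/log/cloud-init-output.log",
    "/var/log/auth.log",
    "/var/log/kern.log",
]


def build_linux_log_config(paths, group_prefix="/example/linux"):
    """Build one logs section from several path lists, dropping duplicates."""
    seen = set()
    collect_list = []
    for path in paths:
        if path in seen:
            continue  # cloud-init-output.log appears in both families
        seen.add(path)
        collect_list.append({
            "file_path": path,
            # Hypothetical naming: prefix plus the path without its wildcard.
            "log_group_name": group_prefix + path.rstrip("*"),
            "log_stream_name": "{instance_id}",
        })
    return {"logs": {"logs_collected": {"files": {"collect_list": collect_list}}}}


linux_logs_config = build_linux_log_config(FEDORA_FAMILY_LOGS + DEBIAN_FAMILY_LOGS)
print(json.dumps(linux_logs_config, indent=2))
```

On an Ubuntu host, entries such as `/var/log/secure` would simply match nothing, so the same file can be distributed to every Linux instance.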

Configuring system-level metrics

Memory and disk space utilization aren't included in the standard metrics that Amazon EC2 provides. To collect these metrics, you must install and configure the CloudWatch agent on your EC2 instances. The CloudWatch agent configuration wizard creates a configuration with predefined metrics, and you can add or remove metrics as required. Review the predefined metric sets (Basic, Standard, and Advanced) to determine the level of detail that you require.
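A minimal sketch of a metrics section that adds memory and disk utilization is shown below. The namespace `Example/System` and the choice of measurements are assumptions for this example. The wizard's predefined sets contain more metrics than this.

```python
import json


def build_metrics_config(os_family, namespace="Example/System"):
    """Build a small metrics section with memory and disk utilization."""
    if os_family == "linux":
        collected = {
            "mem": {"measurement": ["mem_used_percent"]},
            "disk": {"measurement": ["used_percent"], "resources": ["*"]},
        }
    elif os_family == "windows":
        # On Windows the agent reads performance counter objects instead.
        collected = {
            "Memory": {"measurement": ["% Committed Bytes In Use"]},
            "LogicalDisk": {"measurement": ["% Free Space"], "resources": ["*"]},
        }
    else:
        raise ValueError(f"unsupported OS family: {os_family}")
    return {
        "metrics": {
            "namespace": namespace,  # hypothetical namespace
            # Tag each metric with the EC2 instance ID.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": collected,
        }
    }


linux_metrics_config = build_metrics_config("linux")
print(json.dumps(linux_metrics_config, indent=2))
```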

End users and workload owners should publish additional system metrics based on specific requirements for a server or EC2 instance. These metric definitions should be stored, versioned, and maintained in a separate CloudWatch agent configuration file, and shared in a central location (for example, Amazon S3) for reuse and automation.

Standard Amazon EC2 metrics are not automatically captured for on-premises servers. These metrics must be defined in a CloudWatch agent configuration file that the on-premises instances use. You can create a separate metric configuration file for on-premises instances with metrics such as CPU utilization, and have these metrics appended to the standard metrics configuration file.
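A separate on-premises metrics file along these lines might look like the following sketch. It collects agent-side CPU measurements, which approximate but are not identical to the EC2 `CPUUtilization` metric. The namespace `Example/OnPremises`, the measurement list, and the 60-second interval are assumptions for illustration.

```python
import json

# Hypothetical on-premises metrics fragment, kept separate from the standard file.
onprem_metrics_config = {
    "metrics": {
        "namespace": "Example/OnPremises",
        "metrics_collected": {
            "cpu": {
                "measurement": [
                    "cpu_usage_idle",
                    "cpu_usage_user",
                    "cpu_usage_system",
                ],
                "totalcpu": True,  # report the aggregate across all cores
                "metrics_collection_interval": 60,
            }
        },
    }
}


def render_config(config):
    """Serialize a configuration fragment as JSON text ready to store."""
    return json.dumps(config, indent=2, sort_keys=True)


onprem_metrics_json = render_config(onprem_metrics_config)
print(onprem_metrics_json)
```

The rendered file can then be stored alongside the standard configuration and appended to it on the on-premises instances, for example with the agent control script's `append-config` action.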