Setting up the CloudWatch agent to collect cluster metrics
Important
If you are installing Container Insights on an Amazon EKS cluster, we recommend that you use the Amazon CloudWatch Observability EKS add-on for the installation, instead of using the instructions in this section. For more information and instructions, see Install the Amazon CloudWatch Observability EKS add-on.
To set up Container Insights to collect metrics, you can follow the steps in Quick Start setup for Container Insights on Amazon EKS and Kubernetes or you can follow the steps in this section. In the following steps, you set up the CloudWatch agent to be able to collect metrics from your clusters.
If you are installing in an Amazon EKS cluster and you use the instructions in this section on or after November 6, 2023, you install Container Insights with enhanced observability for Amazon EKS in the cluster.
Step 1: Create a namespace for CloudWatch
Use the following step to create a Kubernetes namespace called amazon-cloudwatch for CloudWatch. You can skip this step if you have already created this namespace.
To create a namespace for CloudWatch
-
Enter the following command.
kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cloudwatch-namespace.yaml
Step 2: Create a service account in the cluster
Use one of the following methods to create a service account for the CloudWatch agent, if you do not already have one.
-
Use kubectl
-
Use a kubeconfig file
Use kubectl for authentication
To use kubectl to create a service account for the CloudWatch agent
-
Enter the following command.
kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-serviceaccount.yaml
If you didn't follow the previous steps, but you already have a service account for the CloudWatch agent that you want to use, you must ensure that it has the following rules. Additionally, in the rest of the steps in the Container Insights installation, you must use the name of that service account instead of cloudwatch-agent.
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "endpoints"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["list", "watch"]
  - apiGroups: ["apps"]
    resources: ["replicasets", "daemonsets", "deployments", "statefulsets"]
    verbs: ["list", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["nodes/proxy"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["nodes/stats", "configmaps", "events"]
    verbs: ["create", "get"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cwagent-clusterleader"]
    verbs: ["get", "update"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get", "list", "watch"]
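One way to confirm that an existing service account carries these rules is to ask the API server with kubectl auth can-i while impersonating the account. The following is a minimal sketch rather than part of the official procedure. The function name check_agent_rbac is hypothetical, the defaults assume a service account named cloudwatch-agent in the amazon-cloudwatch namespace, your own user must be allowed to impersonate service accounts, and only a few of the rules are spot-checked.

```shell
# Hypothetical helper: spot-check a few of the rules listed above.
# Usage: check_agent_rbac [service-account-name] [namespace]
check_agent_rbac() {
  sa="${1:-cloudwatch-agent}"      # assumed default service account name
  ns="${2:-amazon-cloudwatch}"     # assumed namespace of the service account
  missing=0
  # Each entry is "verb resource"; extend the list to cover the remaining rules.
  for rule in "list pods" "watch nodes" "list services" \
              "list deployments.apps" "list jobs.batch"; do
    # $rule is intentionally unquoted so that verb and resource
    # are passed to kubectl as two separate arguments.
    if kubectl auth can-i $rule \
         --as="system:serviceaccount:${ns}:${sa}" >/dev/null 2>&1; then
      echo "ok:      $rule"
    else
      echo "missing: $rule"
      missing=1
    fi
  done
  # Exit status 0 only if every checked rule is granted.
  return "$missing"
}
```

For example, check_agent_rbac my-agent-sa prints one line per checked rule and returns a nonzero status if any of them is not granted.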
Use kubeconfig for authentication
Alternatively, you can use a kubeconfig file for authentication. This method allows you to bypass the need for a service account by directly specifying the kubeconfig path in your CloudWatch agent configuration. It also allows you to remove your dependency on the Kubernetes control plane API for authentication, streamlining your setup and potentially increasing security by managing authentication through your kubeconfig file.
To use this method, update your CloudWatch agent configuration file to specify the path to your kubeconfig file, as in the following example.
{
  "logs": {
    "metrics_collected": {
      "kubernetes": {
        "cluster_name": "YOUR_CLUSTER_NAME",
        "enhanced_container_insights": false,
        "accelerated_compute_metrics": false,
        "tag_service": false,
        "kube_config_path": "/path/to/your/kubeconfig",
        "host_ip": "HOSTIP"
      }
    }
  }
}
To create a kubeconfig file, create a Certificate Signing Request (CSR) for the admin/{create_your_own_user} user in the system:masters Kubernetes group. Then sign the CSR with the Kubernetes cluster's Certificate Authority (CA) and create the kubeconfig file.
Step 3: Create a ConfigMap for the CloudWatch agent
Use the following steps to create a ConfigMap for the CloudWatch agent.
To create a ConfigMap for the CloudWatch agent
-
Download the ConfigMap YAML to your kubectl client host by running the following command:
curl -O https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-configmap-enhanced.yaml
-
Edit the downloaded YAML file, as follows:
-
cluster_name – In the kubernetes section, replace {{cluster_name}} with the name of your cluster. Remove the {{}} characters. Alternatively, if you're using an Amazon EKS cluster, you can delete the "cluster_name" field and value. If you do, the CloudWatch agent detects the cluster name from the Amazon EC2 tags.
-
(Optional) Make further changes to the ConfigMap based on your monitoring requirements, as follows:
-
metrics_collection_interval – In the kubernetes section, you can specify how often the agent collects metrics. The default is 60 seconds. The default cadvisor collection interval in kubelet is 15 seconds, so don't set this value to less than 15 seconds.
-
endpoint_override – In the logs section, you can specify the CloudWatch Logs endpoint if you want to override the default endpoint. You might want to do this if you're publishing from a cluster in a VPC and you want the data to go to a VPC endpoint.
-
force_flush_interval – In the logs section, you can specify the interval for batching log events before they are published to CloudWatch Logs. The default is 5 seconds.
-
region – By default, the agent publishes metrics to the Region where the worker node is located. To override this, you can add a region field in the agent section: for example, "region":"us-west-2".
-
statsd section – If you want the CloudWatch agent to also run as a StatsD listener in each worker node of your cluster, you can add a statsd section to the metrics section, as in the following example. For information about other StatsD options for this section, see Retrieve custom metrics with StatsD.
"metrics": {
  "metrics_collected": {
    "statsd": {
      "service_address": ":8125"
    }
  }
}
A full example of the JSON section is as follows. The kube_config_path parameter, which specifies the path to your kubeconfig file, applies only if you're using a kubeconfig file for authentication. Omit it otherwise.
{
  "agent": {
    "region": "us-east-1"
  },
  "logs": {
    "metrics_collected": {
      "kubernetes": {
        "cluster_name": "MyCluster",
        "metrics_collection_interval": 60,
        "kube_config_path": "/path/to/your/kubeconfig"
      }
    },
    "force_flush_interval": 5,
    "endpoint_override": "logs.us-east-1.amazonaws.com"
  },
  "metrics": {
    "metrics_collected": {
      "statsd": {
        "service_address": ":8125"
      }
    }
  }
}
-
Create the ConfigMap in the cluster by running the following command.
kubectl apply -f cwagent-configmap-enhanced.yaml
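The cluster_name edit in this procedure can also be scripted instead of made by hand. The following is a minimal sketch under stated assumptions: the function name set_cluster_name is hypothetical, the file is the one saved by the curl -O command in this procedure, and the cluster name contains no / or & characters, which sed would interpret.

```shell
# Hypothetical helper: replace the {{cluster_name}} placeholder in a
# downloaded ConfigMap file. The original is kept as FILE.bak.
# Usage: set_cluster_name FILE CLUSTER_NAME
set_cluster_name() {
  file="$1"
  name="$2"
  # Refuse to continue if the placeholder is absent, for example because
  # the file was already edited or the "cluster_name" field was deleted.
  if ! grep -q '{{cluster_name}}' "$file"; then
    echo "no {{cluster_name}} placeholder found in $file" >&2
    return 1
  fi
  # -i.bak edits in place and works with both GNU and BSD sed.
  sed -i.bak "s/{{cluster_name}}/${name}/g" "$file"
}
```

For example, set_cluster_name cwagent-configmap-enhanced.yaml MyCluster rewrites the placeholder before you run kubectl apply on the file.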
Step 4: Deploy the CloudWatch agent as a DaemonSet
To finish the installation of the CloudWatch agent and begin collecting container metrics, use the following steps.
To deploy the CloudWatch agent as a DaemonSet
-
-
If you do not want to use StatsD on the cluster, enter the following command.
kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-daemonset.yaml
-
If you do want to use StatsD, follow these steps:
-
Download the DaemonSet YAML to your kubectl client host by running the following command.
curl -O https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-daemonset.yaml
-
Uncomment the port section in the cwagent-daemonset.yaml file as in the following:
ports:
  - containerPort: 8125
    hostPort: 8125
    protocol: UDP
-
Deploy the CloudWatch agent in your cluster by running the following command.
kubectl apply -f cwagent-daemonset.yaml
Deploy the CloudWatch agent on Windows nodes in your cluster by running the following command. The StatsD listener is not supported on the CloudWatch agent on Windows.
kubectl apply -f cwagent-daemonset-windows.yaml
-
Validate that the agent is deployed by running the following command.
kubectl get pods -n amazon-cloudwatch
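Beyond reading the pod list, you can compare how many pods each DaemonSet wants scheduled with how many are ready. The following is a minimal sketch, not part of the official procedure. The function name daemonsets_ready is hypothetical, and it checks every DaemonSet in the namespace rather than assuming a particular DaemonSet name.

```shell
# Hypothetical helper: succeed only if every DaemonSet in the namespace
# reports as many ready pods as it wants scheduled.
# Usage: daemonsets_ready [namespace]
daemonsets_ready() {
  ns="${1:-amazon-cloudwatch}"
  # Print one line per DaemonSet: NAME DESIRED READY
  kubectl get daemonsets -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.desiredNumberScheduled}{" "}{.status.numberReady}{"\n"}{end}' |
  {
    seen=0   # becomes 1 once at least one DaemonSet is found
    bad=0    # becomes 1 if any DaemonSet has fewer ready pods than desired
    while read -r name desired ready; do
      [ -n "$name" ] || continue
      seen=1
      echo "$name: $ready/$desired ready"
      [ "$desired" = "$ready" ] || bad=1
    done
    # The status of this test is the status of the whole function.
    [ "$seen" -eq 1 ] && [ "$bad" -eq 0 ]
  }
}
```

A nonzero status means either that no DaemonSet was found in the namespace or that at least one has pods that are not ready yet.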
When complete, the CloudWatch agent creates a log group named /aws/containerinsights/Cluster_Name/performance and sends the performance log events to this log group. If you also set up the agent as a StatsD listener, the agent also listens for StatsD metrics on port 8125 with the IP address of the node where the application pod is scheduled.
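If you enabled the StatsD listener, an application pod can send it a test metric as a single UDP datagram in the StatsD line format NAME:VALUE|TYPE. The following is a minimal sketch under stated assumptions: both function names are hypothetical, the sender requires bash (its /dev/udp redirection is a bash feature, not a device file), and the node IP address must be known to the pod, for example through the Kubernetes Downward API field status.hostIP.

```shell
# Format one StatsD datagram: NAME:VALUE|TYPE
# (c = counter, g = gauge, ms = timer).
statsd_line() {
  printf '%s:%s|%s' "$1" "$2" "$3"
}

# Hypothetical helper: send one metric to the agent on port 8125.
# Requires bash, which maps /dev/udp/HOST/PORT to a UDP socket.
# Usage: send_statsd HOST_IP NAME VALUE TYPE
send_statsd() {
  statsd_line "$2" "$3" "$4" > "/dev/udp/$1/8125"
}
```

For example, send_statsd "$HOST_IP" app.requests 1 c sends a counter increment, where HOST_IP is an assumed environment variable that holds the node IP address. UDP gives no delivery confirmation, so check for the metric in CloudWatch rather than relying on the exit status.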
Troubleshooting
If the agent doesn't deploy correctly, try the following:
-
Run the following command to get the list of pods.
kubectl get pods -n amazon-cloudwatch
-
Run the following command and check the events at the bottom of the output.
kubectl describe pod pod-name -n amazon-cloudwatch
-
Run the following command to check the logs.
kubectl logs pod-name -n amazon-cloudwatch
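When several agent pods are running, the three commands above can be combined into one pass over every pod in the namespace. The following is a minimal sketch; the function name dump_agent_pods and the line counts are hypothetical choices.

```shell
# Hypothetical helper: print the end of the describe output (where the
# events appear) and the most recent log lines for every pod.
# Usage: dump_agent_pods [namespace] [log-lines]
dump_agent_pods() {
  ns="${1:-amazon-cloudwatch}"
  lines="${2:-20}"
  # "-o name" prints one identifier per line in the form pod/POD_NAME,
  # which kubectl describe and kubectl logs both accept.
  for pod in $(kubectl get pods -n "$ns" -o name); do
    echo "=== $pod: events ==="
    kubectl describe "$pod" -n "$ns" | tail -n 15
    echo "=== $pod: last $lines log lines ==="
    kubectl logs "$pod" -n "$ns" --tail="$lines"
  done
}
```

For example, dump_agent_pods amazon-cloudwatch 50 prints the last 50 log lines of each pod.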