Using an AWS managed collector
To use an Amazon Managed Service for Prometheus collector, you must create a scraper that discovers and pulls metrics from your Amazon EKS cluster.
- You can create a scraper as part of your Amazon EKS cluster creation. For more information about creating an Amazon EKS cluster, including creating a scraper, see Creating an Amazon EKS cluster in the Amazon EKS User Guide.
- You can create your own scraper programmatically with the AWS API or by using the AWS CLI.
Note
Amazon Managed Service for Prometheus workspaces created with customer managed keys cannot use AWS managed collectors for ingestion.
An Amazon Managed Service for Prometheus collector scrapes metrics that are Prometheus-compatible. For more information about Prometheus-compatible metrics, see What are Prometheus-compatible metrics?
Note
Scraping metrics from a cluster may incur charges for network usage, for example, for cross-Region traffic. One way to optimize these costs is to configure your /metrics endpoint to compress the provided metrics (for example, with gzip), reducing the data that must be moved across the network. How to do this depends on the application or library providing the metrics. Some libraries gzip by default.
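To see whether an endpoint already serves compressed responses, you can request gzip explicitly and inspect the response headers. This is a convenience sketch; the host and port are placeholders for your application's /metrics endpoint.

# Request gzip and check whether the server honors it
curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://my-app.example.com:8080/metrics | grep -i 'content-encoding'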
The following topics describe how to create, manage, and configure scrapers.
Create a scraper
An Amazon Managed Service for Prometheus collector consists of a scraper that discovers and collects metrics from an Amazon EKS cluster. Amazon Managed Service for Prometheus manages the scraper for you, giving you the scalability, security, and reliability that you need, without having to manage any instances, agents, or scrapers yourself.
A scraper is automatically created for you when you create an Amazon EKS cluster through the Amazon EKS console. However, in some situations you might want to create a scraper yourself: for example, to add an AWS managed collector to an existing Amazon EKS cluster, or to change the configuration of an existing collector.
You can create a scraper using either the AWS API or the AWS CLI.
There are a few prerequisites for creating your own scraper:
- You must have an Amazon EKS cluster created.
- Your Amazon EKS cluster must have cluster endpoint access control set to include private access. It can include both private and public access, but it must include private access (you can verify this with the check following this list).
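To confirm the endpoint access settings on an existing cluster, you can query them with the AWS CLI. This is a convenience sketch, not part of the official procedure; my-cluster is a placeholder name.

# Show whether private and public endpoint access are enabled
aws eks describe-cluster --name my-cluster \
    --query 'cluster.resourcesVpcConfig.{privateAccess:endpointPrivateAccess,publicAccess:endpointPublicAccess}'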
Note
The cluster will be associated with the scraper by its Amazon Resource Name (ARN). If you delete a cluster and then create a new one with the same name, the ARN will be reused for the new cluster. Because of this, the scraper will attempt to collect metrics for the new cluster. You delete scrapers separately from deleting the cluster.
The following is a full list of the scraper operations that you can use with the AWS API:
- Create a scraper with the CreateScraper API operation.
- List your existing scrapers with the ListScrapers API operation.
- Delete a scraper with the DeleteScraper API operation.
- Get more details about a scraper with the DescribeScraper API operation.
- Get a general purpose configuration for scrapers with the GetDefaultScraperConfiguration API operation.
Note
The Amazon EKS cluster that you are scraping must be configured to allow Amazon Managed Service for Prometheus to access the metrics. The next topic describes how to configure your cluster.
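For example, the following is a sketch of creating a scraper with the AWS CLI. The ARNs, subnet ID, and configuration file name are placeholders; substitute your own values. The base64 -w0 flag assumes GNU coreutils (on macOS, use base64 -i instead).

# Create a scraper that reads from an EKS cluster and writes to a workspace
aws amp create-scraper \
    --source "eksConfiguration={clusterArn=arn:aws:eks:us-west-2:111122223333:cluster/my-cluster,subnetIds=[subnet-abcd1234ef567890]}" \
    --destination "ampConfiguration={workspaceArn=arn:aws:aps:us-west-2:111122223333:workspace/ws-1234abcd-5678-ef90-ab12-cdef3456a78}" \
    --scrape-configuration "configurationBlob=$(base64 -w0 scraper-config.yml)"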
Common errors when creating scrapers
The following are the most common issues when attempting to create a new scraper.
- Required AWS resources don't exist. The security group, subnet, and Amazon EKS cluster specified must exist.
- Insufficient IP address space. You must have at least one IP address available in each subnet that you pass into the CreateScraper API.
Configuring your Amazon EKS cluster
Your Amazon EKS cluster must be configured to allow the scraper to access metrics. There are two options for this configuration:
- Use Amazon EKS access entries to automatically provide Amazon Managed Service for Prometheus collectors access to your cluster.
- Manually configure your Amazon EKS cluster for managed metric scraping.
The following topics describe each of these in more detail.
Configure Amazon EKS for scraper access with access entries
Using access entries for Amazon EKS is the easiest way to give Amazon Managed Service for Prometheus access to scrape metrics from your cluster.
The Amazon EKS cluster that you are scraping must be configured to allow API authentication. The cluster authentication mode must be set to either API or API_AND_CONFIG_MAP. This is viewable in the Amazon EKS console on the Access configuration tab of the cluster details. For more information, see Allowing IAM roles or users access to Kubernetes objects on your Amazon EKS cluster in the Amazon EKS User Guide.
You can create the scraper when creating the cluster, or after creating the cluster:
- When creating a cluster – You can configure this access when you create an Amazon EKS cluster through the Amazon EKS console (follow the instructions to create a scraper as part of the cluster), and an access entry policy will automatically be created, giving Amazon Managed Service for Prometheus access to the cluster metrics.
- Adding after a cluster is created – If your Amazon EKS cluster already exists, set the authentication mode to either API or API_AND_CONFIG_MAP (see the example following this list). Any scrapers you create through the Amazon Managed Service for Prometheus API or AWS CLI will automatically have the correct access entry policy created for them, and the scrapers will have access to your cluster.
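The following is a sketch of switching an existing cluster's authentication mode with the AWS CLI; my-cluster is a placeholder name.

# Enable API-based authentication alongside the aws-auth ConfigMap
aws eks update-cluster-config \
    --name my-cluster \
    --access-config authenticationMode=API_AND_CONFIG_MAP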
Access entry policy created
When you create a scraper and let Amazon Managed Service for Prometheus generate an access entry policy for you, it generates the following policy. For more information about access entries, see Allowing IAM roles or users access to Kubernetes in the Amazon EKS User Guide.
{ "rules": [ { "effect": "allow", "apiGroups": [ "" ], "resources": [ "nodes", "nodes/proxy", "nodes/metrics", "services", "endpoints", "pods", "ingresses", "configmaps" ], "verbs": [ "get", "list", "watch" ] }, { "effect": "allow", "apiGroups": [ "extensions", "networking.k8s.io" ], "resources": [ "ingresses/status", "ingresses" ], "verbs": [ "get", "list", "watch" ] }, { "effect": "allow", "nonResourceURLs": [ "/metrics" ], "verbs": [ "get" ] } ] }
Manually configuring Amazon EKS for scraper access
If you prefer to use the aws-auth ConfigMap to control access to your Kubernetes cluster, you can still give Amazon Managed Service for Prometheus scrapers access to your metrics. The following steps will give Amazon Managed Service for Prometheus access to scrape metrics from your Amazon EKS cluster.
Note
For more information about ConfigMap and access entries, see Allowing IAM roles or users access to Kubernetes in the Amazon EKS User Guide.
This procedure uses kubectl and the AWS CLI. For information about installing kubectl, see Installing kubectl in the Amazon EKS User Guide.
To manually configure your Amazon EKS cluster for managed metric scraping
1. Create a file called clusterrole-binding.yml with the following text:

   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRole
   metadata:
     name: aps-collector-role
   rules:
     - apiGroups: [""]
       resources: ["nodes", "nodes/proxy", "nodes/metrics", "services", "endpoints", "pods", "ingresses", "configmaps"]
       verbs: ["describe", "get", "list", "watch"]
     - apiGroups: ["extensions", "networking.k8s.io"]
       resources: ["ingresses/status", "ingresses"]
       verbs: ["describe", "get", "list", "watch"]
     - nonResourceURLs: ["/metrics"]
       verbs: ["get"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRoleBinding
   metadata:
     name: aps-collector-user-role-binding
   subjects:
     - kind: User
       name: aps-collector-user
       apiGroup: rbac.authorization.k8s.io
   roleRef:
     kind: ClusterRole
     name: aps-collector-role
     apiGroup: rbac.authorization.k8s.io

2. Run the following command in your cluster:

   kubectl apply -f clusterrole-binding.yml

   This will create the cluster role binding and rule. This example uses aps-collector-role as the role name and aps-collector-user as the user name.

3. The following command gives you information about the scraper with the ID scraper-id. This is the scraper that you created using the command in the previous section.

   aws amp describe-scraper --scraper-id scraper-id

4. From the results of the describe-scraper, find the roleArn. This will have the following format:

   arn:aws:iam::account-id:role/aws-service-role/scraper.aps.amazonaws.com/AWSServiceRoleForAmazonPrometheusScraper_unique-id

   Amazon EKS requires a different format for this ARN. You must adjust the format of the returned ARN to be used in the next step. Edit it to match this format:

   arn:aws:iam::account-id:role/AWSServiceRoleForAmazonPrometheusScraper_unique-id

   For example, this ARN:

   arn:aws:iam::111122223333:role/aws-service-role/scraper.aps.amazonaws.com/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-56ef-7

   Must be rewritten as:

   arn:aws:iam::111122223333:role/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-56ef-7

5. Run the following command in your cluster, using the modified roleArn from the previous step, as well as your cluster name and Region:

   eksctl create iamidentitymapping --cluster cluster-name --region region-id --arn roleArn --username aps-collector-user

   This allows the scraper to access the cluster using the role and user you created in the clusterrole-binding.yml file.
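As an optional sanity check (not part of the documented procedure), you can impersonate the mapped user to verify that the ClusterRole permissions took effect:

# Both commands should print "yes" if the binding is in place
kubectl auth can-i list pods --as aps-collector-user
kubectl auth can-i get /metrics --as aps-collector-user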
Find and delete scrapers
You can use the AWS API or the AWS CLI to list the scrapers in your account or to delete them.
Note
Make sure that you are using the latest version of the AWS CLI or SDK. The latest version provides you with the latest features and functionality, as well as security updates. Alternatively, use AWS CloudShell, which provides an always up-to-date command line experience, automatically.
To list all the scrapers in your account, use the ListScrapers API operation.
Alternatively, with the AWS CLI, call:
aws amp list-scrapers
ListScrapers returns all of the scrapers in your account, for example:
{ "scrapers": [ { "scraperId": "s-1234abcd-56ef-7890-abcd-1234ef567890", "arn": "arn:aws:aps:us-west-2:123456789012:scraper/s-1234abcd-56ef-7890-abcd-1234ef567890", "roleArn": "arn:aws:iam::123456789012:role/aws-service-role/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-2931", "status": { "statusCode": "DELETING" }, "createdAt": "2023-10-12T15:22:19.014000-07:00", "lastModifiedAt": "2023-10-12T15:55:43.487000-07:00", "tags": {}, "source": { "eksConfiguration": { "clusterArn": "arn:aws:eks:us-west-2:123456789012:cluster/my-cluster", "securityGroupIds": [ "sg-1234abcd5678ef90" ], "subnetIds": [ "subnet-abcd1234ef567890", "subnet-1234abcd5678ab90" ] } }, "destination": { "ampConfiguration": { "workspaceArn": "arn:aws:aps:us-west-2:123456789012:workspace/ws-1234abcd-5678-ef90-ab12-cdef3456a78" } } } ] }
To delete a scraper, find the scraperId for the scraper that you want to delete, using the ListScrapers operation, and then use the DeleteScraper operation to delete it.
Alternatively, with the AWS CLI, call:

aws amp delete-scraper --scraper-id scraperId
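If you have several scrapers, a JMESPath query can make the IDs easier to find. This is a convenience sketch, not from the official procedure:

# Show just the scraper IDs and their statuses as a table
aws amp list-scrapers \
    --query 'scrapers[].{id:scraperId,status:status.statusCode}' --output table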
Scraper configuration
You can control how your scraper discovers and collects metrics with a Prometheus-compatible scraper configuration. For example, you can change the interval that metrics are sent to the workspace. You can also use relabeling to dynamically rewrite the labels of a metric. The scraper configuration is a YAML file that is part of the definition of the scraper.
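For example, the following is a minimal relabeling sketch (the job name and metric prefix are illustrative, not from the official docs) that drops metrics whose names begin with go_ before they are written to the workspace:

scrape_configs:
  - job_name: pod_exporter          # illustrative job name
    kubernetes_sd_configs:
      - role: pod
    metric_relabel_configs:
      - source_labels: [__name__]   # __name__ holds the metric name
        regex: 'go_.*'              # any metric beginning with "go_"
        action: drop                # drop it before it is sent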
When a new scraper is created, you specify a configuration by providing a base64-encoded YAML file in the API call. You can download a general purpose configuration file with the GetDefaultScraperConfiguration operation in the Amazon Managed Service for Prometheus API.
To modify the configuration of a scraper, delete the scraper and recreate it with the new configuration.
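The following is a sketch of downloading the default configuration for local editing. It assumes AWS CLI v2, which returns the configuration blob base64-encoded:

# Fetch the default scraper configuration and decode it to a YAML file
aws amp get-default-scraper-configuration \
    --query 'configuration' --output text | base64 --decode > scraper-config.yml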
Supported configuration
For information about the scraper configuration format, including a detailed breakdown of the possible values, see Configuration in the Prometheus documentation. The <scrape_config> options describe the most commonly needed options.
Because Amazon EKS is the only supported service, the only service discovery config (<*_sd_config>) supported is the <kubernetes_sd_config>.
The complete list of config sections allowed:
- <global>
- <scrape_config>
- <static_config>
- <relabel_config>
- <metric_relabel_configs>
- <kubernetes_sd_config>
Limitations within these sections are listed after the sample configuration file.
Sample configuration file
The following is a sample YAML configuration file with a 30-second scrape interval.
global:
  scrape_interval: 30s
  external_labels:
    clusterArn: apiserver-test-2
scrape_configs:
  - job_name: pod_exporter
    kubernetes_sd_configs:
      - role: pod
  - job_name: cadvisor
    scheme: https
    authorization:
      type: Bearer
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubernetes.default.svc:443
        target_label: __address__
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor

  # apiserver metrics
  - scheme: https
    authorization:
      type: Bearer
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    job_name: kubernetes-apiservers
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - action: keep
        regex: default;kubernetes;https
        source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_service_name
          - __meta_kubernetes_endpoint_port_name

  # kube proxy metrics
  - job_name: kube-proxy
    honor_labels: true
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - action: keep
        source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_pod_name
        separator: '/'
        regex: 'kube-system/kube-proxy.+'
      - source_labels:
          - __address__
        action: replace
        target_label: __address__
        regex: (.+?)(\\:\\d+)?
        replacement: $1:10249
The following are limitations specific to AWS managed collectors:
- Scrape interval – The scraper config can't specify a scrape interval of less than 30 seconds.
- Targets – Targets in the static_config must be specified as IP addresses.
- DNS resolution – Related to the target name, the only server name that is recognized in this config is the Kubernetes API server, kubernetes.default.svc. All other machine names must be specified by IP address.
- Authorization – Omit if no authorization is needed. If it is needed, the authorization must be Bearer, and must point to the file /var/run/secrets/kubernetes.io/serviceaccount/token. In other words, if used, the authorization section must look like the following:

  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
Note
type: Bearer is the default, so it can be omitted.
Troubleshooting scraper configuration
Amazon Managed Service for Prometheus collectors automatically discover and scrape metrics. But how can you troubleshoot when you don't see a metric you expect to see in your Amazon Managed Service for Prometheus workspace?
The up metric is a helpful tool. For each endpoint that an Amazon Managed Service for Prometheus collector discovers, it automatically vends this metric. There are three states of this metric that can help you to troubleshoot what is happening within the collector.
- up is not present – If there is no up metric present for an endpoint, that means that the collector was not able to find the endpoint.

  If you are sure that the endpoint exists, you likely need to adjust the scrape configuration. The discovery relabel_config might need to be adjusted, or it's possible that there is a problem with the role used for discovery.

- up is present, but is always 0 – If up is present, but 0, then the collector is able to discover the endpoint, but can't find any Prometheus-compatible metrics.

  In this case, you might try using a curl command against the endpoint directly, as shown after this list. You can validate that you have the details correct: for example, the protocol (http or https), the endpoint, or the port that you are using. You can also check that the endpoint is responding with a valid 200 response and follows the Prometheus format. Finally, the body of the response can't be larger than the maximum allowed size. (For limits on AWS managed collectors, see the following section.)

- up is present and greater than 0 – If up is present and greater than 0, then metrics are being sent to Amazon Managed Service for Prometheus.

  Validate that you are looking for the correct metrics in Amazon Managed Service for Prometheus (or your alternate dashboard, such as Amazon Managed Grafana). You can use curl again to check for expected data in your /metrics endpoint. Also check that you haven't exceeded other limits, such as the number of endpoints per scraper. You can check the number of metrics endpoints being scraped by checking the count of up metrics, using count(up).
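The following is a sketch of checking an endpoint by hand with curl. The pod IP and port are placeholders; run it from a host or pod that can reach the target:

# Show the HTTP status and the first lines of the metrics exposition
curl -sv http://10.0.1.25:8080/metrics 2>&1 | head -n 30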
Scraper limitations
There are a few limitations to the fully managed scrapers provided by Amazon Managed Service for Prometheus.
- Region – Your EKS cluster, managed scraper, and Amazon Managed Service for Prometheus workspace must all be in the same AWS Region.
- Account – Your EKS cluster, managed scraper, and Amazon Managed Service for Prometheus workspace must all be in the same AWS account.
- Collectors – You can have a maximum of 10 Amazon Managed Service for Prometheus scrapers per Region per account.

  Note
  You can request an increase to this limit by requesting a quota increase.

- Metrics response – The body of a response from any one /metrics endpoint request cannot be more than 50 megabytes (MB).
- Endpoints per scraper – A scraper can scrape a maximum of 30,000 /metrics endpoints.
- Scrape interval – The scraper config can't specify a scrape interval of less than 30 seconds.