Using an AWS managed collector

To use an Amazon Managed Service for Prometheus collector, you must create a scraper that discovers and pulls metrics from your Amazon EKS cluster.

  • You can create a scraper as part of your Amazon EKS cluster creation. For more information about creating an Amazon EKS cluster, including creating a scraper, see Creating an Amazon EKS cluster in the Amazon EKS User Guide.

  • You can create your own scraper programmatically, using either the AWS API or the AWS CLI.

Note

Amazon Managed Service for Prometheus workspaces created with customer managed keys cannot use AWS managed collectors for ingestion.

An Amazon Managed Service for Prometheus collector scrapes Prometheus-compatible metrics. For more information about Prometheus-compatible metrics, see What are Prometheus-compatible metrics?.

The following topics describe how to create, manage, and configure scrapers.

Create a scraper

An Amazon Managed Service for Prometheus collector consists of a scraper that discovers and collects metrics from an Amazon EKS cluster. Amazon Managed Service for Prometheus manages the scraper for you, giving you the scalability, security, and reliability that you need, without having to manage any instances, agents, or scrapers yourself.

A scraper is automatically created for you when you create an Amazon EKS cluster through the Amazon EKS console. However, in some situations you might want to create a scraper yourself, for example, if you want to add an AWS managed collector to an existing Amazon EKS cluster, or if you want to change the configuration of an existing collector.

You can create a scraper using either the AWS API or the AWS CLI.

There are a few prerequisites for creating your own scraper:

  • You must have an Amazon EKS cluster created.

  • Your Amazon EKS cluster must have cluster endpoint access control set to include private access. It can include private and public, but must include private.
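One way to verify the endpoint access setting, assuming you have the AWS CLI configured (the cluster name and Region are placeholders for your own), is:

```shell
# Returns true when the cluster endpoint allows private access.
aws eks describe-cluster \
    --name my-cluster \
    --region us-west-2 \
    --query 'cluster.resourcesVpcConfig.endpointPrivateAccess'
```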

To create a scraper using the AWS API

Use the CreateScraper API operation to create a scraper with the AWS API. The following example creates a scraper in the us-west-2 Region. Replace the AWS account, workspace, security group, and Amazon EKS cluster information with your own IDs, and provide the configuration to use for your scraper.

Note

You must include at least two subnets, in at least two Availability Zones.

The scrapeConfiguration is a Prometheus configuration YAML file that is base64 encoded. You can download a general purpose configuration with the GetDefaultScraperConfiguration API operation. The next section contains more details about the format of the scrapeConfiguration.

POST /scrapers HTTP/1.1
Content-Length: 415
Authorization: AUTHPARAMS
X-Amz-Date: 20201201T193725Z
User-Agent: aws-cli/1.18.147 Python/2.7.18 Linux/5.4.58-37.125.amzn2int.x86_64 botocore/1.18.6

{
    "alias": "myScraper",
    "destination": {
        "ampConfiguration": {
            "workspaceArn": "arn:aws:aps:us-west-2:account-id:workspace/ws-workspace-id"
        }
    },
    "source": {
        "eksConfiguration": {
            "clusterArn": "arn:aws:eks:us-west-2:account-id:cluster/cluster-name",
            "securityGroupIds": ["sg-security-group-id"],
            "subnetIds": ["subnet-subnet-id-1", "subnet-subnet-id-2"]
        }
    },
    "scrapeConfiguration": {
        "configurationBlob": <base64-encoded-blob>
    }
}

To create a scraper using the AWS CLI

Use the create-scraper command to create a scraper in the us-west-2 Region. As in the API example, replace the example IDs with information from your own account.

aws amp create-scraper \
    --source eksConfiguration="{clusterArn='arn:aws:eks:us-west-2:account-id:cluster/cluster-name', securityGroupIds=['sg-security-group-id'], subnetIds=['subnet-subnet-id-1', 'subnet-subnet-id-2']}" \
    --scrape-configuration configurationBlob=<base64-encoded-blob> \
    --destination ampConfiguration="{workspaceArn='arn:aws:aps:us-west-2:account-id:workspace/ws-workspace-id'}"
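The configurationBlob must be a single-line base64 encoding of your YAML scrape configuration. A minimal sketch of producing one (the file name and contents here are only an illustration):

```shell
# Write a minimal scrape configuration; in practice, start from the
# output of the GetDefaultScraperConfiguration operation instead.
cat > scraper-config.yml <<'EOF'
global:
  scrape_interval: 30s
EOF

# Base64 encode the file on a single line for the configurationBlob parameter.
BLOB=$(base64 < scraper-config.yml | tr -d '\n')
echo "$BLOB"
```

You can then pass `configurationBlob=$BLOB` to the `--scrape-configuration` option.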

The following is a full list of the scraper operations that you can use with the AWS API:

  • CreateScraper

  • DescribeScraper

  • DeleteScraper

  • ListScrapers

  • GetDefaultScraperConfiguration

Note

The Amazon EKS cluster that you are scraping must be configured to allow Amazon Managed Service for Prometheus to access the metrics. The next topic describes how to configure your cluster.

Configuring your Amazon EKS cluster

Your Amazon EKS cluster must be configured to allow the scraper to access metrics. The following steps will allow access. This procedure uses kubectl and the AWS CLI. For information about installing kubectl, see Installing kubectl in the Amazon EKS User Guide.

To configure your Amazon EKS cluster for managed metric scraping
  1. Create a file called clusterrole-binding.yml with the following text:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: aps-collector-role
    rules:
      - apiGroups: [""]
        resources: ["nodes", "nodes/proxy", "nodes/metrics", "services", "endpoints", "pods", "ingresses", "configmaps"]
        verbs: ["describe", "get", "list", "watch"]
      - apiGroups: ["extensions", "networking.k8s.io"]
        resources: ["ingresses/status", "ingresses"]
        verbs: ["describe", "get", "list", "watch"]
      - nonResourceURLs: ["/metrics"]
        verbs: ["get"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: aps-collector-user-role-binding
    subjects:
      - kind: User
        name: aps-collector-user
        apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: aps-collector-role
      apiGroup: rbac.authorization.k8s.io
  2. Run the following command in your cluster:

    kubectl apply -f clusterrole-binding.yml

    This creates the cluster role and the cluster role binding. This example uses aps-collector-role as the role name and aps-collector-user as the user name.

  3. The following command gives you information about the scraper with the ID scraper-id. This is the scraper that you created using the command in the previous section.

    aws amp describe-scraper --scraper-id scraper-id
  4. From the results of the describe-scraper command, find the roleArn. It has the following format:

    arn:aws:iam::account-id:role/aws-service-role/scraper.aps.amazonaws.com/AWSServiceRoleForAmazonPrometheusScraper_unique-id

    Amazon EKS requires a different format for this ARN. You must adjust the format of the returned ARN to be used in the next step. Edit it to match this format:

    arn:aws:iam::account-id:role/AWSServiceRoleForAmazonPrometheusScraper_unique-id

    For example, this ARN:

    arn:aws:iam::111122223333:role/aws-service-role/scraper.aps.amazonaws.com/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-56ef-7

    must be rewritten as:

    arn:aws:iam::111122223333:role/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-56ef-7
  5. Run the following command in your cluster, using the modified roleArn from the previous step, as well as your cluster name and Region:

    eksctl create iamidentitymapping --cluster cluster-name --region region-id --arn roleArn --username aps-collector-user

    This allows the scraper to access the cluster using the role and user you created in the clusterrole-binding.yml file.
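The ARN conversion in step 4 can be scripted. The following sketch uses the example ARN from above; the describe-scraper and eksctl calls are shown as comments because they depend on your own account and cluster:

```shell
# The role ARN returned by describe-scraper, for example from:
#   aws amp describe-scraper --scraper-id scraper-id --query scraper.roleArn --output text
ROLE_ARN="arn:aws:iam::111122223333:role/aws-service-role/scraper.aps.amazonaws.com/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-56ef-7"

# Remove the service-role path segment to get the format that Amazon EKS expects.
EKS_ROLE_ARN=$(printf '%s' "$ROLE_ARN" | sed 's|role/aws-service-role/scraper\.aps\.amazonaws\.com/|role/|')
echo "$EKS_ROLE_ARN"
# → arn:aws:iam::111122223333:role/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-56ef-7

# Then map the role to the aps-collector-user user, for example:
# eksctl create iamidentitymapping --cluster cluster-name --region region-id \
#     --arn "$EKS_ROLE_ARN" --username aps-collector-user
```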

Find and delete scrapers

You can use the AWS API or the AWS CLI to list the scrapers in your account or to delete them.

To list all the scrapers in your account, use the ListScrapers API operation.

Alternatively, with the AWS CLI, call:

aws amp list-scrapers

ListScrapers returns all of the scrapers in your account, for example:

{
    "scrapers": [
        {
            "scraperId": "s-1234abcd-56ef-7890-abcd-1234ef567890",
            "arn": "arn:aws:aps:us-west-2:123456789012:scraper/s-1234abcd-56ef-7890-abcd-1234ef567890",
            "roleArn": "arn:aws:iam::123456789012:role/aws-service-role/AWSServiceRoleForAmazonPrometheusScraper_1234abcd-2931",
            "status": {
                "statusCode": "DELETING"
            },
            "createdAt": "2023-10-12T15:22:19.014000-07:00",
            "lastModifiedAt": "2023-10-12T15:55:43.487000-07:00",
            "tags": {},
            "source": {
                "eksConfiguration": {
                    "clusterArn": "arn:aws:eks:us-west-2:123456789012:cluster/my-cluster",
                    "securityGroupIds": [
                        "sg-1234abcd5678ef90"
                    ],
                    "subnetIds": [
                        "subnet-abcd1234ef567890",
                        "subnet-1234abcd5678ab90"
                    ]
                }
            },
            "destination": {
                "ampConfiguration": {
                    "workspaceArn": "arn:aws:aps:us-west-2:123456789012:workspace/ws-1234abcd-5678-ef90-ab12-cdef3456a78"
                }
            }
        }
    ]
}

To delete a scraper, find the scraperId for the scraper that you want to delete, using the ListScrapers operation, and then use the DeleteScraper operation to delete it.

Alternatively, with the AWS CLI, call:

aws amp delete-scraper --scraper-id scraperId

Scraper configuration

You can control how your scraper discovers and collects metrics with a Prometheus-compatible scraper configuration. For example, you can change the interval that metrics are sent to the workspace. You can also use relabeling to dynamically rewrite the labels of a metric. The scraper configuration is a YAML file that is part of the definition of the scraper.

For more information about the scraper configuration format, including a detailed breakdown of the possible values, see Configuration in the Prometheus documentation. The global configuration options and the <scrape_config> options cover the most commonly needed settings.

When a new scraper is created, you specify a configuration by providing a base64 encoded YAML file in the API call. You can download a general purpose configuration file with the GetDefaultScraperConfiguration operation in the Amazon Managed Service for Prometheus API.
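For example, with the AWS CLI you can fetch and decode the default configuration as follows (a sketch; the CLI returns the configuration blob base64 encoded, so it is decoded before writing to a file):

```shell
# Download the general-purpose default scraper configuration and
# decode it into a local YAML file.
aws amp get-default-scraper-configuration \
    --query configuration --output text | base64 -d > default-config.yml
```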

To modify the configuration of a scraper, delete the scraper and recreate it with the new configuration.

Sample configuration file

The following is a sample YAML configuration file with a 30-second scrape interval.

global:
  scrape_interval: 30s
  external_labels:
    clusterArn: apiserver-test-2
scrape_configs:
  - job_name: pod_exporter
    kubernetes_sd_configs:
      - role: pod
  - job_name: cadvisor
    scheme: https
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubernetes.default.svc:443
        target_label: __address__
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
  # apiserver metrics
  - scheme: https
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    job_name: kubernetes-apiservers
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - action: keep
        regex: default;kubernetes;https
        source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_service_name
          - __meta_kubernetes_endpoint_port_name
  # kube proxy metrics
  - job_name: kube-proxy
    honor_labels: true
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - action: keep
        source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_pod_name
        separator: '/'
        regex: 'kube-system/kube-proxy.+'
      - source_labels:
          - __address__
        action: replace
        target_label: __address__
        regex: (.+?)(\\:\\d+)?
        replacement: $1:10249

There are two limitations specific to AWS managed collectors:

  • Scrape interval – The scraper config can't specify a scrape interval of less than 30 seconds.

  • Targets – Targets in the static_config must be specified as IP addresses.
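For example, a sketch of a scrape_config with static targets that satisfies the IP-address requirement (the job name, addresses, and port are placeholders):

```yaml
scrape_configs:
  - job_name: my-static-targets   # hypothetical job name
    static_configs:
      - targets:
          - 10.0.1.25:9100        # must be IP addresses, not host names
          - 10.0.2.31:9100
```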

Troubleshooting scraper configuration

Amazon Managed Service for Prometheus collectors automatically discover and scrape metrics. But how can you troubleshoot when you don't see a metric you expect to see in your Amazon Managed Service for Prometheus workspace?

The up metric is a helpful tool. For each endpoint that an Amazon Managed Service for Prometheus collector discovers, it automatically emits this metric. The metric has three possible states that can help you troubleshoot what is happening within the collector.

  • up is not present – If there is no up metric for an endpoint, the collector was not able to find that endpoint.

    If you are sure that the endpoint exists, you likely need to adjust the scrape configuration. The discovery relabel_config might need to be adjusted, or it's possible that there is a problem with the role used for discovery.

  • up is present, but is always 0 – If up is present, but 0, then the collector is able to discover the endpoint, but can't find any Prometheus-compatible metrics.

    In this case, you might try using a curl command against the endpoint directly. Validate that you have the details correct, for example, the protocol (http or https), the endpoint, and the port that you are using. You can also check that the endpoint responds with a valid 200 response and follows the Prometheus format. Finally, the body of the response can't be larger than the maximum allowed size. (For limits on AWS managed collectors, see the following section.)

  • up is present and greater than 0 – If up is present, and is greater than 0, then metrics are being sent to Amazon Managed Service for Prometheus.

    Validate that you are looking for the correct metrics in Amazon Managed Service for Prometheus (or your alternate dashboard, such as Amazon Managed Grafana). You can use curl again to check for expected data in your /metrics endpoint. Also check that you haven't exceeded other limits, such as the number of endpoints per scraper.
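A sketch of the curl checks described above (the address, port, and path are placeholders for your own endpoint):

```shell
# Fetch the metrics endpoint directly and print the HTTP status code.
curl -sS -o metrics.txt -w '%{http_code}\n' http://10.0.1.25:9100/metrics

# A healthy endpoint returns 200 and Prometheus-format lines such as:
#   node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
head metrics.txt

# Confirm the body is under the 50 MB response limit.
wc -c metrics.txt
```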

Scraper limitations

There are a few limitations to the fully managed scrapers provided by Amazon Managed Service for Prometheus.

  • Region – Your EKS cluster, managed scraper, and Amazon Managed Service for Prometheus workspace must all be in the same AWS Region.

  • Account – Your EKS cluster, managed scraper, and Amazon Managed Service for Prometheus workspace must all be in the same AWS account.

  • Collectors – You can have a maximum of 10 Amazon Managed Service for Prometheus scrapers per Region, per account.

    Note

    You can raise this limit by requesting a quota increase.

  • Metrics response – The body of a response from any one /metrics endpoint request cannot be more than 50 megabytes (MB).

  • Endpoints per scraper – A scraper can scrape a maximum of 30,000 /metrics endpoints.

  • Scrape interval – The scraper config can't specify a scrape interval of less than 30 seconds.