Deduplicating high availability metrics sent to Amazon Managed Service for Prometheus - Amazon Managed Service for Prometheus

You can send data from multiple Prometheus agents (Prometheus instances running in Agent mode) to your Amazon Managed Service for Prometheus workspace. If some of these instances record and send the same metrics, your data has higher availability: even if one of the agents stops sending data, the workspace still receives the data from another instance. However, you want the workspace to automatically deduplicate the metrics so that you don't see them multiple times and aren't charged multiple times for data ingestion and storage.

For Amazon Managed Service for Prometheus to automatically deduplicate data from multiple Prometheus agents, you give the set of agents that send the duplicate data a single cluster name, and give each instance its own replica name. The cluster name identifies the instances as having shared data, and the replica name allows Amazon Managed Service for Prometheus to identify the source of each metric. The final stored metrics include the cluster label but not the replica label, so the metrics appear to come from a single source.
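The effect described above can be illustrated with a small Python model. This is only a sketch of the visible outcome, not the service's implementation: the `deduplicate` function, the `Sample` tuple layout, and the rule that the first replica seen becomes the source are all assumptions made for illustration, and the sketch does not model switching to another replica when one stops sending.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical sample shape for this sketch: (labels, timestamp, value).
Sample = Tuple[Dict[str, str], float, float]


def deduplicate(samples: Iterable[Sample]) -> List[Sample]:
    """Keep samples from one replica per cluster and drop the replica label."""
    sources: Dict[str, str] = {}  # cluster name -> replica used as the source
    stored: List[Sample] = []
    for labels, timestamp, value in samples:
        cluster = labels.get("cluster")
        replica = labels.get("__replica__")
        if cluster is None or replica is None:
            # Samples without both labels are not part of an HA set here.
            stored.append((dict(labels), timestamp, value))
            continue
        # Assumption: the first replica seen for a cluster becomes its source.
        source = sources.setdefault(cluster, replica)
        if replica != source:
            continue  # duplicate from another replica, not stored
        kept = {k: v for k, v in labels.items() if k != "__replica__"}
        stored.append((kept, timestamp, value))
    return stored


# Two agents in the same cluster report the same metric.
samples = [
    ({"__name__": "up", "cluster": "prom-team1", "__replica__": "replica1"}, 1000.0, 1.0),
    ({"__name__": "up", "cluster": "prom-team1", "__replica__": "replica2"}, 1000.0, 1.0),
]
result = deduplicate(samples)
```

In this model, `result` holds a single sample whose labels still include `cluster` but no longer include `__replica__`, which matches the description of the stored metrics appearing to come from one source.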

Note

Certain versions of Kubernetes (1.28 and 1.29) may emit their own metric with a cluster label. This can cause issues with Amazon Managed Service for Prometheus deduplication. See the High availability FAQ for more information.

The following topics show how to send data that includes the cluster and __replica__ labels, so that Amazon Managed Service for Prometheus deduplicates the data automatically.
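As a rough preview of what those labels look like, the following Python sketch builds one settings dictionary per agent, shaped like the `global` / `external_labels` section of a Prometheus configuration file. The helper name `ha_agent_configs` and the example names `prom-team1`, `replica1`, and `replica2` are hypothetical; treat this as an illustration of the labeling scheme, not as a complete agent configuration.

```python
from typing import Dict, List


def ha_agent_configs(cluster: str, replicas: List[str]) -> List[Dict]:
    """Build one config fragment per agent: shared cluster, unique replica."""
    if len(set(replicas)) != len(replicas):
        raise ValueError("each agent needs its own replica name")
    return [
        {
            "global": {
                "external_labels": {
                    "cluster": cluster,      # same value on every agent in the set
                    "__replica__": replica,  # different value on every agent
                }
            }
        }
        for replica in replicas
    ]


configs = ha_agent_configs("prom-team1", ["replica1", "replica2"])
```

Every fragment carries the same `cluster` value, and only the `__replica__` value differs between agents.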

Important

If you do not set up deduplication, you are charged for every data sample that is sent to Amazon Managed Service for Prometheus, including duplicate samples.