
Configure CloudWatch agent for Amazon EMR 7.1.0

Starting with Amazon EMR 7.1.0, you can use the Amazon EMR configuration API to configure the Amazon CloudWatch agent to collect additional system metrics, add application metrics, and change the metrics destination. For more information about how to use the EMR configuration API to configure your cluster's applications, see Configure applications.

Note

Amazon EMR 7.1.0 supports only the OVERWRITE reconfiguration type. For more information about the reconfiguration types, see Considerations when you reconfigure an instance group.

Topics

Configuration schema
System metrics configurations examples
Application metrics configurations examples
Amazon Managed Service for Prometheus example

Configuration schema

emr-metrics has the following classifications:

emr-system-metrics — configure system metrics, such as CPU, disk, and memory.
emr-hadoop-hdfs-datanode-metrics — configure Hadoop DataNode JMX metrics
emr-hadoop-hdfs-namenode-metrics — configure Hadoop NameNode JMX metrics
emr-hadoop-yarn-nodemanager-metrics — configure Yarn NodeManager JMX metrics
emr-hadoop-yarn-resourcemanager-metrics — configure Yarn ResourceManager JMX metrics
emr-hbase-master-metrics — configure HBase Master JMX metrics
emr-hbase-region-server-metrics — configure HBase Region Server JMX metrics
emr-hbase-rest-server-metrics — configure HBase REST Server JMX metrics
emr-hbase-thrift-server-metrics — configure HBase Thrift Server JMX metrics

The following tables describe the available properties and configurations for all of the classifications.

emr-metrics properties

Property	Required	Description	Default value	Possible values	Notes
`metrics_destination`	Optional	Determines whether cluster metrics are published to Amazon CloudWatch or Amazon Managed Service for Prometheus.	"CLOUDWATCH"	"CLOUDWATCH", "PROMETHEUS"	This property is case-insensitive. For example, "Cloudwatch" is the same as "CLOUDWATCH".
`prometheus_endpoint`	Optional	If `metrics_destination` is set to "PROMETHEUS", this property configures the CloudWatch agent to send metrics to the provided Amazon Managed Service for Prometheus remote write endpoint.	N/A	Any valid Amazon Managed Service for Prometheus remote write URL. The remote write URL format is `https://aps-workspaces.<region>.amazonaws.com/workspaces/<workspace_id>/api/v1/remote_write`	This field is required if `metrics_destination` is set to "PROMETHEUS". Provisioning will fail if you don't provide a key or if the value is an empty string.
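
The rules in the table above can be sketched as a small check that runs before you submit a configuration. This is a minimal illustration, not the validation that Amazon EMR performs: the helper names `build_emr_metrics_properties` and `remote_write_url` are hypothetical, and the Region and workspace ID in the usage section are placeholders.

```python
import json

VALID_DESTINATIONS = {"CLOUDWATCH", "PROMETHEUS"}


def build_emr_metrics_properties(metrics_destination="CLOUDWATCH", prometheus_endpoint=None):
    """Return a Properties map for the emr-metrics classification.

    Hypothetical helper that mirrors the table above.
    """
    destination = metrics_destination.upper()  # the property is case-insensitive
    if destination not in VALID_DESTINATIONS:
        raise ValueError(f"unsupported metrics_destination: {metrics_destination!r}")

    properties = {"metrics_destination": destination}
    if destination == "PROMETHEUS":
        # A missing key or an empty string makes provisioning fail.
        if not prometheus_endpoint:
            raise ValueError("prometheus_endpoint is required for PROMETHEUS")
        properties["prometheus_endpoint"] = prometheus_endpoint
    return properties


def remote_write_url(region, workspace_id):
    """Fill in the remote write URL format given in the table above."""
    return (
        f"https://aps-workspaces.{region}.amazonaws.com"
        f"/workspaces/{workspace_id}/api/v1/remote_write"
    )


if __name__ == "__main__":
    # Placeholder Region and workspace ID, for illustration only.
    endpoint = remote_write_url("us-east-1", "ws-EXAMPLE")
    config = [{
        "Classification": "emr-metrics",
        "Properties": build_emr_metrics_properties("prometheus", endpoint),
        "Configurations": [],
    }]
    print(json.dumps(config, indent=2))
```

Failing early in a script like this is cheaper than discovering a missing endpoint when cluster provisioning fails.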

emr-system-metrics properties

Property	Required	Description	Default value	Possible values	Notes
`metrics_collection_interval`	Optional	How often, in seconds, the CloudWatch agent collects and publishes metrics.	"60"	A string specifying the number of seconds. Accepts only whole numbers.	You can override this property with the `metrics_collection_interval` property from individual metric groups.

emr-system-metrics configurations

cpu

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of CPU metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid CPU metric names with or without the `cpu_` prefix, such as `usage_active` and `cpu_time_idle`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any CPU metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish CPU metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for CPU metrics.
`drop_original_metrics`	Optional	List of CPU metrics for which to not publish unaggregated metrics.	No unaggregated CPU metrics published.	A comma-separated list of CPU metrics that are also specified in the `metrics` property. An empty string means that the agent publishes unaggregated metrics for all CPU metrics.	The CloudWatch agent aggregates all metrics by cluster ID, instance ID, node type, and service name. By default, the CloudWatch agent doesn't publish the per-resource metrics for metrics with multiple resources.
`resources`	Optional	Determines whether the agent will publish per-core metrics.	"*"	"*" enables per-core metrics. "" disables per-core metrics.	The CloudWatch agent only publishes per-core metrics for CPU metrics that aren't dropped in `drop_original_metrics`.

disk

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of disk metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid disk metric names with or without the `disk_` prefix, such as `disk_total` and `used_percent`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any disk metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish disk metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for disk metrics.
`drop_original_metrics`	Optional	List of disk metrics for which to not publish unaggregated metrics.	No unaggregated disk metrics published.	A comma-separated list of disk metrics that are also specified in the `metrics` property. An empty string means that the agent publishes unaggregated metrics for all disk metrics.	The CloudWatch agent aggregates all metrics by cluster ID, instance ID, node type, and service name. By default, the CloudWatch agent doesn't publish the per-resource metrics for metrics with multiple resources.
`resources`	Optional	Determines whether the agent will publish per-mount-point metrics.	"*"	"*" means all mount points, "" means no mount points, or a comma-separated list of mount points. For example, `"/,/emr"`.	The CloudWatch agent only publishes per-mount-point metrics for disk metrics that aren't dropped in `drop_original_metrics`.

diskio

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of disk IO metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid disk IO metric names with or without the `diskio_` prefix, such as `diskio_reads` and `writes`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any disk IO metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish disk IO metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for disk IO metrics.
`drop_original_metrics`	Optional	List of disk IO metrics for which to not publish unaggregated metrics.	No unaggregated disk IO metrics published.	A comma-separated list of disk IO metrics that are also specified in the `metrics` property. An empty string means that the agent publishes unaggregated metrics for all disk IO metrics.	The CloudWatch agent aggregates all metrics by cluster ID, instance ID, node type, and service name. By default, the CloudWatch agent doesn't publish the per-resource metrics for metrics with multiple resources.
`resources`	Optional	Determines whether the agent will publish per-device metrics.	"*"	"*" means all storage devices, "" means no storage devices, or a comma-separated list of device names. For example, `"nvme0n1,nvme1n1"`.	The CloudWatch agent only publishes per-device metrics for disk IO metrics that aren't dropped in `drop_original_metrics`.

mem

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of memory metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid memory metric names with or without the `mem_` prefix, such as `mem_available` and `available_percent`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any memory metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish memory metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for memory metrics.

net

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of network metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid network metric names with or without the `net_` prefix, such as `net_packets_sent` and `packets_recv`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any network metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish network metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for network metrics.
`drop_original_metrics`	Optional	List of network metrics for which to not publish unaggregated metrics.	No unaggregated network metrics published.	A comma-separated list of network metrics that are also specified in the `metrics` property. An empty string means that the agent publishes unaggregated metrics for all network metrics.	The CloudWatch agent aggregates all metrics by cluster ID, instance ID, node type, and service name. By default, the CloudWatch agent doesn't publish the per-resource metrics for metrics with multiple resources.
`resources`	Optional	Determines whether the agent will publish per-interface metrics.	"*"	"*" means all network interfaces, "" means no network interfaces, or a comma-separated list of interface names. For example, `"eth0,eth1"`.	The CloudWatch agent only publishes per-interface metrics for network metrics that aren't dropped in `drop_original_metrics`.

netstat

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of network statistics metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid network statistics metric names with or without the `netstat_` prefix, such as `tcp_listen` and `netstat_udp_socket`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any network statistics metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish network statistic metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for network statistic metrics.

processes

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of process metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid process metric names with or without the `processes_` prefix, such as `processes_running` and `total`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any process metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish system process metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for system process metrics.

swap

Property	Required	Description	Default value	Possible values	Notes
`metrics`	Optional	The list of swap metrics for the agent to collect.	See Default metrics for CloudWatch agent with Amazon EMR	A comma-separated list of valid swap metric names with or without the `swap_` prefix, such as `swap_free` and `used_percent`. See Metrics collected by the CloudWatch agent for valid metrics.	Specifying an empty string means to not publish any swap metrics.
`metrics_collection_interval`	Optional	How often in seconds the agent should collect and publish swap metrics.	The value of the global `metrics_collection_interval`.	A string specifying the number of seconds. Accepts only whole numbers.	This value overrides the global `metrics_collection_interval` property only for swap metrics.
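
As a rough illustration of how the metric-group tables above translate into JSON, the following sketch builds a single group classification and applies the constraints from the tables: whole-number intervals written as strings, comma-separated lists, and `resources` and `drop_original_metrics` only for cpu, disk, diskio, and net. The `metric_group` helper is hypothetical and not part of any AWS SDK.

```python
import json

# Groups whose tables above list `resources` and `drop_original_metrics`.
GROUPS_WITH_RESOURCES = {"cpu", "disk", "diskio", "net"}
ALL_GROUPS = GROUPS_WITH_RESOURCES | {"mem", "netstat", "processes", "swap"}


def metric_group(name, metrics, interval_seconds=None, resources=None, drop_original=None):
    """Build one metric-group classification (hypothetical helper)."""
    if name not in ALL_GROUPS:
        raise ValueError(f"unknown metric group: {name!r}")

    properties = {"metrics": ",".join(metrics)}

    if interval_seconds is not None:
        # The property accepts only whole numbers, passed as a string.
        if isinstance(interval_seconds, bool) or not isinstance(interval_seconds, int):
            raise ValueError("interval must be a whole number of seconds")
        properties["metrics_collection_interval"] = str(interval_seconds)

    if (resources is not None or drop_original is not None) and name not in GROUPS_WITH_RESOURCES:
        raise ValueError(f"{name} has no resources or drop_original_metrics property")

    if resources is not None:
        properties["resources"] = ",".join(resources)

    if drop_original is not None:
        # Plain string comparison: prefixed and unprefixed spellings of the
        # same metric are not matched up here.
        missing = [m for m in drop_original if m not in metrics]
        if missing:
            raise ValueError(f"not listed in metrics: {missing}")
        properties["drop_original_metrics"] = ",".join(drop_original)

    return {"Classification": name, "Properties": properties}


if __name__ == "__main__":
    net = metric_group(
        "net",
        ["net_packets_sent", "packets_recv"],
        interval_seconds=30,
        resources=["eth0", "eth1"],
        drop_original=["net_packets_sent"],
    )
    print(json.dumps(net, indent=2))
```

The returned dictionary is meant to be placed in the `Configurations` list of an `emr-system-metrics` classification.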

emr-hadoop-hdfs-datanode-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=DataNode,name=DataNodeActivity`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `BlocksCached,BlocksRead`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect Hadoop DataNode metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.

emr-hadoop-hdfs-namenode-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=NameNode,name=FSNamesystem`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `BlockCapacity,CapacityUsedGB`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect Hadoop NameNode metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.

emr-hadoop-yarn-nodemanager-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=NodeManager,name=NodeManagerMetrics`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `MaxCapacity,AllocatedGB`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect Hadoop YARN NodeManager metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.

emr-hadoop-yarn-resourcemanager-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=ResourceManager,name=PartitionQueueMetrics`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `MaxCapacity,MaxCapacityVCores`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect Hadoop YARN ResourceManager metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.

emr-hbase-master-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=HBase,name=Master,sub=AssignmentManager`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `AssignFailedCount,AssignSubmittedCount`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect HBase Master metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.

emr-hbase-region-server-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=HBase,name=RegionServer,sub=IPC`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `numActiveHandler,numActivePriorityHandler`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect HBase Region Server metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.

emr-hbase-rest-server-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=HBase,name=REST`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `successfulPut,successfulScanCount`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect HBase REST Server metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.

emr-hbase-thrift-server-metrics properties

Property	Required	Description	Default value	Possible values
`<custom_bean_name>`	Optional	The MBean that the CloudWatch agent should collect metrics from, such as `Hadoop:service=HBase,name=Thrift,sub=ThriftOne`. You can find sample MBean names and their corresponding metrics in the example JMX YAML files for Amazon EMR release 7.0.	N/A	A string containing the comma-delimited list of metrics that are associated with the MBean. For example, `BatchGet_max,BatchGet_mean`.
`otel.metric.export.interval`	Optional	How often in milliseconds to collect HBase Thrift Server metrics.	"60000"	A string specifying the number of milliseconds. Accepts only whole numbers.
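
The application-metrics tables above share one shape: MBean names as property keys, comma-delimited metric lists as values, and an optional export interval in milliseconds. The following sketch shows one way to assemble such a classification. The `jmx_classification` helper is hypothetical, and it takes the interval in seconds purely for convenience before converting it to the millisecond string that the property expects.

```python
import json

APPLICATION_CLASSIFICATIONS = {
    "emr-hadoop-hdfs-datanode-metrics",
    "emr-hadoop-hdfs-namenode-metrics",
    "emr-hadoop-yarn-nodemanager-metrics",
    "emr-hadoop-yarn-resourcemanager-metrics",
    "emr-hbase-master-metrics",
    "emr-hbase-region-server-metrics",
    "emr-hbase-rest-server-metrics",
    "emr-hbase-thrift-server-metrics",
}


def jmx_classification(classification, beans, export_interval_seconds=None):
    """Build one application-metrics classification (hypothetical helper).

    `beans` maps an MBean name to the list of metrics to collect from it.
    """
    if classification not in APPLICATION_CLASSIFICATIONS:
        raise ValueError(f"unknown classification: {classification!r}")

    properties = {bean: ",".join(metrics) for bean, metrics in beans.items()}
    if export_interval_seconds is not None:
        # The property is a whole number of milliseconds, passed as a string.
        properties["otel.metric.export.interval"] = str(round(export_interval_seconds * 1000))

    return {
        "Classification": classification,
        "Properties": properties,
        "Configurations": [],
    }


if __name__ == "__main__":
    hbase_master = jmx_classification(
        "emr-hbase-master-metrics",
        {
            "Hadoop:service=HBase,name=Master,sub=AssignmentManager": [
                "AssignFailedCount",
                "AssignSubmittedCount",
            ]
        },
        export_interval_seconds=30,
    )
    print(json.dumps(hbase_master, indent=2))
```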

System metrics configurations examples

The following example demonstrates how to configure the CloudWatch agent to stop exporting all system metrics.


[
  {
    "Classification": "emr-metrics",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "emr-system-metrics",
        "Properties": {},
        "Configurations": []
      }
    ]
  }
]

The following example configures the CloudWatch agent to export the default system metrics. Doing so is a quick way to reset the agent back to only exporting the default system metrics if you've already reconfigured the system metrics at least once. This reset also removes any application metrics that were reconfigured before.


[
  {
    "Classification": "emr-metrics",
    "Properties": {},
    "Configurations": []
  }
]

The following example configures the cluster to export the cpu, mem, and disk metrics.


[
  {
    "Classification": "emr-metrics",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "emr-system-metrics",
        "Properties": {
          "metrics_collection_interval": "20"
        },
        "Configurations": [
          {
            "Classification": "cpu",
            "Properties": {
              "metrics": "cpu_usage_guest,cpu_usage_idle",
              "metrics_collection_interval": "30",
              "drop_original_metrics": "cpu_usage_guest"
            }
          },
          {
            "Classification": "mem",
            "Properties": {
              "metrics": "mem_active"
            }
          },
          {
            "Classification": "disk",
            "Properties": {
              "metrics": "disk_used_percent",
              "resources": "/,/mnt",
              "drop_original_metrics": ""
            }
          }
        ]
      }
    ]
  }
]

The previous example configuration has the following properties:

Every 30 seconds, the agent collects the cpu_usage_guest metric for all CPUs. You can find the aggregated metric under the CloudWatch namespace CWAgent > cluster.id, instance.id, node.type, service.name. Because the drop_original_metrics property contains cpu_usage_guest, the agent doesn't publish the per-CPU values for this metric.
Every 30 seconds, the agent collects the cpu_usage_idle metric for all CPUs. You can find the aggregated metric under the CloudWatch namespace CWAgent > cluster.id, instance.id, node.type, service.name. The agent also publishes the per-CPU metrics in the same namespace, because the drop_original_metrics property doesn't contain cpu_usage_idle.
Every 20 seconds, the agent collects the mem_active metric. You can find the aggregated metric under the CloudWatch namespace CWAgent > cluster.id, instance.id, node.type, service.name.
Every 20 seconds, the agent collects the disk_used_percent metric for the / and /mnt disk mounts. You can find the aggregated metric under the CloudWatch namespace CWAgent > cluster.id, instance.id, node.type, service.name. The agent also publishes the per-mount metrics in the same namespace, because the drop_original_metrics property doesn't contain disk_used_percent.
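
The interval behavior described above follows from one rule: a metric group uses its own `metrics_collection_interval` when it sets one, and otherwise falls back to the global value. The following sketch resolves that rule against a copy of the example configuration. The `effective_intervals` function is a hypothetical helper that only reads the JSON; it does not reflect how the CloudWatch agent is implemented.

```python
# Copy of the example configuration above, as a Python literal.
EXAMPLE = [{
    "Classification": "emr-metrics",
    "Properties": {},
    "Configurations": [{
        "Classification": "emr-system-metrics",
        "Properties": {"metrics_collection_interval": "20"},
        "Configurations": [
            {"Classification": "cpu", "Properties": {
                "metrics": "cpu_usage_guest,cpu_usage_idle",
                "metrics_collection_interval": "30",
                "drop_original_metrics": "cpu_usage_guest"}},
            {"Classification": "mem", "Properties": {"metrics": "mem_active"}},
            {"Classification": "disk", "Properties": {
                "metrics": "disk_used_percent",
                "resources": "/,/mnt",
                "drop_original_metrics": ""}},
        ],
    }],
}]

DEFAULT_INTERVAL = "60"  # documented default of the global property


def effective_intervals(configurations):
    """Return {metric group: collection interval in seconds}."""
    result = {}
    for top in configurations:
        if top["Classification"] != "emr-metrics":
            continue
        for system in top.get("Configurations", []):
            if system["Classification"] != "emr-system-metrics":
                continue
            global_interval = system.get("Properties", {}).get(
                "metrics_collection_interval", DEFAULT_INTERVAL)
            for group in system.get("Configurations", []):
                # A group-level value overrides the global one for that group only.
                interval = group.get("Properties", {}).get(
                    "metrics_collection_interval", global_interval)
                result[group["Classification"]] = int(interval)
    return result


if __name__ == "__main__":
    print(effective_intervals(EXAMPLE))  # {'cpu': 30, 'mem': 20, 'disk': 20}
```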

Application metrics configurations examples

The following example configures the CloudWatch agent to stop exporting metrics for the Hadoop NameNode service.


[
  {
    "Classification": "emr-metrics",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "emr-hadoop-hdfs-namenode-metrics",
        "Properties": {},
        "Configurations": []
      }
    ]
  }
]

The following example configures a cluster to export Hadoop application metrics.


[
  {
    "Classification": "emr-metrics",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "emr-hadoop-hdfs-namenode-metrics",
        "Properties": {
          "Hadoop:service=NameNode,name=FSNamesystem": "BlockCapacity,CapacityUsedGB",
          "otel.metric.export.interval": "20000" 
        },
        "Configurations": []
      },
      {
        "Classification": "emr-hadoop-hdfs-datanode-metrics",
        "Properties": {
          "Hadoop:service=DataNode,name=JvmMetrics": "MemNonHeapUsedM",
          "otel.metric.export.interval": "30000" 
        },
        "Configurations": []
      },
      {
        "Classification": "emr-hadoop-yarn-resourcemanager-metrics",
        "Properties": {
          "Hadoop:service=ResourceManager,name=CapacitySchedulerMetrics": "AllocateNumOps,NodeUpdateNumOps"
        },
        "Configurations": []
      }
    ]
  }
]

The previous example has the following properties:

Every 20 seconds, the agent collects the BlockCapacity and CapacityUsedGB metrics from instances that run the Hadoop NameNode service.
Every 30 seconds, the agent collects the MemNonHeapUsedM metric from instances that run the Hadoop DataNode service.
Every 60 seconds, the agent collects the AllocateNumOps and NodeUpdateNumOps metrics from instances that run the Hadoop YARN ResourceManager. This classification doesn't set otel.metric.export.interval, so the default of 60000 milliseconds applies.
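
To check an application-metrics configuration before you submit it, you can separate the MBean entries from the interval property and convert the interval to seconds. The following sketch does that for the three classifications in the example. The `summarize` function is a hypothetical helper, and it assumes that the documented default of "60000" milliseconds applies when a classification omits `otel.metric.export.interval`.

```python
INTERVAL_KEY = "otel.metric.export.interval"
DEFAULT_INTERVAL_MS = "60000"  # default listed in the properties tables

# The three classifications from the example above.
EXAMPLE = [
    {"Classification": "emr-hadoop-hdfs-namenode-metrics", "Properties": {
        "Hadoop:service=NameNode,name=FSNamesystem": "BlockCapacity,CapacityUsedGB",
        "otel.metric.export.interval": "20000"}},
    {"Classification": "emr-hadoop-hdfs-datanode-metrics", "Properties": {
        "Hadoop:service=DataNode,name=JvmMetrics": "MemNonHeapUsedM",
        "otel.metric.export.interval": "30000"}},
    {"Classification": "emr-hadoop-yarn-resourcemanager-metrics", "Properties": {
        "Hadoop:service=ResourceManager,name=CapacitySchedulerMetrics":
            "AllocateNumOps,NodeUpdateNumOps"}},
]


def summarize(classification):
    """Return the export interval in seconds and the metrics per MBean."""
    props = classification.get("Properties", {})
    interval_ms = int(props.get(INTERVAL_KEY, DEFAULT_INTERVAL_MS))
    beans = {
        bean: metrics.split(",")
        for bean, metrics in props.items()
        if bean != INTERVAL_KEY  # every other key is an MBean name
    }
    return {"interval_seconds": interval_ms / 1000, "beans": beans}


if __name__ == "__main__":
    for item in EXAMPLE:
        summary = summarize(item)
        print(item["Classification"], summary["interval_seconds"], summary["beans"])
```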

Amazon Managed Service for Prometheus example

The following example demonstrates how to configure the CloudWatch agent to export metrics to Amazon Managed Service for Prometheus.

If you are currently exporting metrics to Amazon Managed Service for Prometheus and want to reconfigure the metrics for the cluster and continue exporting metrics to Amazon Managed Service for Prometheus, you must include the properties metrics_destination and prometheus_endpoint.


[
  {
    "Classification": "emr-metrics",
    "Properties": {
      "metrics_destination": "prometheus",
      "prometheus_endpoint": "https://aps-workspaces.<region>.amazonaws.com/workspaces/<workspace_id>/api/v1/remote_write"
    },
    "Configurations": []
  }
]

To use the CloudWatch agent to export metrics to CloudWatch, use the following example.


[
  {
    "Classification": "emr-metrics",
    "Properties": {
      "metrics_destination": "cloudwatch"
    },
    "Configurations": []
  }
]

Note

The CloudWatch agent has a Prometheus exporter that renames certain attributes. For the default metrics labels, Amazon Managed Service for Prometheus uses underscore characters in place of the periods that Amazon CloudWatch uses. If you use Amazon Managed Grafana to visualize the default metrics in Amazon Managed Service for Prometheus, the labels appear as cluster_id, instance_id, node_type, and service_name.
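
For the four default labels named in the note above, the renaming amounts to replacing periods with underscores. The following sketch shows that mapping so you can translate CloudWatch dimension names when you write queries or dashboards. The `prometheus_label` function is a hypothetical helper and covers only these default labels, not every attribute that the exporter might rename.

```python
# Default dimension names as they appear in Amazon CloudWatch.
CLOUDWATCH_DIMENSIONS = ["cluster.id", "instance.id", "node.type", "service.name"]


def prometheus_label(cloudwatch_dimension):
    """Map a default CloudWatch dimension name to its Prometheus label."""
    return cloudwatch_dimension.replace(".", "_")


if __name__ == "__main__":
    for dimension in CLOUDWATCH_DIMENSIONS:
        print(dimension, "->", prometheus_label(dimension))
```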
