Amazon EMR
Developer Guide


Amazon EMR defines three roles for the servers in a cluster. These different roles are referred to as node types. The Amazon EMR node types map to the master and slave roles defined in Hadoop.

  • Master node: Manages the cluster by coordinating the distribution of the MapReduce executable and subsets of the raw data to the core and task instance groups. It also tracks the status of each task and monitors the health of the instance groups. A cluster has only one master node. This maps to the Hadoop master node.

  • Core nodes: Run tasks and store data using the Hadoop Distributed File System (HDFS). These map to Hadoop slave nodes.

  • Task nodes (optional): Run tasks but do not store data in HDFS. These map to Hadoop slave nodes.
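
As an illustration of how the three node types fit together, the following Python sketch builds one instance group definition per node type. The helper name build_instance_groups, the group names, and the m5.xlarge instance type are assumptions chosen for this example and do not come from this guide; the InstanceRole values MASTER, CORE, and TASK are the role names the EMR API uses. The sketch only builds plain dictionaries and makes no AWS calls.

```python
from typing import Dict, List


def build_instance_groups(core_count: int, task_count: int = 0,
                          instance_type: str = "m5.xlarge") -> List[Dict]:
    """Return one instance group per node type (hypothetical helper)."""
    if core_count < 1:
        # Assumption made by this sketch: at least one core node for HDFS.
        raise ValueError("this sketch assumes at least one core node")
    if task_count < 0:
        raise ValueError("task_count cannot be negative")

    groups = [
        # Exactly one master node manages the cluster.
        {"Name": "Master", "InstanceRole": "MASTER",
         "InstanceType": instance_type, "InstanceCount": 1},
        # Core nodes run tasks and store data in HDFS.
        {"Name": "Core", "InstanceRole": "CORE",
         "InstanceType": instance_type, "InstanceCount": core_count},
    ]
    if task_count > 0:
        # Task nodes are optional; they run tasks only.
        groups.append({"Name": "Task", "InstanceRole": "TASK",
                       "InstanceType": instance_type,
                       "InstanceCount": task_count})
    return groups


groups = build_instance_groups(core_count=2, task_count=3)
roles = [g["InstanceRole"] for g in groups]
```

A list shaped like this could, for example, be supplied as the InstanceGroups entry of the Instances parameter when creating a cluster through an SDK; the exact call and any additional required settings depend on the SDK and release in use.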

For more information, see Create a Cluster with Instance Fleets or Uniform Instance Groups. For details on mapping legacy clusters to instance groups, see Mapping Legacy Clusters to Instance Groups.