Launch an Amazon EMR Cluster with multiple primary nodes
This topic provides configuration details and examples for launching an Amazon EMR cluster with multiple primary nodes.
Note
Amazon EMR automatically enables termination protection for all clusters that have multiple primary nodes, and overrides any auto-termination settings that you supply when you create the cluster. To shut down a cluster with multiple primary nodes, you must first modify the cluster attributes to disable termination protection. For instructions, see Terminate an Amazon EMR Cluster with multiple primary nodes.
Prerequisites
- You can launch an Amazon EMR cluster with multiple primary nodes in both public and private VPC subnets. EC2-Classic is not supported. To launch an Amazon EMR cluster with multiple primary nodes in a public subnet, you must enable the instances in this subnet to receive a public IP address by selecting Auto-assign IPv4 in the console or running the following command. Replace 22XXXX01 with your subnet ID.
  aws ec2 modify-subnet-attribute --subnet-id subnet-22XXXX01 --map-public-ip-on-launch
- To run Hive, Hue, or Oozie on an Amazon EMR cluster with multiple primary nodes, you must create an external metastore. For more information, see Configuring an external metastore for Hive, Using Hue with a remote database in Amazon RDS, or Apache Oozie.
- To use Kerberos authentication in your cluster, you must configure an external KDC. For more information, see Configuring Kerberos on Amazon EMR.
Launch an Amazon EMR Cluster with multiple primary nodes
You can launch a cluster with multiple primary nodes when you use instance groups or instance fleets. When you use instance groups with multiple primary nodes, you must specify an instance count value of 3 for the primary node instance group. When you use instance fleets with multiple primary nodes, you must specify a TargetOnDemandCapacity of 3 and a TargetSpotCapacity of 0 for the primary instance fleet, and a WeightedCapacity of 1 for each instance type that you configure for the primary fleet.
The following examples demonstrate how to launch the cluster using the default AMI or a custom AMI with both instance groups and instance fleets:
Note
You must specify the subnet ID when you launch an Amazon EMR cluster with multiple primary nodes using the AWS CLI. Replace 22XXXX01 and 22XXXX02 with your subnet ID in the following examples.
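As a rough sketch of the settings described in this section, the following shell script assembles two `aws emr create-cluster` calls: one with instance groups (primary instance count of 3) and one with instance fleets (TargetOnDemandCapacity of 3, TargetSpotCapacity of 0, WeightedCapacity of 1). The cluster name, release label, instance type, key pair name, role names, core node capacity, and application list are illustrative assumptions, not values taken from this topic. The helper names (`run`, `launch_with_instance_groups`, `launch_with_instance_fleets`) and the `DRY_RUN` switch are hypothetical conveniences; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. Every value below is a placeholder or an assumption.
SUBNET_1="subnet-22XXXX01"      # replace with your subnet ID
SUBNET_2="subnet-22XXXX02"      # second subnet, used only by the fleet example
RELEASE_LABEL="emr-6.15.0"      # assumed release label
INSTANCE_TYPE="m5.xlarge"       # assumed instance type
KEY_NAME="ec2_key_pair_name"    # assumed EC2 key pair name

# Hypothetical helper: print the command when DRY_RUN is 1 (the default),
# execute it otherwise. The printed form drops the shell quoting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Instance groups: the primary group must have an instance count of 3.
launch_with_instance_groups() {
  run aws emr create-cluster \
    --name "ha-cluster" \
    --release-label "$RELEASE_LABEL" \
    --instance-groups \
      "InstanceGroupType=MASTER,InstanceCount=3,InstanceType=$INSTANCE_TYPE" \
      "InstanceGroupType=CORE,InstanceCount=4,InstanceType=$INSTANCE_TYPE" \
    --ec2-attributes "KeyName=$KEY_NAME,InstanceProfile=EMR_EC2_DefaultRole,SubnetId=$SUBNET_1" \
    --service-role EMR_DefaultRole \
    --applications Name=Hadoop Name=Spark
}

# Instance fleets: the primary fleet needs TargetOnDemandCapacity=3,
# TargetSpotCapacity=0, and WeightedCapacity=1 for each instance type.
launch_with_instance_fleets() {
  run aws emr create-cluster \
    --name "ha-cluster" \
    --release-label "$RELEASE_LABEL" \
    --instance-fleets \
      "InstanceFleetType=MASTER,TargetOnDemandCapacity=3,TargetSpotCapacity=0,InstanceTypeConfigs=[{InstanceType=$INSTANCE_TYPE,WeightedCapacity=1}]" \
      "InstanceFleetType=CORE,TargetOnDemandCapacity=4,TargetSpotCapacity=0,InstanceTypeConfigs=[{InstanceType=$INSTANCE_TYPE,WeightedCapacity=1}]" \
    --ec2-attributes "KeyName=$KEY_NAME,InstanceProfile=EMR_EC2_DefaultRole,SubnetIds=[$SUBNET_1,$SUBNET_2]" \
    --service-role EMR_DefaultRole \
    --applications Name=Hadoop Name=Spark
}

launch_with_instance_groups
launch_with_instance_fleets
```

Running the script with `DRY_RUN=0` would submit both requests to your account, so review the printed commands first. For a custom AMI, the `--custom-ami-id` option of `create-cluster` could be added to either call; the AMI ID itself is something you would supply.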
Terminate an Amazon EMR Cluster with multiple primary nodes
To terminate an Amazon EMR cluster with multiple primary nodes, you must disable termination protection before terminating the cluster, as the following example demonstrates. Replace j-3KVTXXXXXX7UG with your cluster ID.
aws emr modify-cluster-attributes --cluster-id j-3KVTXXXXXX7UG --no-termination-protected
aws emr terminate-clusters --cluster-ids j-3KVTXXXXXX7UG
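The two-step shutdown above could be wrapped in a small script that also confirms termination protection is off before it requests termination. This is a minimal sketch, not an official procedure: the function name `terminate_ha_cluster` and the `AWS_CMD` override variable are hypothetical, and the check assumes that `describe-cluster` with `--output text` reports the `TerminationProtected` field as `True` or `False`.

```shell
#!/bin/sh
# Hypothetical override so the AWS CLI binary can be swapped out (for example in tests).
AWS_CMD="${AWS_CMD:-aws}"

# Disable termination protection, verify the change, then terminate the cluster.
terminate_ha_cluster() {
  cluster_id="$1"

  # Step 1: turn off termination protection.
  $AWS_CMD emr modify-cluster-attributes \
    --cluster-id "$cluster_id" --no-termination-protected || return 1

  # Step 2: confirm that the cluster now reports protection as disabled.
  protected="$($AWS_CMD emr describe-cluster --cluster-id "$cluster_id" \
    --query 'Cluster.TerminationProtected' --output text)"
  if [ "$protected" != "False" ]; then
    echo "Termination protection is still enabled for $cluster_id" >&2
    return 1
  fi

  # Step 3: request termination.
  $AWS_CMD emr terminate-clusters --cluster-ids "$cluster_id"
}

# Example call (replace the ID with your cluster ID):
# terminate_ha_cluster j-3KVTXXXXXX7UG
```

The function stops with a non-zero status if protection cannot be disabled, so the terminate request is sent only after the check in step 2 passes.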