Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Availability Zones and Regions

When you launch a cluster, you can specify a region, an Availability Zone (by way of an EC2 subnet), and a bid price. The Bid Price tooltip in the console displays the Spot Price for each Availability Zone. Unless you choose an Availability Zone yourself, Amazon EMR chooses one in which the bid price you specified is accepted, and launches all of the instance groups there. To choose the Availability Zone into which your cluster launches, select an EC2 subnet. If you select a subnet, the Bid Price tooltip underlines the row showing the corresponding Availability Zone and its Spot Price.

Because of fluctuating Spot Prices between Availability Zones, selecting the Availability Zone with the lowest initial price might not result in the lowest price for the life of the cluster. For optimal results, you should study the history of Availability Zone pricing before choosing the Availability Zone for your cluster.
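As a rough illustration of why price history matters, the following Python sketch compares the Availability Zone that is cheapest at the first observation with the one that is cheapest on average. The function name `summarize_spot_history`, the zone names, and all prices are hypothetical values chosen for illustration; they are not real Spot Price data, and this is only one simple way to summarize a price history.

```python
from collections import defaultdict
from statistics import mean


def summarize_spot_history(records):
    """Return {zone: (first_price, average_price)} for (zone, price) records.

    Records are assumed to be listed in time order, oldest first.
    """
    prices_by_zone = defaultdict(list)
    for zone, price in records:
        prices_by_zone[zone].append(price)
    return {
        zone: (prices[0], mean(prices))
        for zone, prices in prices_by_zone.items()
    }


# Hypothetical observations (zone, price in USD per hour), oldest first.
history = [
    ("us-east-1a", 0.030), ("us-east-1b", 0.040),
    ("us-east-1a", 0.090), ("us-east-1b", 0.042),
    ("us-east-1a", 0.120), ("us-east-1b", 0.041),
]

summary = summarize_spot_history(history)

# Zone with the lowest initial price versus the lowest average price.
cheapest_at_launch = min(summary, key=lambda zone: summary[zone][0])
cheapest_on_average = min(summary, key=lambda zone: summary[zone][1])

print("Lowest initial price:", cheapest_at_launch)
print("Lowest average price:", cheapest_on_average)
```

In this made-up data set, the zone with the lowest initial price is not the zone with the lowest average price, which is the situation the paragraph above warns about.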

Note

Because Amazon EMR selects the Availability Zone based on the free capacity of the Amazon EC2 instance type you specified for the core instance group, your cluster may end up in an Availability Zone with less capacity in other EC2 instance types. For example, if you launch your core instance group as Large and the master instance group as Extra Large, you may launch into an Availability Zone with insufficient unused Extra Large capacity to fulfill a Spot Instance request for your master node. If you run into this situation, you can launch the master instance group as on-demand, even if you launch the core instance group as Spot Instances.
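The mixed configuration described in this note can be sketched as a pair of instance group definitions. The Python below only builds plain dictionaries and does not call any AWS service. The field names (`InstanceRole`, `InstanceType`, `InstanceCount`, `Market`, `BidPrice`) are modeled on the instance group configuration accepted by the Amazon EMR `RunJobFlow` action; the helper `instance_group`, the instance types, the counts, and the bid price are assumptions made for illustration.

```python
def instance_group(role, instance_type, count, bid_price=None):
    """Build one instance group definition as a plain dictionary.

    The group is on-demand unless a bid price is supplied, in which
    case it is marked as a Spot Instance group.
    """
    group = {
        "InstanceRole": role,
        "InstanceType": instance_type,
        "InstanceCount": count,
        "Market": "ON_DEMAND" if bid_price is None else "SPOT",
    }
    if bid_price is not None:
        group["BidPrice"] = bid_price
    return group


# Master node as on-demand Extra Large; core nodes as Spot Large.
# Instance types, counts, and the bid price are hypothetical.
instance_groups = [
    instance_group("MASTER", "m1.xlarge", 1),
    instance_group("CORE", "m1.large", 4, bid_price="0.08"),
]

for group in instance_groups:
    print(group)
```

Keeping the master node on-demand means that only the core group depends on Spot capacity in the selected Availability Zone.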

All instance groups in a cluster are launched into a single Availability Zone, regardless of whether they are on-demand or Spot Instances. A single Availability Zone is used because the additional data transfer costs and performance overhead make running instance groups in multiple Availability Zones undesirable.