Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Plan an Amazon EMR Cluster

This documentation is for AMI versions 2.x and 3.x of Amazon EMR. See the Amazon EMR Release Guide for information about Amazon EMR releases 4.0.0 and above. For information about managing the Amazon EMR service in 4.x releases, see the Amazon EMR Management Guide.

This section explains the configuration options for launching Amazon Elastic MapReduce (Amazon EMR) clusters. Before you launch a cluster, review this information and choose the cluster options that fit your data processing needs. The options that you choose depend on factors such as the following:

  • The type of source data that you want to process

  • The amount of source data and how you want to store it

  • The acceptable duration and frequency of processing source data

  • The network configuration and access control requirements for cluster connectivity

  • The metrics for monitoring cluster activities, performance, and health

  • The software that you choose to install in your cluster to process and analyze data

  • The cost to run clusters based on the options that you choose

Although some configuration steps are optional, we recommend that you review all of them so that you understand the available options and can plan your cluster accordingly.
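
To make the planning factors above concrete, you can record them in a simple checklist before you launch a cluster. The following Python sketch is illustrative only: the `ClusterPlan` class, its field names, and the hourly rate are hypothetical placeholders, not Amazon EMR API parameters or actual prices. It shows how the cluster size, run duration, and run frequency might combine into a rough monthly cost estimate.

```python
from dataclasses import dataclass


@dataclass
class ClusterPlan:
    """Hypothetical planning record; fields are illustrative, not EMR API parameters."""
    input_format: str        # type of source data to process
    input_size_gb: float     # amount of source data
    storage: str             # where the source data is stored
    run_hours: float         # acceptable duration of one processing run
    runs_per_month: int      # how often the data is processed
    instance_count: int      # number of instances in the cluster
    hourly_rate_usd: float   # assumed price per instance-hour (placeholder value)

    def monthly_cost_usd(self) -> float:
        # Rough estimate: instances x rate x hours per run x runs per month.
        return (self.instance_count * self.hourly_rate_usd
                * self.run_hours * self.runs_per_month)


# Example values are made up for illustration.
plan = ClusterPlan(
    input_format="text logs",
    input_size_gb=500.0,
    storage="Amazon S3",
    run_hours=2.0,
    runs_per_month=30,
    instance_count=10,
    hourly_rate_usd=0.25,
)
print(f"Estimated monthly cost: ${plan.monthly_cost_usd():.2f}")
# prints "Estimated monthly cost: $150.00"
```

A real estimate would also account for storage, data transfer, and the pricing model that you choose, so treat this figure only as a starting point for comparing options.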