Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Hadoop Memory-Intensive Configuration Settings (Legacy AMI 1.0.1 and earlier)

Note

The memory-intensive settings are set by default in AMI 2.0.0 and later. You should only need to adjust these settings for AMI versions 1.0.1 and earlier.

The Amazon EMR default configuration settings are appropriate for most workloads. However, based on your cluster’s specific memory and processing requirements, you might want to modify the configuration settings.

For example, if your cluster's tasks are memory-intensive, you can run fewer tasks per core node and reduce the JobTracker heap size. A predefined bootstrap action is available to apply these settings when the cluster starts.
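To make the kind of setting involved concrete, the following is a minimal sketch that renders the task-related properties for one instance type in the standard Hadoop configuration XML layout (`configuration` / `property` / `name` / `value`). The values are the m1.small recommendations listed later in this section. The helper name `render_site_config` is hypothetical, and this is not the predefined bootstrap action itself; it only illustrates the property format such settings end up in.

```python
import xml.etree.ElementTree as ET

# Values copied from the m1.small table later in this section.
M1_SMALL_PROPERTIES = {
    "mapred.child.java.opts": "-Xmx512m",
    "mapred.tasktracker.map.tasks.maximum": "2",
    "mapred.tasktracker.reduce.tasks.maximum": "1",
}


def render_site_config(properties):
    """Render name/value pairs as a Hadoop <configuration> XML string.

    Hypothetical helper for illustration only; it does not modify a cluster.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_config(M1_SMALL_PROPERTIES))
```

Lowering the two `tasks.maximum` values is what "fewer tasks per node" means in practice: each value caps how many map or reduce child JVMs a TaskTracker runs at once.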

The following tables list the recommended configuration settings for each EC2 instance type. Heap size values (the HADOOP_*_HEAPSIZE parameters) are in megabytes. The default configurations for the cc2.8xlarge, hi1.4xlarge, hs1.8xlarge, and cg1.4xlarge instances are sufficient for memory-intensive workloads; therefore, the recommended configuration settings for these instances are not listed.

m1.small

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 512
HADOOP_NAMENODE_HEAPSIZE 512
HADOOP_TASKTRACKER_HEAPSIZE 256
HADOOP_DATANODE_HEAPSIZE 128
mapred.child.java.opts -Xmx512m
mapred.tasktracker.map.tasks.maximum 2
mapred.tasktracker.reduce.tasks.maximum 1

m1.medium

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 1536
HADOOP_NAMENODE_HEAPSIZE 512
HADOOP_TASKTRACKER_HEAPSIZE 256
HADOOP_DATANODE_HEAPSIZE 256
mapred.child.java.opts -Xmx768m
mapred.tasktracker.map.tasks.maximum 2
mapred.tasktracker.reduce.tasks.maximum 1

m1.large

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 3072
HADOOP_NAMENODE_HEAPSIZE 1024
HADOOP_TASKTRACKER_HEAPSIZE 512
HADOOP_DATANODE_HEAPSIZE 512
mapred.child.java.opts -Xmx1024m
mapred.tasktracker.map.tasks.maximum 3
mapred.tasktracker.reduce.tasks.maximum 1

m1.xlarge

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 9216
HADOOP_NAMENODE_HEAPSIZE 3072
HADOOP_TASKTRACKER_HEAPSIZE 512
HADOOP_DATANODE_HEAPSIZE 512
mapred.child.java.opts -Xmx1024m
mapred.tasktracker.map.tasks.maximum 8
mapred.tasktracker.reduce.tasks.maximum 3

c1.medium

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 768
HADOOP_NAMENODE_HEAPSIZE 512
HADOOP_TASKTRACKER_HEAPSIZE 256
HADOOP_DATANODE_HEAPSIZE 128
mapred.child.java.opts -Xmx512m
mapred.tasktracker.map.tasks.maximum 2
mapred.tasktracker.reduce.tasks.maximum 1

c1.xlarge

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 2048
HADOOP_NAMENODE_HEAPSIZE 1024
HADOOP_TASKTRACKER_HEAPSIZE 512
HADOOP_DATANODE_HEAPSIZE 512
mapred.child.java.opts -Xmx512m
mapred.tasktracker.map.tasks.maximum 7
mapred.tasktracker.reduce.tasks.maximum 2

m2.xlarge

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 4096
HADOOP_NAMENODE_HEAPSIZE 2048
HADOOP_TASKTRACKER_HEAPSIZE 512
HADOOP_DATANODE_HEAPSIZE 512
mapred.child.java.opts -Xmx3072m
mapred.tasktracker.map.tasks.maximum 3
mapred.tasktracker.reduce.tasks.maximum 1

m2.2xlarge

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 8192
HADOOP_NAMENODE_HEAPSIZE 4096
HADOOP_TASKTRACKER_HEAPSIZE 1024
HADOOP_DATANODE_HEAPSIZE 1024
mapred.child.java.opts -Xmx4096m
mapred.tasktracker.map.tasks.maximum 6
mapred.tasktracker.reduce.tasks.maximum 2

m2.4xlarge

Parameter Value
HADOOP_JOBTRACKER_HEAPSIZE 8192
HADOOP_NAMENODE_HEAPSIZE 8192
HADOOP_TASKTRACKER_HEAPSIZE 1024
HADOOP_DATANODE_HEAPSIZE 1024
mapred.child.java.opts -Xmx4096m
mapred.tasktracker.map.tasks.maximum 14
mapred.tasktracker.reduce.tasks.maximum 4
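
The settings in each table interact: the child JVM heap limit applies to every map and reduce slot, so the slot counts multiply it. The sketch below estimates an upper bound on configured heap for a node that runs both a TaskTracker and a DataNode. It treats the heap-size values as megabytes and assumes every slot is occupied and every child JVM grows to its -Xmx limit. The names `SETTINGS`, `child_heap_mb`, and `peak_heap_mb` are hypothetical, and the figure covers JVM heap only, not total process memory.

```python
import re

# Per-instance settings copied from the tables above:
# (mapred.child.java.opts, map slots, reduce slots,
#  HADOOP_TASKTRACKER_HEAPSIZE, HADOOP_DATANODE_HEAPSIZE)
SETTINGS = {
    "m1.small":   ("-Xmx512m",  2,  1, 256,  128),
    "m1.medium":  ("-Xmx768m",  2,  1, 256,  256),
    "m1.large":   ("-Xmx1024m", 3,  1, 512,  512),
    "m1.xlarge":  ("-Xmx1024m", 8,  3, 512,  512),
    "c1.medium":  ("-Xmx512m",  2,  1, 256,  128),
    "c1.xlarge":  ("-Xmx512m",  7,  2, 512,  512),
    "m2.xlarge":  ("-Xmx3072m", 3,  1, 512,  512),
    "m2.2xlarge": ("-Xmx4096m", 6,  2, 1024, 1024),
    "m2.4xlarge": ("-Xmx4096m", 14, 4, 1024, 1024),
}


def child_heap_mb(java_opts):
    """Extract the -Xmx value in MB from an options string such as '-Xmx512m'."""
    match = re.search(r"-Xmx(\d+)m\b", java_opts)
    if match is None:
        raise ValueError(f"no -Xmx<N>m option in {java_opts!r}")
    return int(match.group(1))


def peak_heap_mb(instance_type):
    """Upper bound on configured heap (MB): all task slots plus both daemons."""
    opts, map_slots, reduce_slots, tasktracker, datanode = SETTINGS[instance_type]
    task_heap = (map_slots + reduce_slots) * child_heap_mb(opts)
    return task_heap + tasktracker + datanode


if __name__ == "__main__":
    for name in SETTINGS:
        print(f"{name:12s} {peak_heap_mb(name):6d} MB")
```

For m1.small, for instance, three slots at 512 MB each plus the 256 MB TaskTracker and 128 MB DataNode heaps give 1920 MB. Reducing either slot count or the -Xmx value lowers that bound, which is the trade-off the recommended settings make.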