Amazon EMR
Developer Guide

How Does Amazon EMR Hadoop Differ from Apache Hadoop?

This documentation is for AMI versions 2.x and 3.x of Amazon EMR. For information about Amazon EMR releases 4.0.0 and above, see the Amazon EMR Release Guide. For information about managing the Amazon EMR service in 4.x releases, see the Amazon EMR Management Guide.

The version of Hadoop that is installed when you launch an Amazon EMR cluster is based on Apache Hadoop, with several patches and improvements added so that it works efficiently on AWS. Where appropriate, the Amazon EMR team has submitted its improvements to the Apache Hadoop code base. For more information about the patches applied to AWS Hadoop, see Hadoop Patches Applied in Amazon EMR.