Amazon EMR uses an Amazon Machine Image (AMI) to install Linux, Hadoop, and other software on the virtual servers that it launches in the cluster. Amazon EMR regularly releases new AMI versions that add features and fix issues. We recommend that you launch your cluster with the latest AMI whenever possible. When you launch a cluster from the console, the latest AMI version is the default.
The AWS version of Hadoop installed by Amazon EMR is based on Apache Hadoop, with patches and improvements that make it work efficiently with AWS. Each Amazon EMR AMI has a default version of Hadoop associated with it. If your application requires a version of Hadoop other than the default, specify that version when you launch the cluster.
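The following minimal sketch shows one way to pin the AMI version, and optionally a non-default Hadoop version, when you launch a cluster programmatically. It assumes the AWS SDK for Python (boto3) and its EMR `run_job_flow` operation, which accepts `AmiVersion` and `Instances.HadoopVersion`. The helper name `build_cluster_request`, the cluster name, the instance settings, and the version strings are illustrative placeholders and do not come from this guide. Check the AMI documentation for the version combinations that are valid for your account. A real request usually needs additional fields, such as IAM roles.

```python
from typing import Any, Dict, Optional


def build_cluster_request(
    name: str,
    ami_version: str,
    hadoop_version: Optional[str] = None,
    instance_type: str = "m1.large",
    instance_count: int = 3,
) -> Dict[str, Any]:
    """Build keyword arguments for an EMR run_job_flow call (hypothetical helper)."""
    instances: Dict[str, Any] = {
        "MasterInstanceType": instance_type,
        "SlaveInstanceType": instance_type,
        "InstanceCount": instance_count,
    }
    # Set HadoopVersion only when the application needs something other
    # than the default Hadoop version associated with the chosen AMI.
    if hadoop_version is not None:
        instances["HadoopVersion"] = hadoop_version
    return {"Name": name, "AmiVersion": ami_version, "Instances": instances}


# Placeholder values for illustration only.
request = build_cluster_request("example-cluster", "3.11.0", hadoop_version="2.4.0")
print(request)

# To submit the request (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("emr").run_job_flow(**request)
```

Building the request as a plain dictionary keeps the version choices visible and lets you review them before anything is launched.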
In addition to the standard software installed on the cluster, you can use bootstrap actions to install additional software and to change the configuration of applications on the cluster. Bootstrap actions are scripts that run on the virtual servers when Amazon EMR launches the cluster. You can write custom bootstrap actions or use the predefined bootstrap actions that Amazon EMR provides. A common use of bootstrap actions is to change Hadoop configuration settings.
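As a rough illustration, a bootstrap action can be described as a script location plus optional arguments. The sketch below assumes the AWS SDK for Python (boto3), whose EMR `run_job_flow` operation accepts a `BootstrapActions` list of entries with `Name` and `ScriptBootstrapAction` (`Path`, `Args`). The helper name `bootstrap_action`, the S3 path, and the script arguments are hypothetical and stand in for your own script.

```python
from typing import Any, Dict, List, Optional


def bootstrap_action(
    name: str,
    script_path: str,
    args: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Describe one bootstrap action in the shape run_job_flow expects (hypothetical helper)."""
    return {
        "Name": name,
        "ScriptBootstrapAction": {
            "Path": script_path,       # S3 location of the script to run
            "Args": list(args or []),  # arguments passed to the script
        },
    }


# Hypothetical custom script; replace the path and arguments with your own.
actions = [
    bootstrap_action(
        "Install extra tools",
        "s3://my-bucket/bootstrap/install-tools.sh",
        ["--with-monitoring"],
    )
]
print(actions)

# The list would be passed with the rest of the launch request, for example:
#   boto3.client("emr").run_job_flow(..., BootstrapActions=actions)
```

A predefined bootstrap action provided by Amazon EMR can be described the same way by pointing `script_path` at its documented location.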
For more information, see the following topics: