The versions of Hive and Pig installed on your cluster depend on the version of
Hadoop installed on the cluster. For Hadoop version 1.0.3, Hive version 0.8.1 and
Pig version 0.9.2 are used. For Hadoop version 0.20.205, Hive version 0.7.1 and Pig
version 0.9.1 are used. For Hadoop version 0.20, Hive version 0.5 and Pig version 0.6
are used. For Hadoop version 0.18, Hive version 0.4 and Pig version 0.3 are used. You
select the Hadoop version by setting HadoopVersion in
JobFlowInstancesConfig.
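The pairings listed above can be captured as a small lookup table. The following is only an illustrative sketch: the names BUNDLED_VERSIONS and bundled_versions are hypothetical and are not part of any Amazon EMR library.

```python
# Hadoop version -> Hive and Pig versions installed with it, as listed above.
# BUNDLED_VERSIONS and bundled_versions are hypothetical names for illustration.
BUNDLED_VERSIONS = {
    "1.0.3": {"hive": "0.8.1", "pig": "0.9.2"},
    "0.20.205": {"hive": "0.7.1", "pig": "0.9.1"},
    "0.20": {"hive": "0.5", "pig": "0.6"},
    "0.18": {"hive": "0.4", "pig": "0.3"},
}


def bundled_versions(hadoop_version):
    """Return the Hive and Pig versions paired with the given Hadoop version."""
    try:
        return BUNDLED_VERSIONS[hadoop_version]
    except KeyError:
        raise ValueError("unsupported Hadoop version: %r" % hadoop_version)
```

For example, bundled_versions("0.20.205") returns the Hive 0.7.1 and Pig 0.9.1 pairing, and a Hadoop version that is not listed raises ValueError.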
The Amazon EMR console supports Hadoop 1.0.3 with Hive 0.8.1 and Pig 0.9.2.
The default version of Hadoop for the Amazon EMR console and the command line interface is Hadoop 1.0.3 with Hive 0.8.1 and Pig 0.9.2. You can continue running Hadoop 0.18 with Hive 0.4 for the remainder of the Hadoop 0.18 lifecycle. Additional versions of Hive are available from the command line interface through Hive versioning. For more information, go to Supported Hive Versions.
For all clusters run from the Amazon EMR APIs or Java SDK, the default version of Hadoop is 0.18 with Hive 0.4 and Pig 0.3. This is to maintain compatibility with existing libraries and systems. You can continue running Hadoop 0.18 with Hive 0.4 and Pig 0.3 from the Amazon EMR API or Java SDK for the remainder of the Hadoop 0.18 lifecycle, but you should consider upgrading as soon as possible to take advantage of the features and performance improvements found in Hadoop 1.0.3, Hive 0.8.1, and Pig 0.9.2.
For more information, see Default AMI and Hadoop Versions.
You can choose to continue running Hadoop 0.18 with Hive 0.4 from either the command line interface or
the Amazon EMR API by setting the HadoopVersion parameter in the
RunJobFlow request. This parameter accepts the values 0.18,
0.20, 0.20.205, and 1.0.3.
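As a rough sketch of where the parameter sits, the following builds a partial RunJobFlow request body with HadoopVersion placed inside the Instances (JobFlowInstancesConfig) structure and rejects values outside the accepted list. The helper name build_run_job_flow_request is hypothetical, and a real request needs further fields (instance types, instance counts, and so on) that are omitted here.

```python
# Values accepted by the HadoopVersion parameter, as listed above.
SUPPORTED_HADOOP_VERSIONS = ("0.18", "0.20", "0.20.205", "1.0.3")


def build_run_job_flow_request(name, hadoop_version):
    """Build a partial RunJobFlow request body (hypothetical helper).

    Only the fields relevant to version selection are shown; this is
    not a complete request.
    """
    if hadoop_version not in SUPPORTED_HADOOP_VERSIONS:
        raise ValueError("unsupported Hadoop version: %r" % hadoop_version)
    return {
        "Name": name,
        # Instances corresponds to JobFlowInstancesConfig.
        "Instances": {"HadoopVersion": hadoop_version},
    }
```

Validating the value on the client side is a design choice in this sketch: it surfaces a mistyped version before any request is sent.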
We have regenerated the client libraries to support the new API. Old clients and libraries continue to default to Hadoop 0.18. If you
update to the new clients and want to run Hadoop 0.18, you must explicitly specify version 0.18 in your requests.
The CLI runs Hadoop 1.0.3 by default. To run Hadoop version 0.18, you can either use an earlier
version of the Ruby client or specify –HadoopVersion=0.18 when creating the cluster. As with
other options in the command line client, you can also set the –HadoopVersion
parameter in your .credentials file.
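The default behavior described on this page can be summarized in a few lines of code. This is a minimal sketch under stated assumptions: the interface labels ("console", "cli", "api", "sdk") and the function name effective_hadoop_version are hypothetical, and the sketch models only the documented defaults, not how any client actually resolves them.

```python
# Default Hadoop version per interface, as described on this page.
# The interface labels are hypothetical names chosen for this sketch.
DEFAULT_HADOOP_VERSION = {
    "console": "1.0.3",
    "cli": "1.0.3",
    "api": "0.18",
    "sdk": "0.18",
}


def effective_hadoop_version(interface, requested=None):
    """Return the Hadoop version a cluster would run.

    An explicitly requested version always wins; otherwise the
    default for the given interface applies.
    """
    if requested is not None:
        return requested
    return DEFAULT_HADOOP_VERSION[interface]
```

Under these assumptions, effective_hadoop_version("cli") yields 1.0.3, while effective_hadoop_version("cli", "0.18") models passing an explicit version to keep running Hadoop 0.18.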