Amazon Elastic MapReduce
Developer Guide

Getting Started

This documentation is for AMI versions 2.x and 3.x of Amazon EMR. For information about Amazon EMR releases 4.0.0 and above, see the Amazon EMR Release Guide. For information about managing the Amazon EMR service in 4.x releases, see the Amazon EMR Management Guide.

In this tutorial, you launch a long-running Amazon EMR cluster using the console. Amazon EMR also provides a command-line client, a REST-like API, and several SDKs that you can use to launch and manage clusters. For more information about these interfaces, see What Tools are Available for Amazon EMR?

After launching the cluster, you run a Hive script to analyze a series of CloudFront web distribution log files. You then query your data using the Hue web interface.

Tutorial Costs

The AWS service charges incurred by completing this tutorial include the cost of running an Amazon EMR cluster containing three m3.xlarge instances for one hour and the cost of storing log and output data in Amazon S3. The total cost of this tutorial is approximately $1.05, depending on your region. Your actual costs may differ slightly from this estimate.
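The estimate above can be checked with simple arithmetic. The following Python sketch derives the combined hourly rate per instance that the $1.05 figure implies, under the assumption that the Amazon S3 storage charges are small enough to ignore. The function name estimate_cluster_cost is hypothetical, and the derived rate is back-calculated from the tutorial's estimate; it is not a published AWS price.

```python
from decimal import Decimal

# Figures stated in the tutorial text.
INSTANCE_COUNT = 3               # m3.xlarge instances in the cluster
HOURS = Decimal("1")             # the tutorial runs for about one hour
QUOTED_TOTAL = Decimal("1.05")   # approximate total cost in USD

# Assumption: S3 storage charges are negligible, so the whole total is
# attributed to instance time (Amazon EC2 plus Amazon EMR charges).
implied_rate = QUOTED_TOTAL / (INSTANCE_COUNT * HOURS)


def estimate_cluster_cost(instance_count, hours, rate_per_instance_hour):
    """Return instance_count * hours * rate as a Decimal amount in USD."""
    return Decimal(instance_count) * Decimal(str(hours)) * rate_per_instance_hour


print(implied_rate)  # 0.35 (USD per instance-hour, implied by the estimate)
print(estimate_cluster_cost(INSTANCE_COUNT, 2, implied_rate))  # two-hour run
```

Because the cluster in this tutorial is long-running, the cost continues to grow with each hour that the cluster stays up. Substitute the current rate for your region before relying on a figure like this.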

Service charges vary by region. If you are a new customer in your first year of using AWS, the Amazon S3 storage charges may be waived, provided that you have not used up the capacity allowed in the Free Usage Tier. Amazon EC2 and Amazon EMR charges resulting from this tutorial are not included in the Free Usage Tier, but they are minimal.

AWS service pricing is subject to change. For current pricing information, see the AWS Service Pricing Overview and use the AWS Simple Monthly Calculator to estimate your bill.