Getting Started: Analyzing Big Data with Amazon EMR
|This documentation is for versions 4.x and 5.x of Amazon EMR. For information about Amazon EMR AMI versions 2.x and 3.x, see the Amazon EMR Developer Guide (PDF).|
This section is a tutorial designed to walk you through the process of creating a sample Amazon EMR cluster by using the AWS Management Console. You then run a Hive script as a step to process sample data that is stored in Amazon S3. This tutorial is not meant for production environments, and it does not cover configuration options in depth. It is meant to help you quickly set up a cluster for evaluation purposes. If you have questions or get stuck, you can reach out to the Amazon EMR team by posting on our Discussion Forum.
The sample cluster that you create will be running in a live environment and you will be charged for the resources that you use. This tutorial should take an hour or less, so the charges that you incur should be minimal. After you complete this tutorial, you should reset your environment to avoid incurring further charges. For more information about resetting your environment, see Step 5: Reset Your Environment.
Pricing for Amazon EMR varies by region and service. In this tutorial, charges accrue for the Amazon EMR cluster and Amazon Simple Storage Service (Amazon S3) storage of the log data and output from the Hive job. If you are within your first year of using AWS, some or all of your charges for Amazon S3 might be waived if you are within your usage limits of the AWS Free Tier. For more information about Amazon EMR pricing and the AWS Free Tier, go to Amazon EMR Pricing and AWS Free Tier.
You can use the Amazon Web Services Simple Monthly Calculator to estimate your bill.