Step 1: Set Up Prerequisites for Your Sample Cluster
Before you begin setting up your Amazon EMR cluster, make sure that you complete the prerequisites in this topic.
Sign Up for AWS
If you do not have an AWS account, use the following procedure to create one.
To sign up for AWS
Open https://aws.amazon.com/ and choose Create an AWS Account.
Follow the online instructions.
Create an Amazon S3 Bucket
In this tutorial, you use an Amazon S3 bucket to store your log files and output data. Because of Hadoop requirements, S3 bucket names used with Amazon EMR have the following constraints:
Must contain only lowercase letters, numbers, periods (.), and hyphens (-)
Cannot end in numbers
If you already have a bucket that meets these requirements, you can use it for this tutorial. Otherwise, create a bucket to use. For more information about creating buckets, see Create a Bucket in the Amazon Simple Storage Service Getting Started Guide.
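As a quick way to check a candidate name against the two constraints listed above, the following Python sketch may help. The function name is hypothetical and is not part of any AWS SDK. It checks only the constraints stated in this topic; the general Amazon S3 naming rules (such as length limits) still apply and are not checked here.

```python
import re

# Characters permitted by the constraints in this topic:
# lowercase letters, numbers, periods, and hyphens.
_ALLOWED_CHARS = re.compile(r"[a-z0-9.-]+")


def is_emr_friendly_bucket_name(name: str) -> bool:
    """Return True if `name` satisfies the two constraints listed above.

    Hypothetical helper for illustration only. It does not check the
    general S3 bucket naming rules.
    """
    # The whole name must consist of allowed characters (and be non-empty).
    if not _ALLOWED_CHARS.fullmatch(name):
        return False
    # The name cannot end in a number.
    return not name[-1].isdigit()


if __name__ == "__main__":
    for candidate in ["my-emr-tutorial", "My-Bucket", "bucket2", "my_bucket"]:
        print(candidate, is_emr_friendly_bucket_name(candidate))
```

A name such as my-emr-tutorial passes both checks, while names with uppercase letters, underscores, or a trailing digit are rejected.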
In your S3 bucket, create folders for your log files and your output data. The output folder should be empty. For more information about creating folders, see Creating A Folder in the Amazon Simple Storage Service Console User Guide.
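If you prefer to create the folders from a script rather than the console, the following Python sketch shows one possible approach. It assumes that an empty folder can be represented as a zero-byte object whose key ends in a slash, and that the client you pass in exposes a put_object method like the one on a boto3 S3 client. The function name, the bucket name, and the folder name logs are assumptions for illustration; substitute the names you actually use.

```python
def create_folder_markers(s3_client, bucket, folder_names):
    """Create empty "folder" placeholder objects in an S3 bucket.

    Hypothetical helper: `s3_client` is assumed to behave like a boto3
    S3 client, that is, to provide put_object(Bucket=..., Key=...).
    Returns the list of keys that were written.
    """
    keys = []
    for name in folder_names:
        # Normalize so each key has no leading slash and exactly one trailing slash.
        key = name.strip("/") + "/"
        # With no Body argument, the object is created empty.
        s3_client.put_object(Bucket=bucket, Key=key)
        keys.append(key)
    return keys


# Example usage (requires boto3 and configured AWS credentials).
# The bucket name and the folder name "logs" are placeholders:
#
#   import boto3
#   create_folder_markers(boto3.client("s3"), "my-emr-tutorial", ["logs", "output"])
```

Passing the client in as a parameter keeps the helper easy to try against a stub object before you run it against a real bucket.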
Create an Amazon EC2 Key Pair
You must have an Amazon Elastic Compute Cloud (Amazon EC2) key pair to connect to the nodes in your cluster over a secure channel using the Secure Shell (SSH) protocol. If you already have a key pair that you want to use, you can skip this step. If you don't have a key pair, use one of the following procedures, depending on your operating system.