Launch the Cluster

The next step is to launch the cluster. This tutorial provides the steps to launch a long-running cluster using either the Amazon EMR console or the AWS CLI; choose the method that best meets your needs. When you launch the cluster, Amazon EMR provisions EC2 instances (virtual servers) to perform the computation. These instances are launched from an Amazon Machine Image (AMI) that has been customized for Amazon EMR and comes with Hadoop and other big data applications preinstalled.

To add Impala to a cluster using the console

  1. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/.

  2. Choose Create cluster, and then choose Go to advanced options.

  3. Choose a Release of 3.11.0, 3.10.0, or 3.9.0.

  4. Choose Impala 1.2.4.

  5. Specify Arguments for Impala to execute.

  6. Choose other applications to install and specify steps as appropriate for your application. Choose Next.

  7. Complete Step 2: Hardware, Step 3: General Cluster Settings, and Step 4: Security as appropriate. Choose Create cluster.

To add Impala to a cluster using the AWS CLI

To add Impala to a cluster using the AWS CLI, type the create-cluster subcommand with the --applications parameter.

  • To install Impala on a cluster, type the following command and replace myKey with the name of your EC2 key pair.

    • Linux, UNIX, and Mac OS X users:

      aws emr create-cluster --name "Test cluster" --ami-version 3.3 --applications Name=Hue Name=Hive Name=Pig Name=Impala \
        --use-default-roles --ec2-attributes KeyName=myKey \
        --instance-type m3.xlarge --instance-count 3

    • Windows users:

      aws emr create-cluster --name "Test cluster" --ami-version 3.3 --applications Name=Hue Name=Hive Name=Pig Name=Impala --use-default-roles --ec2-attributes KeyName=myKey --instance-type m3.xlarge --instance-count 3

    When you specify the instance count without using the --instance-groups parameter, a single master node is launched, and the remaining instances are launched as core nodes. All nodes use the instance type specified in the command.

    Note

    If you have not previously created the default EMR service role and EC2 instance profile, type aws emr create-default-roles to create them before typing the create-cluster subcommand.
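
The instance-count behavior and the default-roles prerequisite described above can be sketched as a small shell helper. This is an illustration under assumptions, not part of the tutorial: the function names instance_groups and ensure_default_roles are hypothetical, and the role and instance profile names EMR_DefaultRole and EMR_EC2_DefaultRole are assumed to be the ones that aws emr create-default-roles creates. The sketch spells out the one-master, rest-core split as explicit --instance-groups shorthand and only prints the resulting create-cluster command.

```shell
#!/bin/sh
# Sketch only: the top-level code prints a command and never contacts AWS.

# Print --instance-groups shorthand for COUNT instances of one TYPE:
# a single master node, with the remaining COUNT-1 instances as core nodes.
instance_groups() {
  ig_count="$1"
  ig_type="$2"
  ig_core=$((ig_count - 1))
  ig_out="InstanceGroupType=MASTER,InstanceCount=1,InstanceType=${ig_type}"
  if [ "$ig_core" -gt 0 ]; then
    ig_out="${ig_out} InstanceGroupType=CORE,InstanceCount=${ig_core},InstanceType=${ig_type}"
  fi
  printf '%s\n' "$ig_out"
}

# Create the default EMR roles only if they appear to be missing.
# Assumption: the defaults are named EMR_DefaultRole and EMR_EC2_DefaultRole.
ensure_default_roles() {
  if aws iam get-role --role-name EMR_DefaultRole >/dev/null 2>&1 &&
     aws iam get-instance-profile --instance-profile-name EMR_EC2_DefaultRole >/dev/null 2>&1; then
    echo "default roles already exist"
  else
    aws emr create-default-roles
  fi
}

# Print (do not run) the tutorial command with the node split written out.
echo "aws emr create-cluster --name \"Test cluster\" --ami-version 3.3" \
  "--applications Name=Hue Name=Hive Name=Pig Name=Impala" \
  "--use-default-roles --ec2-attributes KeyName=myKey" \
  "--instance-groups $(instance_groups 3 m3.xlarge)"
```

To use it against a real account, you would call ensure_default_roles first and then run the printed command, replacing myKey with the name of your EC2 key pair. Writing the groups out explicitly is mainly useful when the master and core nodes should eventually use different instance types.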