
Create a Cluster With Spark

To launch a cluster with Spark installed using the console

The following procedure creates a cluster with Spark installed. For more information about launching clusters with the console, see Step 3: Launch an Amazon EMR Cluster in the Amazon EMR Management Guide.

  1. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/.

  2. Choose Create cluster to use Quick Create.

  3. For Software Configuration, choose Amazon Release Version emr-5.5.0 or later.

  4. For Select Applications, choose either All Applications or Spark.

  5. Select other options as necessary and then choose Create cluster.

    Note

    To configure Spark when you are creating the cluster, see Configure Spark.

To launch a cluster with Spark installed using the AWS CLI

  • Create the cluster with the following command:

    aws emr create-cluster --name "Spark cluster" --release-label emr-5.5.0 --applications Name=Spark \
    --ec2-attributes KeyName=myKey --instance-type m3.xlarge --instance-count 3 --use-default-roles

Note

Linux line continuation characters (\) are included for readability. On Linux, you can keep them or remove them and enter the command on a single line. On Windows, remove them or replace them with a caret (^).
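
As a minimal sketch of how the command above might be used in a script, the following Bash function creates the cluster, captures the cluster ID, and waits until the cluster is running. The function name launch_spark_cluster is hypothetical and not part of the AWS CLI. The sketch assumes the AWS CLI is installed and configured, and that the key pair name passed in (for example, myKey) already exists in your account.

```shell
#!/usr/bin/env bash
# Sketch only: launch_spark_cluster is a hypothetical helper name.
# Usage: launch_spark_cluster <ec2-key-pair-name>
launch_spark_cluster() {
  local key_name="$1"
  local cluster_id

  # Same options as the command above; --query/--output extract only the cluster ID.
  cluster_id=$(aws emr create-cluster \
    --name "Spark cluster" \
    --release-label emr-5.5.0 \
    --applications Name=Spark \
    --ec2-attributes KeyName="$key_name" \
    --instance-type m3.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --query 'ClusterId' --output text) || return 1

  # Block until the cluster reaches a running state, or fail if the waiter fails.
  aws emr wait cluster-running --cluster-id "$cluster_id" || return 1

  # Print the cluster ID so callers can reuse it.
  echo "$cluster_id"
}
```

For example, cluster_id=$(launch_spark_cluster myKey) would store the new cluster ID for later commands. A cluster started this way keeps running, and accruing charges, until you terminate it.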

To launch a cluster with Spark installed using the SDK for Java

Specify Spark as an application by passing an Application object to withApplications in RunJobFlowRequest.

  • The following Java program excerpt shows how to create a cluster with Spark:

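    import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
    import com.amazonaws.services.elasticmapreduce.model.Application;
    import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
    import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
    import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;

    // Excerpt: "credentials" is an AWSCredentials object created elsewhere in the program.
    AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient(credentials);

    // Request Spark as an installed application.
    Application sparkApp = new Application()
        .withName("Spark");

    RunJobFlowRequest request = new RunJobFlowRequest()
        .withName("Spark Cluster")
        .withApplications(sparkApp)
        .withReleaseLabel("emr-5.5.0")
        .withInstances(new JobFlowInstancesConfig()
            .withEc2KeyName("myKeyName")
            .withInstanceCount(1)
            .withKeepJobFlowAliveWhenNoSteps(true)
            .withMasterInstanceType("m3.xlarge")
            .withSlaveInstanceType("m3.xlarge")
        );

    // Submit the request; the result carries the new cluster (job flow) ID.
    RunJobFlowResult result = emr.runJobFlow(request);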
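
Whichever method you use to create the cluster, one possible way to confirm that Spark was installed is to list the cluster's applications with the AWS CLI. The following Bash sketch assumes the AWS CLI is installed and configured. The function name cluster_has_spark is hypothetical, and the cluster ID is whatever create-cluster or runJobFlow returned.

```shell
#!/usr/bin/env bash
# Sketch only: cluster_has_spark is a hypothetical helper name.
# Usage: cluster_has_spark <cluster-id>
# Returns 0 if Spark is listed, 1 if it is not, 2 if the describe call fails.
cluster_has_spark() {
  local cluster_id="$1"
  local apps

  # List only the application names installed on the cluster.
  apps=$(aws emr describe-cluster --cluster-id "$cluster_id" \
    --query 'Cluster.Applications[].Name' --output text) || return 2

  # Text output is whitespace-separated; print one name per line and match Spark exactly.
  printf '%s\n' $apps | grep -qx 'Spark'
}
```

For example, cluster_has_spark "$cluster_id" && echo "Spark is installed" prints the message only when Spark appears in the application list.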