Run a Script in a Cluster
Amazon EMR enables you to run a script at any time during step processing in your cluster. You specify a step that runs a script either when you create your cluster or, if your cluster is already in the WAITING state, by adding a step to it. For more information about adding steps, see Submit Work to a
To run a script before step processing begins, use a bootstrap action. For more information about bootstrap actions, see (Optional) Create Bootstrap Actions to Install Additional Software.
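As a rough illustration of adding a step to a cluster that is already in the WAITING state, the sketch below prints (but does not run) an aws emr add-steps command once the state has been checked. The function name print_add_step_command, the cluster ID j-EXAMPLE, and the JAR_LOCATION placeholder are hypothetical; the WAITING check simply mirrors the condition stated above.

```shell
# Print the add-steps command for a cluster, but only when its state is WAITING.
# $1: cluster ID (hypothetical placeholder)
# $2: cluster state, for example as returned by:
#     aws emr describe-cluster --cluster-id "$1" \
#       --query 'Cluster.Status.State' --output text
# $3: value for --steps, in AWS CLI shorthand syntax
print_add_step_command() {
  cluster_id="$1"
  state="$2"
  steps="$3"
  if [ "$state" != "WAITING" ]; then
    echo "cluster $cluster_id is $state, not WAITING; step not added" >&2
    return 1
  fi
  # Dry run: the command is printed so it can be reviewed before being executed.
  printf 'aws emr add-steps --cluster-id %s --steps %s\n' "$cluster_id" "$steps"
}

# Example call; JAR_LOCATION stands in for the real script-runner.jar location.
print_add_step_command j-EXAMPLE WAITING \
  'Type=CUSTOM_JAR,Name=RunScript,Jar=JAR_LOCATION,Args=[s3://mybucket/my_script.sh]'
```

Printing the command rather than executing it keeps the sketch safe to try without AWS credentials; in practice the printed line would be reviewed and then run.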
Submitting a Custom JAR Step Using the AWS CLI
You can now use command-runner.jar in many cases instead of script-runner.jar. command-runner.jar does not require a full path to the JAR.
This section describes how to add a step to run a script. script-runner.jar takes as arguments the path to a script and any additional arguments for the script. The JAR file runs the script with the passed arguments.
script-runner.jar is located at
region is the region in which your EMR cluster
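To make the argument order concrete, here is a minimal sketch of a shell helper that assembles a --steps value in AWS CLI shorthand syntax for a script-runner.jar step: the script path comes first in Args, followed by any additional arguments for the script. The helper name build_script_step and the step name RunScript are hypothetical, and the JAR location is taken as a parameter because it depends on your region.

```shell
# Print a --steps shorthand value for a step that runs a script via script-runner.jar.
# $1: location of script-runner.jar (supply the path for your region)
# $2: S3 path of the script to run
# remaining arguments: passed through to the script
# Assumption: no argument contains commas, spaces, or brackets; such values
# would need extra quoting, which this sketch does not handle.
build_script_step() {
  jar="$1"
  script="$2"
  shift 2
  args="$script"
  for a in "$@"; do
    args="$args,$a"
  done
  printf 'Type=CUSTOM_JAR,Name=RunScript,ActionOnFailure=CONTINUE,Jar=%s,Args=[%s]\n' \
    "$jar" "$args"
}

# Example call; JAR_LOCATION is a placeholder, not a real path.
build_script_step JAR_LOCATION s3://mybucket/my_script.sh arg1 arg2
```

The resulting string can be passed as the value of --steps to aws emr create-cluster or aws emr add-steps.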
The cluster containing a step that runs a script looks similar to the following examples.
To add a step to run a script using the AWS CLI
To run a script using the AWS CLI, type the following command, replacing myKey with the name of your EC2 key pair and mybucket with your S3 bucket. This cluster runs the script my_script.sh on the master node when the step is processed.
aws emr create-cluster --name "Test cluster" --release-label
Pig --use-default-roles --ec2-attributes KeyName=
When you specify the instance count without using the --instance-groups parameter, a single master node is launched, and the remaining instances are launched as core nodes. All nodes use the instance type specified in the command.
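The split described above is simple arithmetic, sketched here as a small shell function. The function name describe_layout is hypothetical and the snippet does not call AWS; it only shows how a given instance count divides into one master node and the remaining core nodes.

```shell
# Print how an instance count is divided when --instance-groups is not used:
# one master node, and every remaining instance as a core node.
describe_layout() {
  count="$1"
  if [ "$count" -lt 1 ]; then
    echo "instance count must be at least 1" >&2
    return 1
  fi
  echo "master=1 core=$((count - 1))"
}

describe_layout 3   # prints: master=1 core=2
```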
If you have not previously created the default Amazon EMR service role and EC2 instance profile, type aws emr create-default-roles to create them before typing the
For more information on using Amazon EMR commands in the AWS CLI, see http://docs.aws.amazon.com/cli/latest/reference/emr.