Using job execution roles with EMR Serverless - Amazon EMR

Amazon EMR Serverless is in preview release and is subject to change. To use EMR Serverless in preview, follow the sign up steps at https://pages.awscloud.com/EMR-Serverless-Preview.html. The only Region that EMR Serverless currently supports is us-east-1, so make sure to set all region parameters to this value. All Amazon S3 buckets used with EMR Serverless must also be created in us-east-1.

Create an execution role

A job's execution role is an IAM role that grants the job permission to access AWS services and resources. You provide this role when you start a job, and EMR Serverless assumes the role when the job is invoked. Before you use the execution role, the administrator IAM role should specify that the execution role has access to the given application. The following steps describe how to set up an execution role and allow a job submitter to pass it to EMR Serverless.

  1. Set up an execution role

    The administrator IAM role creates an execution role to be used with a particular application. The following policy for the job execution role allows access to resource targets in Amazon S3. These permissions are necessary to monitor jobs and access logs. Replace DOC-EXAMPLE-BUCKET-INPUT with the name of the S3 bucket where you want EMR Serverless to retrieve input data. Replace DOC-EXAMPLE-BUCKET-OUTPUT with the name of the S3 bucket where you want EMR Serverless to store job output data. Replace DOC-EXAMPLE-BUCKET-LOGGING with the name of the S3 bucket where you want EMR Serverless to store job logs.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "ReadFromOutputAndInputScriptBuckets",
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::DOC-EXAMPLE-BUCKET-INPUT",
            "arn:aws:s3:::DOC-EXAMPLE-BUCKET-INPUT/*",
            "arn:aws:s3:::DOC-EXAMPLE-BUCKET-OUTPUT",
            "arn:aws:s3:::DOC-EXAMPLE-BUCKET-OUTPUT/*"
          ]
        },
        {
          "Sid": "WriteToOutputDataBucket",
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:DeleteObject"
          ],
          "Resource": [
            "arn:aws:s3:::DOC-EXAMPLE-BUCKET-OUTPUT/*"
          ]
        },
        {
          "Sid": "LoggingReadAndWriteStatement",
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::DOC-EXAMPLE-BUCKET-LOGGING",
            "arn:aws:s3:::DOC-EXAMPLE-BUCKET-LOGGING/*"
          ]
        }
      ]
    }

    If you run Hive jobs, you also need to add a policy for AWS Glue. The following example policy allows create and read access to the AWS Glue Data Catalog.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "GlueCreateAndReadDataCatalog",
          "Effect": "Allow",
          "Action": [
            "glue:GetDatabase",
            "glue:GetDatabases",
            "glue:CreateTable",
            "glue:GetTable",
            "glue:GetTables",
            "glue:GetPartition",
            "glue:GetPartitions",
            "glue:CreatePartition",
            "glue:BatchCreatePartition",
            "glue:GetUserDefinedFunctions"
          ],
          "Resource": ["*"]
        }
      ]
    }

  2. Create a trust policy to allow EMR Serverless to use the execution role

    For an execution role to be used with a particular application, the IAM administrator role must update the trust policy of the execution role. This trust relationship allows EMR Serverless to assume the newly created job execution role and stream the credentials to applications running your code. The trust policy for EMR Serverless is as follows.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "EMRServerlessTrustPolicy",
          "Action": "sts:AssumeRole",
          "Effect": "Allow",
          "Principal": {
            "Service": "emr-serverless.amazonaws.com"
          }
        }
      ]
    }

    For more information about how to create IAM roles, see Creating IAM roles. To learn how to update your trust policy, see Modifying a role.

  3. Allow the job submitter to pass the execution role to EMR Serverless

    After an application has been created but before jobs can start running on that application, the administrator IAM role must attach the following permissions policy to the IAM user, IAM group, or IAM role of the job submitter. This permissions policy allows the job submitter’s IAM identity (an IAM user or IAM role) to submit a job on the application. This policy ensures that the StartJobRun action is allowed on application <application_id> using job execution role ARN <iam_execution_role_arn>.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "emr-serverless:StartJobRun",
          "Resource": "arn:aws:emr-serverless:<region>:<aws_account_id>:/applications/<application_id>"
        },
        {
          "Effect": "Allow",
          "Action": "iam:PassRole",
          "Resource": "<iam_execution_role_arn>",
          "Condition": {
            "StringLike": {
              "iam:PassedToService": "emr-serverless.amazonaws.com"
            }
          }
        }
      ]
    }

    After this is complete, the job submitter should be able to pass the execution role to EMR Serverless when submitting a job.
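The three policy documents from the steps above can also be generated from parameters instead of being edited by hand. The following is a minimal sketch in Python that uses only the standard library. The function names (`execution_role_policy`, `trust_policy`, `job_submitter_policy`), the bucket names, the application ID `app-1`, and the account ID `123456789012` are hypothetical placeholders chosen for illustration. The statements mirror the JSON shown in the steps.

```python
import json


def _bucket_and_objects(bucket):
    # Bucket ARN (used by s3:ListBucket) plus object ARN (used by object actions).
    return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]


def execution_role_policy(input_bucket, output_bucket, logging_bucket):
    """Step 1: S3 permissions policy for the job execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadFromOutputAndInputScriptBuckets",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": _bucket_and_objects(input_bucket)
                + _bucket_and_objects(output_bucket),
            },
            {
                "Sid": "WriteToOutputDataBucket",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{output_bucket}/*"],
            },
            {
                "Sid": "LoggingReadAndWriteStatement",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
                "Resource": _bucket_and_objects(logging_bucket),
            },
        ],
    }


def trust_policy():
    """Step 2: trust policy that lets EMR Serverless assume the execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EMRServerlessTrustPolicy",
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"Service": "emr-serverless.amazonaws.com"},
            }
        ],
    }


def job_submitter_policy(region, account_id, application_id, execution_role_arn):
    """Step 3: policy attached to the job submitter's IAM identity."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "emr-serverless:StartJobRun",
                "Resource": (
                    f"arn:aws:emr-serverless:{region}:{account_id}"
                    f":/applications/{application_id}"
                ),
            },
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": execution_role_arn,
                "Condition": {
                    "StringLike": {
                        "iam:PassedToService": "emr-serverless.amazonaws.com"
                    }
                },
            },
        ],
    }


# Example: print the job submitter policy for a placeholder application and role.
print(json.dumps(
    job_submitter_policy(
        "us-east-1", "123456789012", "app-1",
        "arn:aws:iam::123456789012:role/JobRole",
    ),
    indent=2,
))
```

The JSON that these functions produce can be pasted into the IAM console or saved to a file and attached with whatever tooling the administrator already uses. The sketch itself does not call any AWS API.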

Limit execution roles

If the Amazon EMR application administrator creates a multi-tenant EMR Serverless application, and the administrator IAM role onboards multiple execution roles that untrusted tenants can use to submit jobs, then you may want to restrict those tenants. Use either of the following options to restrict tenants from running code that gains the privileges assigned to the execution roles.

Restrict Execution Role ARN(s)

Multiple execution roles can be onboarded to a single EMR Serverless application. If you want to control which IAM identities can use each execution role to invoke the emr-serverless:StartJobRun API on an EMR Serverless application, use the administrator IAM role to modify the IAM policy attached to the IAM identity you want to restrict. You can do this by restricting the resources for the iam:PassRole action. The Resource element of that statement accepts a list of execution role ARNs that the administrator IAM role permits for use with the application. The updated permissions policy would look like the following example.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": [
        "<execution_role_arn_1>",
        "<execution_role_arn_2>",
        ...
      ],
      "Condition": {
        "StringLike": {
          "iam:PassedToService": "emr-serverless.amazonaws.com"
        }
      }
    }
  ]
}

If you want to allow all execution roles that start with a particular prefix, such as MyRole, replace the list of execution role ARNs in the Resource element with an ARN that ends with a wildcard * character, such as arn:aws:iam::<AWS_ACCOUNT_ID>:role/MyRole*. All other string condition keys are also supported.
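To make the effect of an explicit ARN list versus a prefix wildcard concrete, the following Python sketch builds the iam:PassRole statement and checks which role ARNs its Resource element would cover. The helper names `pass_role_policy` and `is_role_covered` and the account ID `123456789012` are hypothetical. The check uses `fnmatch.fnmatchcase` from the standard library as a local approximation of IAM wildcard matching, so treat it as an illustration rather than a substitute for IAM's own policy evaluation.

```python
import json
from fnmatch import fnmatchcase


def pass_role_policy(role_arn_patterns):
    """Build a policy that allows iam:PassRole only for the given role ARNs.

    Each entry may be a full role ARN or an ARN ending in a * wildcard.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": list(role_arn_patterns),
                "Condition": {
                    "StringLike": {
                        "iam:PassedToService": "emr-serverless.amazonaws.com"
                    }
                },
            }
        ],
    }


def is_role_covered(policy, role_arn):
    """Approximate check: does any Resource pattern match the role ARN?

    Assumption: fnmatchcase stands in for IAM's * and ? wildcard matching.
    It ignores the Condition element and every other policy that may apply.
    """
    patterns = policy["Statement"][0]["Resource"]
    return any(fnmatchcase(role_arn, pattern) for pattern in patterns)


# Example: allow every execution role whose name starts with MyRole.
prefix_policy = pass_role_policy(["arn:aws:iam::123456789012:role/MyRole*"])
print(json.dumps(prefix_policy, indent=2))
print(is_role_covered(prefix_policy, "arn:aws:iam::123456789012:role/MyRoleAnalytics"))
print(is_role_covered(prefix_policy, "arn:aws:iam::123456789012:role/OtherRole"))
```

A prefix wildcard is convenient, but it also covers any role created later with the same prefix, so an explicit list is the narrower choice when the set of execution roles is small and stable.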

Create multiple applications

EMR Serverless provides complete network isolation between jobs that run in different applications. We recommend that you run jobs that require different privileges in separate applications. Instead of creating a single multi-tenant application with multiple execution roles that have different levels of privileges, the EMR Serverless administrator can create multiple applications and group execution roles with similar privileges into the same application. Tenants can then be given permissions to use only a specific application and the specific execution roles onboarded to that application. In cases where job-level isolation is desired, consider placing those jobs in different EMR Serverless applications.
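One way to keep such a layout consistent is to describe it as data and derive each tenant's permissions policy from it. The Python sketch below is a hypothetical planning helper, not part of EMR Serverless. The function name `tenant_policies`, the application IDs, tenant names, and role ARNs are all placeholders. It assumes that each tenant uses exactly one application and that an execution role is onboarded to only one application, and it raises an error when a role appears under two applications.

```python
import json


def tenant_policies(region, account_id, applications):
    """Return {tenant: policy} from a mapping of application IDs to
    {"roles": [execution role ARNs], "tenants": [tenant names]}.
    """
    # Assumption for this sketch: a role belongs to a single application.
    owner = {}
    for app_id, cfg in applications.items():
        for role_arn in cfg["roles"]:
            if owner.setdefault(role_arn, app_id) != app_id:
                raise ValueError(
                    f"{role_arn} is assigned to both {owner[role_arn]} and {app_id}"
                )

    policies = {}
    for app_id, cfg in applications.items():
        policy = {
            "Version": "2012-10-17",
            "Statement": [
                {
                    # The tenant may start jobs only on this application.
                    "Effect": "Allow",
                    "Action": "emr-serverless:StartJobRun",
                    "Resource": (
                        f"arn:aws:emr-serverless:{region}:{account_id}"
                        f":/applications/{app_id}"
                    ),
                },
                {
                    # The tenant may pass only this application's roles.
                    "Effect": "Allow",
                    "Action": "iam:PassRole",
                    "Resource": list(cfg["roles"]),
                    "Condition": {
                        "StringLike": {
                            "iam:PassedToService": "emr-serverless.amazonaws.com"
                        }
                    },
                },
            ],
        }
        for tenant in cfg["tenants"]:
            policies[tenant] = policy
    return policies


# Example layout: one application per privilege level.
layout = {
    "app-low": {
        "roles": ["arn:aws:iam::123456789012:role/ReadOnlyJobs"],
        "tenants": ["team-a"],
    },
    "app-high": {
        "roles": ["arn:aws:iam::123456789012:role/ReadWriteJobs"],
        "tenants": ["team-b"],
    },
}
print(json.dumps(tenant_policies("us-east-1", "123456789012", layout)["team-a"], indent=2))
```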