Getting started with Amazon EMR Serverless
This tutorial helps you get started with EMR Serverless by deploying a sample Spark or
    Hive workload. You'll create, run, and debug your own application. Most parts of this
    tutorial use the default options.
Before you launch an EMR Serverless application, complete the following tasks.
      Grant permissions to use EMR Serverless
      To use EMR Serverless, you need a user or IAM role with an attached policy that
        grants permissions for EMR Serverless. To create a user and attach the appropriate policy
        to that user, follow the instructions in Grant permissions.
     
      Prepare storage for EMR Serverless
      In this tutorial, you'll use an S3 bucket to store output files and logs from the sample
        Spark or Hive workload that you'll run using an EMR Serverless application. To create a
        bucket, follow the instructions in Creating a bucket in the
          Amazon Simple Storage Service Console User Guide. Replace any further reference to
            amzn-s3-demo-bucket with the name of the newly created bucket.
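If you prefer the AWS CLI to the console for this prerequisite, the bucket can probably be created with the high-level `aws s3 mb` command. The following is a minimal sketch rather than an official procedure: it assumes the AWS CLI is installed and configured with credentials that are allowed to create buckets, and the helper name `create_output_bucket` is hypothetical.

```shell
# Sketch: create the S3 bucket that will hold the tutorial's output files and logs.
# Usage: create_output_bucket <bucket-name>
create_output_bucket() {
    if [ -z "${1:-}" ]; then
        echo "usage: create_output_bucket <bucket-name>" >&2
        return 1
    fi
    # "aws s3 mb" (make bucket) creates the bucket in the CLI's configured Region.
    aws s3 mb "s3://$1"
}
```

For example, `create_output_bucket amzn-s3-demo-bucket`, with the placeholder replaced by a globally unique name of your own.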
     
      Create an EMR Studio to run interactive workloads
      If you want to use EMR Serverless to execute interactive queries through notebooks that
        are hosted in EMR Studio, you need to specify an S3 bucket and the minimum service role
        for EMR Serverless to create a Workspace. For steps to get set up, see Set up an EMR Studio
        in the Amazon EMR Management Guide. For more information on interactive workloads,
        see Run interactive workloads with EMR Serverless through EMR Studio.
     
      Create a job runtime role
      Job runs in EMR Serverless use a runtime role that provides granular permissions to
        specific AWS services and resources at runtime. In this tutorial, a public S3 bucket hosts
        the data and scripts. The bucket amzn-s3-demo-bucket stores the output.
      To set up a job runtime role, first create a runtime role with a trust policy so that
        EMR Serverless can use the new role. Next, attach the required S3 access policy to that
        role. The following steps guide you through the process.
      
        - Console
- 
            - 
                Navigate to the IAM console at https://console.aws.amazon.com/iam/. 
- In the left navigation pane, choose Policies. 
- 
                Choose Create policy. 
- 
                The Create policy page opens in a new tab. For Policy editor, choose JSON, and then
                  paste the following policy.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadAccessForEMRSamples",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::*.elasticmapreduce",
        "arn:aws:s3:::*.elasticmapreduce/*"
      ]
    },
    {
      "Sid": "FullAccessToOutputBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ]
    },
    {
      "Sid": "GlueCreateAndReadDataCatalog",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:CreateDatabase",
        "glue:GetDataBases",
        "glue:CreateTable",
        "glue:GetTable",
        "glue:UpdateTable",
        "glue:DeleteTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:CreatePartition",
        "glue:BatchCreatePartition",
        "glue:GetUserDefinedFunctions"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
 
 
 
 
- 
                Choose Next, enter a name for your policy, such as
                  EMRServerlessS3AndGlueAccessPolicy, and then choose Create policy.
 
- 
                In the left navigation pane of the IAM console, choose Roles. 
- 
                Choose Create role. 
- 
                For role type, choose Custom trust policy and paste the
                  following trust policy. This allows jobs submitted to your Amazon EMR Serverless
                  applications to access other AWS services on your behalf.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSTSAssumerole",
      "Effect": "Allow",
      "Principal": {
        "Service": "emr-serverless.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
 
 
 
 
- 
                Choose Next to navigate to the Add permissions page, then choose
                    EMRServerlessS3AndGlueAccessPolicy. 
- 
                On the Name, review, and create page, for Role name, enter a name
                    for your role, for example, EMRServerlessS3RuntimeRole. To create
                    this IAM role, choose Create role.
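Job submission later needs the ARN of this role. If the AWS CLI is also available, one way to look up the ARN after creating the role in the console is `aws iam get-role`. The sketch below assumes configured CLI credentials with permission to read the role. The helper name `get_job_role_arn` is hypothetical, and the default role name is the example name used in this tutorial.

```shell
# Sketch: print the ARN of an existing IAM role.
# Usage: get_job_role_arn [role-name]   (defaults to the tutorial's example role)
get_job_role_arn() {
    # --query selects only Role.Arn from the response; --output text prints it bare.
    aws iam get-role \
        --role-name "${1:-EMRServerlessS3RuntimeRole}" \
        --query 'Role.Arn' --output text
}
```

You could then keep the value in a shell variable, for example `JOB_ROLE_ARN=$(get_job_role_arn)`.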
 
 
- CLI
- 
            - 
                Create a file named emr-serverless-trust-policy.json that
                  contains the trust policy to use for the IAM role. The file should contain the
                  following policy.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EMRServerlessTrustPolicy",
      "Effect": "Allow",
      "Principal": {
        "Service": "emr-serverless.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
 
 
 
 
- 
                Create an IAM role named EMRServerlessS3RuntimeRole. Use the
                  trust policy that you created in the previous step.
 aws iam create-role \
    --role-name EMRServerlessS3RuntimeRole \
    --assume-role-policy-document file://emr-serverless-trust-policy.json
 Note the ARN in the output. You use the ARN of the new role during job
                  submission, where it is referred to as job-role-arn.
 
- 
                Create a file named emr-sample-access-policy.json that defines
                  the IAM policy for your workload. This provides read access to the script and
                  data stored in public S3 buckets and read-write access to amzn-s3-demo-bucket.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadAccessForEMRSamples",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::*.elasticmapreduce",
        "arn:aws:s3:::*.elasticmapreduce/*"
      ]
    },
    {
      "Sid": "FullAccessToOutputBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ]
    },
    {
      "Sid": "GlueCreateAndReadDataCatalog",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:CreateDatabase",
        "glue:GetDataBases",
        "glue:CreateTable",
        "glue:GetTable",
        "glue:UpdateTable",
        "glue:DeleteTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:CreatePartition",
        "glue:BatchCreatePartition",
        "glue:GetUserDefinedFunctions"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
 
 
 
 
- 
                Create an IAM policy named EMRServerlessS3AndGlueAccessPolicy with the
                  policy file that you created in Step 3.
 aws iam create-policy \
    --policy-name EMRServerlessS3AndGlueAccessPolicy \
    --policy-document file://emr-sample-access-policy.json
 Note the new policy's ARN in the output. You'll substitute it for
                      policy-arn in the next step.
 
- 
                Attach the IAM policy EMRServerlessS3AndGlueAccessPolicy to the
                  job runtime role EMRServerlessS3RuntimeRole.
 aws iam attach-role-policy \
    --role-name EMRServerlessS3RuntimeRole \
    --policy-arn policy-arn
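
The CLI steps above can also be chained so that both ARNs are captured automatically instead of being copied by hand. The following is a minimal sketch, not an official script. It assumes the AWS CLI is installed and configured, that emr-serverless-trust-policy.json and emr-sample-access-policy.json exist in the current directory, and that neither the role nor the policy exists yet. It uses the CLI's global `--query` and `--output text` options to pull `Role.Arn` and `Policy.Arn` out of the responses. The function name `create_runtime_role` and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: create the job runtime role, create the access policy, and attach
# the policy to the role, capturing both ARNs along the way.

ROLE_NAME="EMRServerlessS3RuntimeRole"
POLICY_NAME="EMRServerlessS3AndGlueAccessPolicy"

create_runtime_role() {
    local role_arn policy_arn

    # Create the role; --query extracts only the ARN from the JSON response.
    role_arn=$(aws iam create-role \
        --role-name "$ROLE_NAME" \
        --assume-role-policy-document file://emr-serverless-trust-policy.json \
        --query 'Role.Arn' --output text) || return 1

    # Create the managed policy and capture its ARN the same way.
    policy_arn=$(aws iam create-policy \
        --policy-name "$POLICY_NAME" \
        --policy-document file://emr-sample-access-policy.json \
        --query 'Policy.Arn' --output text) || return 1

    # Attach the policy to the role.
    aws iam attach-role-policy \
        --role-name "$ROLE_NAME" \
        --policy-arn "$policy_arn" || return 1

    # Print the role ARN; this is the job-role-arn used at job submission.
    echo "$role_arn"
}
```

After sourcing the file, `JOB_ROLE_ARN=$(create_runtime_role)` would run all three steps and leave the role ARN in a variable. The function stops at the first failing step, so a role or policy that already exists needs to be handled separately.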