Create an EMR Studio - Amazon EMR

Create an EMR Studio for your team using the Amazon EMR console or the AWS CLI. You can also use the AWS::EMR::Studio resource template to create a Studio using AWS CloudFormation.

Prerequisites

Before you create an EMR Studio, you must have the following:

  • AWS Single Sign-On enabled. For more information, see Enable AWS Single Sign-On for Amazon EMR Studio.

  • Permission to create and manage an EMR Studio. For more information, see Add required permissions to create and manage an EMR Studio.

  • An IAM service role, IAM user role, IAM session policies, and security groups set up for Amazon EMR Studio. For more information, see EMR Studio security and access control.

  • A designated Amazon S3 bucket where EMR Studio can back up the Workspaces and notebook files in your Studio. Your EMR Studio service role must have read and write access to the bucket that you select.

  • An Amazon Virtual Private Cloud (VPC) designated for your Studio. If you plan to use Amazon EMR on EKS with EMR Studio, choose the same VPC to which your Amazon EKS cluster worker nodes belong.

  • A maximum of five subnets that belong to your designated VPC to associate with the Studio. If you plan to use publicly-hosted Git repositories, you must use private subnets that have access to the internet through Network Address Translation (NAT).

    Note

    There must be at least one subnet in common between your EMR Studio and the Amazon EKS cluster that you use to register your virtual cluster. Otherwise, your managed endpoint will not appear as an option in your Studio Workspaces. You can create an Amazon EKS cluster and associate it with a subnet that belongs to your Studio. Alternatively, you can create a Studio and specify your EKS cluster's subnets.

  • If you want to use the AWS CLI, the AWS Command Line Interface version 1.18.184 or later, or version 2.1.4 or later. Use credentials for the AWS account (a member account) that you have designated for your Studio.
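
The version and subnet requirements in the list above can be verified with a short preflight check before you start. The following Python sketch is illustrative only and is not part of any AWS tooling. The function names `cli_version_ok` and `check_studio_subnets` and the sample subnet IDs are hypothetical, and the sketch assumes that the banner printed by `aws --version` begins with `aws-cli/<major>.<minor>.<patch>`.

```python
import re

MAX_STUDIO_SUBNETS = 5  # limit stated in the prerequisites


def cli_version_ok(version_banner):
    """Return True if an `aws --version` banner meets the stated minimum.

    Assumption: the banner starts with 'aws-cli/<major>.<minor>.<patch>'.
    """
    match = re.match(r"aws-cli/(\d+)\.(\d+)\.(\d+)", version_banner)
    if match is None:
        raise ValueError(f"unrecognized version banner: {version_banner!r}")
    version = tuple(int(part) for part in match.groups())
    # Version 1 needs 1.18.184 or later; anything else needs 2.1.4 or later.
    minimum = (1, 18, 184) if version[0] == 1 else (2, 1, 4)
    return version >= minimum


def check_studio_subnets(studio_subnets, eks_subnets=None):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if len(set(studio_subnets)) > MAX_STUDIO_SUBNETS:
        problems.append(f"more than {MAX_STUDIO_SUBNETS} subnets selected")
    # Only relevant when you plan to use Amazon EMR on EKS with the Studio.
    if eks_subnets is not None and not set(studio_subnets) & set(eks_subnets):
        problems.append("no subnet in common with the Amazon EKS cluster")
    return problems


if __name__ == "__main__":
    # Hypothetical sample values for illustration.
    print(cli_version_ok("aws-cli/2.1.4 Python/3.7.4"))
    print(check_studio_subnets(["subnet-a", "subnet-b"], ["subnet-b", "subnet-c"]))
```

An empty list from `check_studio_subnets` means that neither the five-subnet limit nor the shared-subnet requirement for Amazon EMR on EKS is violated.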

Important

Make sure you disable proxy management tools such as FoxyProxy or SwitchyOmega in your browser before you create a Studio. Active proxies can cause errors when you choose Create Studio, and result in a Network Failure error message.

Instructions

Console

To create an EMR Studio using the EMR console

  1. Tag the following EMR Studio resources with the tag key "for-use-with-amazon-emr-managed-policies" and tag value "true". The EMR Studio service role uses this tag to identify and access your resources.

    • Your designated VPC

    • Each subnet that you want to associate with your Studio

    • Your EMR Studio security groups (if using custom security groups)

    • Any existing AWS Secrets Manager secrets that Studio users might use to link Git repositories to a Workspace

    You can apply tags to resources using the Tags tab on the relevant resource screen in the AWS Management Console. Alternatively, you can use the AWS Resource Groups Tag Editor to apply the required tag to all of your EMR Studio resources at once. For more information, see Tag Editor in the AWS Resource Groups User Guide.

  2. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/home

  3. Choose EMR Studio from the left navigation.

  4. Choose Create Studio to open the Create a Studio page.

  5. Enter a Studio name and an optional Description.

  6. (Optional) Choose Add new tag to add one or more key-value tags to the Studio. Tags help you manage, identify, organize, and filter resources. For more information, see Tagging AWS resources.

  7. Under Define roles and S3 bucket, choose a Service Role and a User Role. For more information, see Create an EMR Studio service role and Create an EMR Studio user role with session policies.

  8. Under S3 Bucket, choose Browse S3 to select the bucket that you designated for backing up Workspaces and notebook files.

    Note

    Your EMR Studio service role must have read and write access to the bucket that you select.

  9. Under Networking and security, choose an Amazon Virtual Private Cloud (VPC) for your Studio from the dropdown list.

  10. Under Subnets, select a maximum of five subnets from your VPC to associate with the Studio. You have the option to add more subnets after you create the Studio.

  11. For Security and access, choose either the default security groups or custom security groups. For more information, see Define security groups to control EMR Studio network traffic.

    If you choose: The default EMR Studio security groups
    Do this: To enable Git-based repository linking for your Studio, choose Enable clusters/endpoints and Git repository. Otherwise choose Enable clusters/endpoints.

    If you choose: Custom security groups for your Studio
    Do this:

    • Under Cluster/endpoint security group, select the engine security group that you configured from the dropdown list. Your Studio uses this security group to allow inbound access from attached Workspaces.

    • Under Workspace security group, select your Workspace security group that you configured from the dropdown list. Your Studio uses this security group with Workspaces to provide outbound access to attached Amazon EMR clusters and publicly-hosted Git repositories.

  12. Choose Create Studio to finish and navigate to the Studios page. You should see your new Studio in the list with details such as Studio name, Create date, and Studio access URL.

    Important

    Make sure you disable proxy management tools such as FoxyProxy or SwitchyOmega in your browser. Active proxies can cause errors when you choose Create Studio, and result in a Network Failure error message.

After you create a Studio, follow the instructions in Assign a user or group to your EMR Studio.

CLI

Before you create an EMR Studio using the AWS CLI

Tag the following EMR Studio resources with the tag key "for-use-with-amazon-emr-managed-policies" and tag value "true". The EMR Studio service role uses this tag to identify and access your resources.

  • Your designated VPC

  • Each subnet that you want to associate with your Studio

  • Your EMR Studio security groups

  • Any existing AWS Secrets Manager secrets that Studio users might use to link Git repositories to a Workspace

You can apply tags to resources using the Tags tab on the relevant resource screen in the AWS Management Console. Alternatively, you can use the AWS Resource Groups Tag Editor to apply the required tag to all of your EMR Studio resources at once. For more information, see Tag Editor in the AWS Resource Groups User Guide.
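
If you prefer to tag from the command line rather than the console, one possible approach is sketched below in Python. It only builds and prints the commands and does not run them. It assumes that `aws ec2 create-tags` is used for the VPC, subnets, and security groups, and that `aws secretsmanager tag-resource` is used for secrets. The helper names and all resource IDs are hypothetical placeholders.

```python
import shlex

# Tag key and value required by the EMR Studio service role (from the text above).
TAG_KEY = "for-use-with-amazon-emr-managed-policies"
TAG_VALUE = "true"
TAG_SPEC = f"Key={TAG_KEY},Value={TAG_VALUE}"


def ec2_tag_command(resource_ids):
    """Build one `aws ec2 create-tags` call for VPC, subnet, and security group IDs."""
    if not resource_ids:
        raise ValueError("at least one EC2 resource ID is required")
    return ["aws", "ec2", "create-tags",
            "--resources", *resource_ids,
            "--tags", TAG_SPEC]


def secret_tag_command(secret_id):
    """Build an `aws secretsmanager tag-resource` call for one secret."""
    return ["aws", "secretsmanager", "tag-resource",
            "--secret-id", secret_id,
            "--tags", TAG_SPEC]


if __name__ == "__main__":
    # Hypothetical placeholder IDs; substitute your own resources.
    ec2_ids = ["vpc-EXAMPLE", "subnet-EXAMPLE1", "subnet-EXAMPLE2", "sg-EXAMPLE"]
    print(shlex.join(ec2_tag_command(ec2_ids)))
    print(shlex.join(secret_tag_command("example-git-secret")))
```

Printing the commands first lets you review them before running anything against your account.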

To create an EMR Studio using the AWS CLI

Use the create-studio AWS CLI command. Insert your own values for the following options. For more information about the create-studio command, see AWS CLI Command Reference.

aws emr create-studio \
--name <example-studio-name> \
--auth-mode <SSO> \
--vpc-id <example-vpc-id> \
--subnet-ids <subnet-id-1> <subnet-id-2>... <subnet-id-5> \
--service-role <example-studio-service-role-name> \
--user-role <example-studio-user-role-name> \
--workspace-security-group-id <example-workspace-sg-id> \
--engine-security-group-id <example-engine-sg-id> \
--default-s3-location <example-s3-location>

Note

Linux line continuation characters (\) are included for readability. They can be removed or used in Linux commands. For Windows, remove them or replace with a caret (^).
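
If you script Studio creation, you can assemble the create-studio command as an argument list, which avoids line continuation characters altogether. The following Python sketch is one way to do that, under stated assumptions: the function name `create_studio_command` and every sample value are hypothetical, and only the options shown in the command above are used. It also enforces the five-subnet limit from the prerequisites.

```python
import shlex

MAX_STUDIO_SUBNETS = 5  # limit stated in the prerequisites


def create_studio_command(name, vpc_id, subnet_ids, service_role, user_role,
                          workspace_sg_id, engine_sg_id, default_s3_location,
                          auth_mode="SSO"):
    """Return the argument list for `aws emr create-studio`."""
    if not subnet_ids:
        # An empty list would leave --subnet-ids without a value.
        raise ValueError("provide at least one subnet ID")
    if len(subnet_ids) > MAX_STUDIO_SUBNETS:
        raise ValueError(f"a Studio accepts at most {MAX_STUDIO_SUBNETS} subnets")
    return [
        "aws", "emr", "create-studio",
        "--name", name,
        "--auth-mode", auth_mode,
        "--vpc-id", vpc_id,
        "--subnet-ids", *subnet_ids,
        "--service-role", service_role,
        "--user-role", user_role,
        "--workspace-security-group-id", workspace_sg_id,
        "--engine-security-group-id", engine_sg_id,
        "--default-s3-location", default_s3_location,
    ]


if __name__ == "__main__":
    # Hypothetical sample values; substitute your own.
    command = create_studio_command(
        name="example-studio",
        vpc_id="vpc-EXAMPLE",
        subnet_ids=["subnet-EXAMPLE1", "subnet-EXAMPLE2"],
        service_role="example-studio-service-role",
        user_role="example-studio-user-role",
        workspace_sg_id="sg-WORKSPACE",
        engine_sg_id="sg-ENGINE",
        default_s3_location="s3://example-bucket/studio-backups",
    )
    print(shlex.join(command))
```

To execute the command rather than print it, you can pass the returned list to `subprocess.run(command, check=True)`.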

The following is an example of the output you should see after you successfully create the Studio.

{
    "StudioId": "es-123XXXXXXXXX",
    "Url": "https://es-123XXXXXXXXX.emrstudio-prod.us-east-1.amazonaws.com"
}

Copy the StudioId, which you use to Assign a user or group to your EMR Studio. You should also note the Url, which is the Studio access URL that your team can use to log in to the Studio.
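
If you capture the command output in a script, you can extract both values programmatically. The following is a minimal Python sketch that assumes the AWS CLI is configured for its default JSON output format. The function name `parse_create_studio_output` is hypothetical, and the sample string reuses the placeholder values from the example output.

```python
import json


def parse_create_studio_output(raw_output):
    """Return (studio_id, url) from the JSON printed by `aws emr create-studio`."""
    response = json.loads(raw_output)
    return response["StudioId"], response["Url"]


if __name__ == "__main__":
    # Placeholder output matching the example in this section.
    sample = (
        '{"StudioId": "es-123XXXXXXXXX", '
        '"Url": "https://es-123XXXXXXXXX.emrstudio-prod.us-east-1.amazonaws.com"}'
    )
    studio_id, url = parse_create_studio_output(sample)
    print(studio_id)
    print(url)
```

Alternatively, the AWS CLI global options `--query StudioId --output text` return only the Studio ID, which can be simpler in shell scripts.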