Create an EMR Studio - Amazon EMR

Create an EMR Studio

You can create an EMR Studio for your team using the Amazon EMR console or the AWS CLI. Creating a Studio instance is part of setting up Amazon EMR Studio.

Prerequisites

Before you create a Studio, make sure you've completed the previous tasks in Set up an EMR Studio.

To create a Studio using the AWS CLI, you should have the latest version installed. For more information, see Installing or updating the latest version of the AWS CLI.

Important

Deactivate proxy management tools such as FoxyProxy or SwitchyOmega in the browser before you create a Studio. Active proxies can result in a Network Failure error message when you choose Create Studio.

Console

To create an EMR Studio using the EMR console

  1. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/home

  2. Choose EMR Studio from the left navigation.

  3. Choose Create Studio to open the Create a Studio page.

  4. Enter a Studio name and an optional Description.

  5. If you use IAM authentication for the Studio, you can choose Add new tag to add one or more key-value tags of your choice to the Studio. You use tags to give specified users access to the Studio. For more information, see Assign a user or group to an EMR Studio.

    You can also add tags to help you manage, identify, organize, and filter Studios. For more information, see Tagging AWS resources.

  6. Under Networking, choose an Amazon Virtual Private Cloud (VPC) for the Studio from the dropdown list.

  7. Under Subnets, select a maximum of five subnets in your VPC to associate with the Studio. You have the option to add more subnets after you create the Studio.

  8. For Security groups, choose either the default security groups or custom security groups. For more information, see Define security groups to control EMR Studio network traffic.

    If you choose the default EMR Studio security groups:

    • To enable Git-based repository linking for the Studio, choose Enable clusters/endpoints and Git repository. Otherwise choose Enable clusters/endpoints.

    If you choose custom security groups for your Studio:

    • Under Cluster/endpoint security group, select the engine security group that you configured from the dropdown list. Your Studio uses this security group to allow inbound access from attached Workspaces.

    • Under Workspace security group, select the Workspace security group that you configured from the dropdown list. Your Studio uses this security group with Workspaces to provide outbound access to attached Amazon EMR clusters and publicly hosted Git repositories.

  9. Under Authentication, choose an authentication mode for the Studio and provide information according to the following table. To learn more about authentication for EMR Studio, see Choose an authentication mode for Amazon EMR Studio.

    If you use IAM authentication or federation:

    Choose a login method for the Studio.

    If you want federated users to log in using the Studio URL and credentials for your identity provider (IdP), select your IdP from the dropdown list, and enter your Identity provider (IdP) login URL and RelayState parameter name.

    For a list of IdP authentication URLs and RelayState names, see Identity provider RelayState parameters and authentication URLs.

    Then, select your EMR Studio Service role from the dropdown list. For more information, see Create an EMR Studio service role.

    If you use AWS SSO authentication:

    Select your EMR Studio Service Role and User Role. For more information, see Create an EMR Studio service role and Create an EMR Studio user role for AWS SSO authentication mode.

  10. Under Workspace storage, choose Browse S3 to select your Amazon S3 bucket for backing up Workspaces and notebook files.

    Note

    Your EMR Studio service role must have read and write access to the bucket that you select.

  11. Choose Create Studio to finish and navigate to the Studios page. Your new Studio appears in the list with details such as Studio name, Creation date, and Studio access URL.

After you create a Studio, follow the instructions in Assign a user or group to an EMR Studio.
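The Studios list described in step 11 can also be checked from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured for the Region where you created the Studio. The function name list_studios is hypothetical; the fields in the --query expression correspond to the Studio name, creation date, and access URL shown in the console list.

```shell
# Sketch: print the name, creation time, and access URL of each Studio.
# list_studios is a hypothetical helper name, not part of the AWS CLI.
list_studios() {
  # --query and --output are global AWS CLI options.
  aws emr list-studios \
    --query 'Studios[].[Name,CreationTime,Url]' \
    --output table
}

# Usage (calls AWS, so it is left commented out here):
# list_studios
```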

CLI

Note

Linux line continuation characters (\) are included for readability. They can be removed or used in Linux commands. For Windows, remove them or replace with a caret (^).

Example CLI command to create an EMR Studio with IAM authentication mode

The following example AWS CLI command creates an EMR Studio with IAM authentication mode. When you use IAM authentication or federation for the Studio, you don't specify a --user-role.

To let federated users log in using the Studio URL and credentials for your identity provider (IdP), specify your --idp-auth-url and --idp-relay-state-parameter-name. For a list of IdP authentication URLs and RelayState names, see Identity provider RelayState parameters and authentication URLs.

aws emr create-studio \
  --name <example-studio-name> \
  --auth-mode IAM \
  --vpc-id <example-vpc-id> \
  --subnet-ids <subnet-id-1> <subnet-id-2>... <subnet-id-5> \
  --service-role <example-studio-service-role-name> \
  --workspace-security-group-id <example-workspace-sg-id> \
  --engine-security-group-id <example-engine-sg-id> \
  --default-s3-location <example-s3-location> \
  --idp-auth-url <https://EXAMPLE/login/> \
  --idp-relay-state-parameter-name <example-RelayState>
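Because the console procedure limits a Studio to five subnets at creation, it can help to check the subnet count before calling the CLI. The following is a rough sketch of a wrapper for the IAM case without federation options. The function name create_iam_studio and its argument order are hypothetical, and requiring at least one subnet is an assumption of this sketch.

```shell
# Sketch: validate the subnet count, then call create-studio in IAM mode.
# create_iam_studio and its argument order are hypothetical.
create_iam_studio() {
  if [ "$#" -lt 7 ]; then
    echo "usage: create_iam_studio NAME VPC_ID SERVICE_ROLE WORKSPACE_SG ENGINE_SG S3_LOCATION SUBNET_ID..." >&2
    return 1
  fi
  name=$1
  vpc_id=$2
  service_role=$3
  workspace_sg=$4
  engine_sg=$5
  s3_location=$6
  shift 6   # the remaining arguments are subnet IDs

  # A Studio accepts a maximum of five subnets when you create it.
  if [ "$#" -gt 5 ]; then
    echo "expected at most 5 subnet IDs, got $#" >&2
    return 1
  fi

  # No --user-role here: it is not specified for IAM authentication mode.
  aws emr create-studio \
    --name "$name" \
    --auth-mode IAM \
    --vpc-id "$vpc_id" \
    --subnet-ids "$@" \
    --service-role "$service_role" \
    --workspace-security-group-id "$workspace_sg" \
    --engine-security-group-id "$engine_sg" \
    --default-s3-location "$s3_location"
}

# Usage (calls AWS, so it is left commented out here):
# create_iam_studio my-studio vpc-0abc my-service-role sg-ws sg-eng s3://my-bucket/studio subnet-1 subnet-2
```

To support federated login, you would append --idp-auth-url and --idp-relay-state-parameter-name to the command inside the function.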

Example CLI command to create an EMR Studio with AWS SSO authentication mode

The following AWS CLI example command creates an EMR Studio that uses AWS SSO authentication mode. When you use AWS SSO authentication, you must specify a --user-role.

For more information about AWS SSO authentication mode, see Set up SSO authentication mode for Amazon EMR Studio.

aws emr create-studio \
  --name <example-studio-name> \
  --auth-mode SSO \
  --vpc-id <example-vpc-id> \
  --subnet-ids <subnet-id-1> <subnet-id-2>... <subnet-id-5> \
  --service-role <example-studio-service-role-name> \
  --user-role <example-studio-user-role-name> \
  --workspace-security-group-id <example-workspace-sg-id> \
  --engine-security-group-id <example-engine-sg-id> \
  --default-s3-location <example-s3-location>
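The two examples differ in how they treat --user-role: it is required for SSO and not specified for IAM. A script that supports both modes could check this before calling the CLI. The following is a minimal sketch; check_user_role is a hypothetical helper, and rejecting a user role in IAM mode is a strictness choice made for this sketch.

```shell
# Sketch: check that the user role matches the chosen authentication mode.
# check_user_role is a hypothetical helper, not part of the AWS CLI.
check_user_role() {
  auth_mode=$1
  user_role=${2:-}
  case "$auth_mode" in
    SSO)
      # AWS SSO authentication mode requires a user role.
      if [ -z "$user_role" ]; then
        echo "SSO mode requires --user-role" >&2
        return 1
      fi
      ;;
    IAM)
      # IAM authentication or federation does not take a user role.
      if [ -n "$user_role" ]; then
        echo "IAM mode does not use --user-role" >&2
        return 1
      fi
      ;;
    *)
      echo "unknown auth mode: $auth_mode" >&2
      return 1
      ;;
  esac
}

# Usage:
# check_user_role SSO my-studio-user-role && echo "ok to call create-studio"
```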

Example CLI output for aws emr create-studio

The following is an example of the output that appears after you create a Studio.

{
    "StudioId": "es-123XXXXXXXXX",
    "Url": "https://es-123XXXXXXXXX.emrstudio-prod.us-east-1.amazonaws.com"
}
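In a script, you may want the Studio ID and URL as shell variables rather than as JSON. The following is a sketch of one way to do that, assuming the two keys shown in the example output. The function name create_studio and the variable names are hypothetical; --query and --output are global AWS CLI options.

```shell
# Sketch: run create-studio and capture the two output fields.
# create_studio, studio_id, and studio_url are hypothetical names.
create_studio() {
  # Pass the same options as in the examples, for instance --name and --auth-mode.
  result=$(aws emr create-studio "$@" \
    --query '[StudioId,Url]' \
    --output text) || return 1

  # With text output, the two selected values are separated by a tab.
  studio_id=$(printf '%s\n' "$result" | cut -f1)
  studio_url=$(printf '%s\n' "$result" | cut -f2)

  echo "Studio ID:  $studio_id"
  echo "Studio URL: $studio_url"
}

# Usage (calls AWS, so it is left commented out here):
# create_studio --name my-studio --auth-mode IAM ...
```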

For more information about the create-studio command, see AWS CLI Command Reference.

Identity provider RelayState parameters and authentication URLs

When you use IAM federation and want users to log in using your Studio URL and the credentials for your identity provider (IdP), you can specify your Identity provider (IdP) login URL and RelayState parameter name when you create an EMR Studio.

The following table shows the standard authentication URL and RelayState parameter name for some popular identity providers.

Identity provider   Parameter        Authentication URL
Auth0               RelayState       https://<sub_domain>.auth0.com/samlp/<app_id>
Google accounts     RelayState       https://accounts.google.com/o/saml2/initsso?idpid=<idp_id>&spid=<sp_id>&forceauthn=false
Microsoft Azure     RelayState       https://myapps.microsoft.com/signin/<app_name>/<app_id>?tenantId=<tenant_id>
Okta                RelayState       https://<sub_domain>.okta.com/app/<app_name>/<app_id>/sso/saml
PingFederate        TargetResource   https://<host>/idp/<idp_id>/startSSO.ping?PartnerSpId=<sp_id>
PingOne             TargetResource   https://sso.connect.pingidentity.com/sso/sp/initsso?saasid=<app_id>&idpid=<idp_id>
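If you script Studio creation for several identity providers, the Parameter column can be captured in a small lookup. The following is an illustrative sketch only. The function name and the short provider keys (Auth0, Google, Azure, Okta, PingFederate, PingOne) are hypothetical labels chosen for this example; the returned values come from the table above.

```shell
# Sketch: return the RelayState parameter name for a given identity provider.
# The provider keys are hypothetical labels, not values the AWS CLI recognizes.
relay_state_parameter_name() {
  case "$1" in
    Auth0|Google|Azure|Okta)
      echo "RelayState"
      ;;
    PingFederate|PingOne)
      echo "TargetResource"
      ;;
    *)
      echo "unknown identity provider: $1" >&2
      return 1
      ;;
  esac
}

# Usage: pass the result to --idp-relay-state-parameter-name.
# param=$(relay_state_parameter_name Okta)
```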