Custom onboarding to Amazon SageMaker Domain using IAM

This topic describes how to onboard to Amazon SageMaker Domain using the Set up for organizations procedure for AWS Identity and Access Management (IAM) authentication from the SageMaker console or the AWS CLI. To onboard faster using IAM, see Quick onboarding.

For information on how to onboard using AWS IAM Identity Center (IAM Identity Center), see Custom onboarding using IAM Identity Center.

Onboard using the console

To onboard to Domain using IAM
  1. Open the SageMaker console.

  2. On the left navigation pane, choose Admin configurations.

  3. Under Admin configurations, choose Domains.

  4. From the Domains page, choose Create domain.

  5. On the Setup SageMaker Domain page, choose Set up for organizations.

  6. Select Configure.

Step 1: Domain details

  1. For Domain Name, enter a unique name for your Domain. For example, this can be your project or team name.

  2. Choose Next.

Step 2: Users and ML Activities

Select the group or create the users for the Domain, and choose which ML activities they have permission to access.

In these setup instructions, we use the Login through IAM option.

The IAM role you configure in this step is assigned to all of the users you add in this step.

  1. Under How do you want to access Studio?, choose Login through IAM.

  2. Under Who will use Studio? add the user profile names. To add a user profile name, choose Add user, enter a user profile name, then choose Select.

  3. Under What ML activities do they perform?, either choose Use an existing role, or choose Create a new role and select the ML activities that the role should have access to. You can select at most 10 ML activities.

  4. While selecting ML activities, you may need to satisfy requirements. To satisfy a requirement, choose Add and complete the requirement.

  5. After all requirements are satisfied, choose Next.
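For comparison with the console flow above, user profiles can also be added to a Domain that already exists by using the AWS CLI. The following is a minimal sketch, not a required step. The function name add_user_profile and all argument values are hypothetical placeholders. No per-user settings are passed, on the assumption that the profile should use the Domain's default user settings.

```shell
# Hypothetical helper: add one user profile to an existing Domain.
# $1: AWS Region, $2: Domain ID, $3: user profile name
add_user_profile() {
    aws --region "$1" sagemaker create-user-profile \
        --domain-id "$2" \
        --user-profile-name "$3"
}

# Example call with placeholder values (commented out so that the block
# only defines the function):
# add_user_profile us-east-1 d-xxxxxxxxxxxx data-scientist-1
```

Repeat the call once for each user profile name that you would otherwise enter under Who will use Studio?.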

Step 3: Applications

In this step, you can configure the applications you have enabled in the previous step. For more information on the ML activities, see ML activity reference.

If the application has not been enabled, you receive a warning for that application. To enable an application that has not been enabled, return to the previous step by choosing Back and follow the previous instructions.

SageMaker Studio configuration:

Under SageMaker Studio, you can choose either the new or the classic version of Studio as your default experience. This choice determines which ML environment you interact with after opening Studio.

  • SageMaker Studio - New includes multiple integrated development environments (IDEs) and applications, including Amazon SageMaker Studio Classic. If chosen, the Studio Classic IDE has default settings. For information on the default settings, see Default settings.

  • SageMaker Studio Classic includes the Jupyter IDE. If chosen, you can configure your Studio Classic settings.

    For information on Studio Classic, see Amazon SageMaker Studio Classic.

SageMaker Canvas configuration:

If you have Amazon SageMaker Canvas enabled, see Getting started with using Amazon SageMaker Canvas for the instructions and configuration details for onboarding.

SageMaker Studio Classic configuration:

If you chose SageMaker Studio - New (recommended) as your default experience, the Studio Classic IDE has default settings. For information on the default settings, see Default settings.

If you chose Studio Classic as your default experience, you can choose to enable or disable notebook resource sharing. Notebook resources include artifacts such as cell output and Git repositories. For more information on Notebook resources, see Share and Use an Amazon SageMaker Studio Classic Notebook.

If you enabled notebook resource sharing:

  1. Under S3 location for shareable notebook resources, input your Amazon S3 location.

  2. Under Encryption key - optional, leave as No Custom Encryption, choose an existing AWS KMS key, or choose Enter a KMS key ARN and enter your AWS KMS key's ARN.

  3. Under Notebook cell output sharing preference, choose Allow users to share cell output or Disable cell output sharing.
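The three console fields above correspond to the SharingSettings structure in the Domain's default user settings (NotebookOutputOption, S3OutputPath, and S3KmsKeyId). The following sketch builds that structure in AWS CLI shorthand syntax. The function name sharing_settings, the bucket, and the key are hypothetical placeholders.

```shell
# Hypothetical helper: build the SharingSettings shorthand string.
# $1: S3 location for shareable notebook resources
# $2: cell output sharing preference, "Allowed" or "Disabled"
# $3: optional AWS KMS key ID or ARN
sharing_settings() {
    settings="NotebookOutputOption=$2,S3OutputPath=$1"
    if [ -n "${3:-}" ]; then
        settings="$settings,S3KmsKeyId=$3"
    fi
    printf 'SharingSettings={%s}\n' "$settings"
}

# Example use with placeholder values (commented out so that the block
# only defines the function):
# aws --region region sagemaker update-domain --domain-id domain-id \
#     --default-user-settings "$(sharing_settings s3://amzn-s3-demo-bucket/sharing Allowed)"
```

Omitting the third argument leaves out S3KmsKeyId, which matches leaving the console field as No Custom Encryption.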

RStudio configuration:

To enable RStudio you will need an RStudio license. To set that up, see RStudio license.

  1. Under RStudio Workbench, verify that your RStudio license is automatically detected. For more information about getting an RStudio license and activating it with SageMaker, see RStudio license.

  2. Select an instance type to launch your RStudio Server on. For more information, see RStudioServerPro instance type.

  3. Under Permission, create your role or select an existing role. The role must have the following permissions policy. This policy allows the RStudioServerPro application to access necessary resources. It also allows Amazon SageMaker to automatically launch an RStudioServerPro app when the existing RStudioServerPro application is in a Deleted or Failed status. For information about adding permissions to a role, see Modifying a role permissions policy (console).

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "license-manager:ExtendLicenseConsumption",
                    "license-manager:ListReceivedLicenses",
                    "license-manager:GetLicense",
                    "license-manager:CheckoutLicense",
                    "license-manager:CheckInLicense",
                    "logs:CreateLogDelivery",
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:DeleteLogDelivery",
                    "logs:Describe*",
                    "logs:GetLogDelivery",
                    "logs:GetLogEvents",
                    "logs:ListLogDeliveries",
                    "logs:PutLogEvents",
                    "logs:PutResourcePolicy",
                    "logs:UpdateLogDelivery",
                    "sagemaker:CreateApp"
                ],
                "Resource": "*"
            }
        ]
    }
  4. Under RStudio Connect, add the URL for your RStudio Connect server. RStudio Connect is a publishing platform for Shiny applications, R Markdown reports, dashboards, plots, and more. When you onboard to RStudio on SageMaker, an RStudio Connect server is not created. For more information, see RStudio Connect URL.

  5. Under RStudio Package Manager, add the URL for your RStudio Package Manager. SageMaker creates a default package repository for the Package Manager when you onboard RStudio. For more information about RStudio Package Manager, see RStudio Package Manager.

  6. Select Next.
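If you prefer the AWS CLI to the console for the permissions in step 3, one option is to save the permissions policy shown in that step to a local file and attach it to the role as an inline policy. The following is a sketch under that assumption. The function name attach_rstudio_policy, the policy name, and the file name are hypothetical placeholders.

```shell
# Hypothetical helper: attach a saved policy document to a role as an
# inline policy.
# $1: role name, $2: inline policy name, $3: path to the policy JSON file
attach_rstudio_policy() {
    if [ ! -f "$3" ]; then
        echo "Policy file not found: $3" >&2
        return 1
    fi
    aws iam put-role-policy \
        --role-name "$1" \
        --policy-name "$2" \
        --policy-document "file://$3"
}

# Example call with placeholder values (commented out so that the block
# only defines the function):
# attach_rstudio_policy rstudio-role-name rstudio-permissions rstudio-policy.json
```

The file check only guards against a mistyped path. It does not validate the contents of the policy.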

Code Editor configuration:

If you have Code Editor enabled, see Code Editor for an overview and the configuration details.

Step 4: Network

Choose how you want Studio to connect to other AWS services.

You can disable internet access to Studio by specifying the Virtual Private Cloud (VPC) Only network access type. If you choose this option, you cannot run a Studio notebook unless your VPC has an interface endpoint to the SageMaker API and runtime, or a Network Address Translation (NAT) gateway with internet access, and your security groups allow outbound connections. For more information on Amazon VPCs, see Choose an Amazon VPC.

If you choose Virtual Private Cloud (VPC) Only, all of the following steps are required. If you choose Public internet access, only the first two of the following steps are required.

  1. Under VPC, choose the Amazon VPC ID.

  2. Under Subnet, choose one or more subnets. If you don't choose any subnets, SageMaker uses all the subnets in the Amazon VPC. We recommend that you use multiple subnets that are not created in constrained Availability Zones. Using subnets in these constrained Availability Zones can result in insufficient capacity errors and longer application creation times. For more information about constrained Availability Zones, see Availability Zones.

  3. Under Security group(s), choose one or more security groups.

If VPC only is selected, SageMaker automatically applies the security group settings defined for the Domain to all shared spaces created in the Domain. If Public internet only is selected, SageMaker does not apply the security group settings to shared spaces created in the Domain.
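Before choosing Virtual Private Cloud (VPC) Only, it can help to confirm that the VPC already has interface endpoints for the SageMaker API and runtime. The following sketch lists matching endpoints. The function name list_sagemaker_endpoints and the argument values are hypothetical placeholders. The sketch assumes the standard endpoint service names com.amazonaws.region.sagemaker.api and com.amazonaws.region.sagemaker.runtime.

```shell
# Hypothetical helper: list SageMaker API and runtime endpoints in a VPC.
# $1: AWS Region, $2: VPC ID
list_sagemaker_endpoints() {
    aws --region "$1" ec2 describe-vpc-endpoints \
        --filters "Name=vpc-id,Values=$2" \
                  "Name=service-name,Values=com.amazonaws.$1.sagemaker.api,com.amazonaws.$1.sagemaker.runtime" \
        --query "VpcEndpoints[].[ServiceName,VpcEndpointType,State]" \
        --output text
}

# Example call with placeholder values (commented out so that the block
# only defines the function):
# list_sagemaker_endpoints us-east-1 vpc-0123456789abcdef0
```

Empty output means that no matching endpoints were found. The sketch does not check for the NAT gateway alternative or for security group rules. When creating a Domain from the AWS CLI, the network access type is set with the --app-network-access-type option of create-domain, which accepts PublicInternetOnly or VpcOnly.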

Step 5: Storage

You have the option to encrypt the Amazon Elastic File System (Amazon EFS) and Amazon Elastic Block Store (Amazon EBS) file systems that are created for you when you create a Domain. Amazon EBS sizes are used by both Code Editor and JupyterLab spaces.

You cannot change the encryption key after you encrypt your Amazon EFS and Amazon EBS file systems. To encrypt your Amazon EFS and Amazon EBS file systems, you can use the following configurations.

  • Under Encryption key - optional, leave as No Custom Encryption, choose an existing KMS key, or choose Enter a KMS key ARN and enter the ARN of your KMS key.

  • Under Default space size - optional, enter the default space size.

  • Under Maximum space size - optional, enter the maximum space size.
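For AWS CLI users, the two space size fields correspond, to the best of our understanding, to the SpaceStorageSettings structure in the default user settings, and the encryption key corresponds to the --kms-key-id option of create-domain. The following sketch builds the storage settings in shorthand syntax. The function name space_storage_settings and the sizes are hypothetical placeholders. Verify the field names against the current API reference before relying on them.

```shell
# Hypothetical helper: build the SpaceStorageSettings shorthand string.
# $1: default space size in GB, $2: maximum space size in GB
space_storage_settings() {
    if [ "$1" -gt "$2" ]; then
        echo "Default size ($1) must not exceed maximum size ($2)" >&2
        return 1
    fi
    printf 'SpaceStorageSettings={DefaultEbsStorageSettings={DefaultEbsVolumeSizeInGb=%d,MaximumEbsVolumeSizeInGb=%d}}\n' "$1" "$2"
}

# Example use with placeholder values (commented out so that the block
# only defines the function):
# aws --region region sagemaker create-domain ... \
#     --kms-key-id kms-key-arn \
#     --default-user-settings "ExecutionRole=execution-role-arn,$(space_storage_settings 5 100)"
```

The size check is a local convenience that catches an inverted pair of values before the request is sent.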

Step 6: Review and create

Review your Domain settings. If you need to change the settings, choose Edit next to the relevant step. Once you confirm that your Domain settings are accurate, choose Submit and the Domain is created for you. This process may take a few minutes.

Onboard using the AWS CLI

Use the following commands to onboard to a Domain using IAM authentication from the AWS CLI.

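  1. Create an execution role that is used to create a Domain and attach the AmazonSageMakerFullAccess policy. The create-role command requires a trust policy document. In the following command, trust-policy.json is a placeholder for a local file that contains a trust policy granting SageMaker permission to assume the role. You can also use an existing role that has, at a minimum, an attached trust policy that grants SageMaker permission to assume the role. For more information, see SageMaker Roles.

    aws iam create-role --role-name execution-role-name --assume-role-policy-document file://trust-policy.json
    aws iam attach-role-policy --role-name execution-role-name --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess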
  2. Get the default Amazon Virtual Private Cloud (Amazon VPC) of your account.

    aws --region region ec2 describe-vpcs --filters Name=isDefault,Values=true --query "Vpcs[0].VpcId" --output text
  3. Get the list of subnets in the default Amazon VPC.

    aws --region region ec2 describe-subnets --filters Name=vpc-id,Values=default-vpc-id --query "Subnets[*].SubnetId" --output json
  4. Create a Domain by passing the default Amazon VPC ID, subnets, and execution role ARN. You must also pass a SageMaker image ARN. For information on the available JupyterLab version ARNs, see Setting a default JupyterLab version.

    aws --region region sagemaker create-domain --domain-name domain-name \
        --vpc-id default-vpc-id \
        --subnet-ids subnet-ids \
        --auth-mode IAM \
        --default-user-settings "ExecutionRole=arn:aws:iam::account-number:role/execution-role-name,JupyterServerAppSettings={DefaultResourceSpec={InstanceType=system,SageMakerImageArn=image-arn}}" \
        --query DomainArn --output text
  5. Verify that the Domain has been created.

    aws --region region sagemaker list-domains
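
Two parts of the procedure above can be scripted further. Step 1 mentions a trust policy that grants SageMaker permission to assume the execution role, and Domain creation can take a few minutes to finish. The following sketch writes such a trust policy to a file and polls the Domain status until it reaches InService. The function names, file name, and retry values are hypothetical placeholders.

```shell
# Hypothetical helper: write a trust policy that lets the SageMaker
# service assume the execution role.
# $1: output file path
write_trust_policy() {
    cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "sagemaker.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Hypothetical helper: poll the Domain status until it is InService.
# $1: AWS Region, $2: Domain ID, $3: maximum attempts, $4: seconds between attempts
wait_for_domain() {
    attempt=0
    while [ "$attempt" -lt "$3" ]; do
        status=$(aws --region "$1" sagemaker describe-domain \
            --domain-id "$2" --query Status --output text)
        case "$status" in
            InService) return 0 ;;
            Failed) echo "Domain creation failed" >&2; return 1 ;;
        esac
        attempt=$((attempt + 1))
        sleep "$4"
    done
    echo "Timed out waiting for Domain $2" >&2
    return 1
}

# Example use with placeholder values (commented out so that the block
# only defines the functions):
# write_trust_policy trust-policy.json
# wait_for_domain us-east-1 d-xxxxxxxxxxxx 30 20
```

The Domain ID is the final segment of the Domain ARN returned in step 4, and it also appears in the list-domains output from step 5.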