Local mode support in Amazon SageMaker Studio

Important

Custom IAM policies that allow Amazon SageMaker Studio or Amazon SageMaker Studio Classic to create Amazon SageMaker resources must also grant permissions to add tags to those resources. The permission to add tags to resources is required because Studio and Studio Classic automatically tag any resources they create. If an IAM policy allows Studio and Studio Classic to create resources but does not allow tagging, "AccessDenied" errors can occur when trying to create resources. For more information, see Provide Permissions for Tagging SageMaker Resources.

AWS Managed Policies for Amazon SageMaker that give permissions to create SageMaker resources already include permissions to add tags while creating those resources.

Amazon SageMaker Studio applications support the use of local mode to create estimators, processors, and pipelines, then deploy them to a local environment. With local mode, you can test machine learning scripts before running them in Amazon SageMaker managed training or hosting environments. Studio supports local mode in the following applications:

  • Amazon SageMaker Studio Classic

  • JupyterLab

  • Code Editor, based on Code-OSS, Visual Studio Code - Open Source

Local mode in Studio applications is invoked using the SageMaker Python SDK. In Studio applications, local mode functions similarly to how it functions in Amazon SageMaker notebook instances, with some differences. For more information about using local mode with the SageMaker Python SDK, see Local Mode.

Note

Studio applications do not support multi-container jobs in local mode. Local mode jobs are limited to a single instance for training, inference, and processing jobs. When creating a local mode job, the instance count configuration must be 1.
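As a minimal sketch of the single-instance rule, the following Python helper assembles keyword arguments for a SageMaker Python SDK Estimator in local mode and rejects any instance count other than 1. The helper names (local_mode_kwargs, build_local_estimator) are hypothetical and not part of the SDK, and the image URI and role that you pass in are placeholders that you must supply.

```python
def local_mode_kwargs(image_uri, role, instance_count=1, instance_type="local"):
    """Return Estimator keyword arguments for a local mode job.

    Hypothetical helper: it only enforces the Studio rule that local mode
    jobs run on exactly one instance.
    """
    if instance_type not in ("local", "local_gpu"):
        raise ValueError("local mode expects instance_type 'local' or 'local_gpu'")
    if instance_count != 1:
        raise ValueError("Studio local mode jobs must use instance_count=1")
    return {
        "image_uri": image_uri,
        "role": role,
        "instance_count": instance_count,
        "instance_type": instance_type,
    }


def build_local_estimator(image_uri, role):
    """Create a local mode Estimator (requires the sagemaker package)."""
    # Imported lazily so the validation helper works without the SDK installed.
    from sagemaker.estimator import Estimator

    return Estimator(**local_mode_kwargs(image_uri, role))
```

Calling build_local_estimator(...).fit(...) would then run the training container through the Docker access described in the rest of this page, assuming the prerequisites below are met.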

As part of local mode support, Studio applications support limited Docker access capabilities. With this support, users can interact with the Docker API from Jupyter notebooks or the image terminal of the application. Customers can interact with Docker using one of the following:

Prerequisites

Complete the following prerequisites to use local mode in Studio applications:

  • To pull images from an Amazon Elastic Container Registry repository, the account hosting the Amazon ECR image must provide access permission for the user’s execution role. The domain’s execution role must also allow Amazon ECR access.

  • Verify that you are using the latest version of the SageMaker Python SDK by using the following command:

    pip install -U sagemaker
  • To use local mode and Docker capabilities, set the following parameter of the domain’s DockerSettings using the AWS Command Line Interface (AWS CLI): 

    EnableDockerAccess : ENABLED
  • Using EnableDockerAccess, you can also control whether users in the domain can use local mode. By default, local mode and Docker capabilities aren't allowed in Studio applications. For more information, see Setting EnableDockerAccess.

  • Install the Docker CLI in the Studio application by following the steps in Docker installation.

Setting EnableDockerAccess

The following sections show how to set EnableDockerAccess when the domain has public internet access or is in VPC-only mode.

Note

Changes to EnableDockerAccess only apply to applications created after the domain is updated. You must create a new application after updating the domain.

Public internet access

The following example commands show how to set EnableDockerAccess when creating a new domain or updating an existing domain with public internet access:

    # create new domain
    aws --region region \
        sagemaker create-domain --domain-name domain-name \
        --vpc-id vpc-id \
        --subnet-ids subnet-ids \
        --auth-mode IAM \
        --default-user-settings "ExecutionRole=execution-role" \
        --domain-settings '{"DockerSettings": {"EnableDockerAccess": "ENABLED"}}' \
        --query DomainArn \
        --output text

    # update domain
    aws --region region \
        sagemaker update-domain --domain-id domain-id \
        --domain-settings-for-update '{"DockerSettings": {"EnableDockerAccess": "ENABLED"}}'

VPC-only mode

When using a domain in VPC-only mode, Docker image push and pull requests are routed through the service VPC instead of the VPC configured by the customer. Because of this routing, administrators can configure a list of trusted AWS accounts to which users can make Amazon ECR Docker pull and push requests.

If a Docker image push or pull request is made to an AWS account that is not in the list of trusted AWS accounts, the request fails. Docker pull and push operations outside of Amazon Elastic Container Registry (Amazon ECR) aren't supported in VPC-only mode.

The following AWS accounts are trusted by default:

  • The account hosting the SageMaker domain.

  • SageMaker accounts that host the following SageMaker images:

    • DLC framework images

    • Sklearn, Spark, XGBoost processing images
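To illustrate the trusted-account rule, the following sketch extracts the AWS account ID from an Amazon ECR image URI and compares it with a list of trusted accounts. It is a client-side pre-check only, not how Studio enforces the rule. The function names are hypothetical, and because the SageMaker-owned accounts that are trusted by default are not listed on this page, the caller must supply every account that should be treated as trusted.

```python
import re

# Amazon ECR private registry hosts have the form
# <12-digit-account>.dkr.ecr.<region>.amazonaws.com
_ECR_HOST = re.compile(r"^([0-9]{12})\.dkr\.ecr\.([a-z0-9-]+)\.amazonaws\.com(\.cn)?$")


def ecr_account(image_uri):
    """Return the account ID embedded in an ECR image URI, or None if not ECR."""
    host = image_uri.split("/", 1)[0]
    match = _ECR_HOST.match(host)
    return match.group(1) if match else None


def is_pull_allowed(image_uri, trusted_accounts):
    """Rough pre-check: the image is in Amazon ECR and in a trusted account."""
    account = ecr_account(image_uri)
    return account is not None and account in set(trusted_accounts)
```

An image outside Amazon ECR always fails this check, which mirrors the VPC-only restriction described above.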

To configure a list of additional trusted AWS accounts, specify the VpcOnlyTrustedAccounts value as follows:

    aws --region region \
        sagemaker update-domain --domain-id domain-id \
        --domain-settings-for-update '{"DockerSettings": {"EnableDockerAccess": "ENABLED", "VpcOnlyTrustedAccounts": ["account-list"]}}'
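Hand-writing the JSON argument is error-prone, so one option is to build it in Python. The sketch below is illustrative: docker_settings and update_domain_docker_settings are hypothetical helper names, and the second function assumes that boto3 is installed and that credentials with permission to update the domain are available.

```python
import json
import re


def docker_settings(enabled=True, trusted_accounts=()):
    """Build the DockerSettings structure used in the commands above."""
    settings = {"EnableDockerAccess": "ENABLED" if enabled else "DISABLED"}
    accounts = [str(account) for account in trusted_accounts]
    for account in accounts:
        if not re.fullmatch(r"[0-9]{12}", account):
            raise ValueError(f"not a 12-digit AWS account ID: {account!r}")
    if accounts:
        settings["VpcOnlyTrustedAccounts"] = accounts
    return {"DockerSettings": settings}


def update_domain_docker_settings(domain_id, region, **kwargs):
    """Apply the settings with boto3 (assumes boto3 and credentials exist)."""
    # Imported lazily so docker_settings() works without boto3 installed.
    import boto3

    client = boto3.client("sagemaker", region_name=region)
    return client.update_domain(
        DomainId=domain_id,
        DomainSettingsForUpdate=docker_settings(**kwargs),
    )


# JSON string suitable for the --domain-settings-for-update argument.
print(json.dumps(docker_settings(trusted_accounts=["111122223333"])))
```

The account ID 111122223333 is a placeholder. As with the CLI commands, the new settings apply only to applications created after the domain is updated.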

Docker support

Studio also supports limited Docker access capabilities with the following restrictions:

  • Usage of Docker networks is not supported.

  • Docker volume usage is not supported during container run. Only volume bind mount inputs are allowed during container orchestration. For Studio Classic, the volume bind mount inputs must be located on the Amazon Elastic File System (Amazon EFS) volume. For JupyterLab and Code Editor applications, they must be located on the Amazon Elastic Block Store (Amazon EBS) volume.

  • Container inspect operations are allowed.

  • Container-to-host port mapping is not allowed. However, you can specify a port for hosting. The endpoint is then accessible from Studio using the following URL:

    http://localhost:port
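As a small illustration of the URL form above, the following sketch builds the localhost URL for a given port and probes it. The function names are hypothetical. The /ping route is the health-check path that SageMaker inference containers serve, so the probe assumes that the hosted container follows that convention.

```python
import urllib.request


def local_endpoint_url(port, path=""):
    """Build the http://localhost:port URL for a locally hosted endpoint."""
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    return f"http://localhost:{port}/{path.lstrip('/')}".rstrip("/")


def ping(port, timeout=5):
    """Return True if the container answers GET /ping with HTTP 200.

    Containers that do not implement /ping will report False here even
    if they are otherwise healthy.
    """
    try:
        with urllib.request.urlopen(local_endpoint_url(port, "ping"), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers connection errors and HTTP error responses
        return False
```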

Docker operations supported

The following table lists all of the Docker API endpoints that are supported in Studio, including any support limitations. If an API endpoint is missing from the table, Studio doesn't support it.

API Documentation: Limitations

SystemAuth
SystemEvents
SystemVersion
SystemPing
SystemPingHead
ContainerCreate:

  • Containers cannot be run in Docker default bridge or custom Docker networks. Containers are run in the same network as the Studio application container.

  • Users can only use the following value for the network name: sagemaker. For example:

    docker run --net sagemaker parameter-values

  • Only bind mounts are allowed for volume usage. The host directory should exist on Amazon EFS for KernelGateway applications or Amazon EBS for other applications.

  • Containers cannot run in privileged mode or with elevated secure computing permissions.

ContainerStart
ContainerStop
ContainerKill
ContainerDelete
ContainerList
ContainerLogs
ContainerInspect
ContainerWait
ContainerAttach
ContainerPrune
ContainerResize
ImageCreate: VPC-only mode support is limited to Amazon ECR images in allowlisted accounts.
ImagePrune
ImagePush: VPC-only mode support is limited to Amazon ECR images in allowlisted accounts.
ImageList
ImageInspect
ImageGet
ImageDelete
ImageBuild:

  • VPC-only mode support is limited to Amazon ECR images in allowlisted accounts.

  • Users can only use the following value for the network name: sagemaker. For example:

    docker build --network sagemaker parameter-values
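The ContainerCreate limitations can be sketched as a pre-flight check on a docker run argument list. This is a simplified, hypothetical helper rather than Studio's own validation: it handles only separate tokens and the --flag=value form, and it treats a volume source that does not start with "/" as a named volume rather than a bind mount.

```python
def check_docker_run_args(args):
    """Return a list of problems found in a `docker run` argument list."""
    problems = []
    for i, arg in enumerate(args):
        # Accept "--flag=value"; otherwise peek at the next token for the value.
        flag, _, value = arg.partition("=")
        if not value and i + 1 < len(args):
            value = args[i + 1]
        if flag == "--privileged":
            problems.append("privileged mode is not allowed")
        elif flag in ("-p", "--publish", "-P", "--publish-all"):
            problems.append("container-to-host port mapping is not allowed")
        elif flag in ("--net", "--network") and value != "sagemaker":
            problems.append(f"network must be 'sagemaker', got {value!r}")
        elif flag in ("-v", "--volume") and not value.startswith("/"):
            problems.append(f"only bind mounts are allowed, got {value!r}")
    return problems
```

An empty result means that none of these restrictions were tripped. It does not verify that a bind mount source is on the Amazon EFS or Amazon EBS volume.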

Docker installation

To use Docker, you must manually install Docker from the terminal of your Studio application. The installation steps depend on whether the domain has access to the internet.

Internet access

If the domain is created with public internet access or in VPC-only mode with limited internet access, use the following steps to install Docker.

  1. (Optional) If your domain is created in VPC-only mode with limited internet access, create a public NAT gateway with access to the Docker website. For instructions, see NAT gateways.

  2. Navigate to the terminal of the Studio application that you want to install Docker in.

  3. To return the operating system of the application, run the following command from the terminal:

    cat /etc/os-release
  4. Install Docker following the instructions for the operating system of the application in the Amazon SageMaker Local Mode Examples repository.

    For example, install Docker on Ubuntu following the script at https://github.com/aws-samples/amazon-sagemaker-local-mode/blob/main/sagemaker_studio_docker_cli_install/sagemaker-ubuntu-focal-docker-cli-install.sh with the following considerations:

    • If chained commands fail, run commands one at a time.

    • Studio only supports Docker version 20.10.X and Docker Engine API version 1.41.

    • The following packages aren't required to use the Docker CLI in Studio and their installation can be skipped:

      • containerd.io

      • docker-ce

      • docker-buildx-plugin

    Note

    You do not need to start the Docker service in your applications. The instance that hosts the Studio application runs the Docker service by default. All Docker API calls are routed through the Docker service automatically.

  5. Use the exposed Docker socket for Docker interactions within Studio applications. By default, the following socket is exposed:

    unix:///docker/proxy.sock

    The following Studio application environment variable for the default USER uses this exposed socket:

    DOCKER_HOST
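To pick the right install script in the steps above, it can help to read the os-release fields programmatically. The following sketch parses the KEY=value format of the file. The function names are hypothetical, and the parser deliberately ignores shell escape sequences inside quoted values.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def read_os_release(path="/etc/os-release"):
    """Read and parse the os-release file of the current application."""
    with open(path, encoding="utf-8") as handle:
        return parse_os_release(handle.read())
```

For example, read_os_release().get("ID") and read_os_release().get("VERSION_CODENAME") return the distribution and release name to match against the available scripts in the examples repository.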

No internet access

If the domain is created in VPC-only mode with no internet access, use the following steps to install Docker.

  1. Navigate to the terminal of the Studio application that you want to install Docker in.

  2. Run the following command from the terminal to return the operating system of the application:

    cat /etc/os-release
  3. Download the required Docker .deb files to your local machine. For instructions about downloading the required files for the operating system of the Studio application, see Install Docker Engine.

    For example, install Docker from a package on Ubuntu by following steps 1–4 in Install from a package with the following considerations:

    • Install Docker from a package. Using other methods to install Docker will fail.

    • Install the latest packages corresponding to Docker version 20.10.X.

    • The following packages aren't required to use the Docker CLI in Studio, and you don't need to install them:

      • containerd.io

      • docker-ce

      • docker-buildx-plugin

    Note

    You do not need to start the Docker service in your applications. The instance that hosts the Studio application runs the Docker service by default. All Docker API calls are routed through the Docker service automatically.

  4. Upload the .deb files to the Amazon EFS file system or to the Amazon EBS file system of the application.

  5. Manually install the docker-ce-cli and docker-compose-plugin .deb packages from the Studio application terminal. For more information and instructions, see step 5 in Install from a package on the Docker docs website.

  6. Use the exposed Docker socket for Docker interactions within Studio applications. By default, the following socket is exposed:

    unix:///docker/proxy.sock

    The following Studio application environment variable for the default USER uses this exposed socket:

    DOCKER_HOST
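After installation, a quick sanity check is to confirm that DOCKER_HOST points at the exposed socket and that the installed client is a 20.10.X release. The sketch below uses hypothetical helper names and assumes that you obtain the version string yourself, for example from the output of docker version --format '{{.Client.Version}}'.

```python
import os
import re

# Socket that Studio exposes by default, as stated above.
EXPECTED_SOCKET = "unix:///docker/proxy.sock"


def is_supported_docker_version(version):
    """True for Docker 20.10.X version strings such as '20.10.24'."""
    return re.fullmatch(r"20\.10\.[0-9]+([-+~].*)?", version.strip()) is not None


def docker_host_ok(environ=None):
    """True if DOCKER_HOST points at the socket that Studio exposes."""
    environ = os.environ if environ is None else environ
    return environ.get("DOCKER_HOST") == EXPECTED_SOCKET
```

If docker_host_ok() returns False for a user other than the default one, setting DOCKER_HOST to the socket path shown above is the likely fix.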