@Stability(Experimental) package software.amazon.awscdk.services.sagemaker.alpha

Amazon SageMaker Construct Library

---

cdk-constructs: Experimental

The APIs of higher level constructs in this module are experimental and under active development. They are subject to non-backward compatible changes or removal in any future version. These are not subject to the Semantic Versioning model and breaking changes will be announced in the release notes. This means that while you may use them, you may need to update your source code when upgrading to a newer version of this package.

Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action. Your models get to production faster with much less effort and lower cost.

Model

To create a machine learning model with Amazon SageMaker, use the Model construct. This construct includes properties that can be configured to define model components, including the model inference code as a Docker image and an optional set of separate model data artifacts. See the AWS documentation to learn more about SageMaker models.

Single Container Model

If a single container is sufficient for your inference use case, you can define a single-container model:

 import java.nio.file.Paths;
 import java.util.List;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 
 // Build the local asset paths from their components.
 ContainerImage image = ContainerImage.fromAsset(Paths.get("path", "to", "Dockerfile", "directory").toString());
 ModelData modelData = ModelData.fromAsset(Paths.get("path", "to", "artifact", "file.tar.gz").toString());
 
 Model model = Model.Builder.create(this, "PrimaryContainerModel")
         .containers(List.of(ContainerDefinition.builder()
                 .image(image)
                 .modelData(modelData)
                 .build()))
         .build();

Inference Pipeline Model

An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of multiple containers that process requests for inferences on data. See the AWS documentation to learn more about SageMaker inference pipelines. To define an inference pipeline, you can provide additional containers for your model:

 import java.util.List;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 // The images and model data below are assumed to be defined elsewhere.
 ContainerImage image1;
 ModelData modelData1;
 ContainerImage image2;
 ModelData modelData2;
 ContainerImage image3;
 ModelData modelData3;
 
 
 // Containers are listed in the order in which they process a request.
 Model model = Model.Builder.create(this, "InferencePipelineModel")
         .containers(List.of(
                 ContainerDefinition.builder().image(image1).modelData(modelData1).build(),
                 ContainerDefinition.builder().image(image2).modelData(modelData2).build(),
                 ContainerDefinition.builder().image(image3).modelData(modelData3).build()))
         .build();
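
Conceptually, the containers of an inference pipeline behave like a chain of functions in which each container receives the output of the one before it. The following plain-Java sketch illustrates only that ordering. It does not call SageMaker, and the class name PipelineSketch and the three stand-in stages are hypothetical.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical illustration: a linear sequence of containers modeled as chained functions. */
final class PipelineSketch {

    /** Passes the request through every stage in list order; each output feeds the next stage. */
    static String invoke(List<UnaryOperator<String>> stages, String request) {
        String payload = request;
        for (UnaryOperator<String> stage : stages) {
            payload = stage.apply(payload);
        }
        return payload;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> stages = List.of(
                s -> s.trim(),                        // stands in for a preprocessing container
                s -> s.toUpperCase(),                 // stands in for the inference container
                s -> "{\"result\":\"" + s + "\"}");   // stands in for a postprocessing container
        System.out.println(invoke(stages, "  hello ")); // prints {"result":"HELLO"}
    }
}
```

Reordering the list changes the result, which is why the order of the containers passed to the Model matters.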

Model Properties

Network Isolation

If you enable network isolation, the containers can't make any outbound network calls, even to other AWS services such as Amazon S3. Additionally, no AWS credentials are made available to the container runtime environment.

To enable network isolation, set the networkIsolation property to true:

 import java.util.List;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 ContainerImage image;
 ModelData modelData;
 
 
 Model model = Model.Builder.create(this, "ContainerModel")
         .containers(List.of(ContainerDefinition.builder()
                 .image(image)
                 .modelData(modelData)
                 .build()))
         .networkIsolation(true)
         .build();

Container Images

Inference code can be stored in Amazon Elastic Container Registry (Amazon ECR). The image is specified via the image property of ContainerDefinition, which accepts a class that extends the ContainerImage abstract base class.

Asset Image

Reference a local directory containing a Dockerfile:

 import java.nio.file.Paths;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 
 ContainerImage image = ContainerImage.fromAsset(Paths.get("path", "to", "Dockerfile", "directory").toString());

ECR Image

Reference an image available within ECR:

 import software.amazon.awscdk.services.ecr.*;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 
 IRepository repository = Repository.fromRepositoryName(this, "Repository", "repo");
 ContainerImage image = ContainerImage.fromEcrRepository(repository, "tag");

DLC Image

Reference a deep learning container image:

 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 
 String repositoryName = "huggingface-pytorch-training";
 String tag = "1.13.1-transformers4.26.0-gpu-py39-cu117-ubuntu20.04";
 
 ContainerImage image = ContainerImage.fromDlc(repositoryName, tag);

Model Artifacts

If you choose to decouple your model artifacts from your inference code (as is natural given different rates of change between inference code and model artifacts), the artifacts can be specified via the modelData property which accepts a class that extends the ModelData abstract base class. The default is to have no model artifacts associated with a model.

Asset Model Data

Reference local model data:

 import java.nio.file.Paths;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 
 ModelData modelData = ModelData.fromAsset(Paths.get("path", "to", "artifact", "file.tar.gz").toString());

S3 Model Data

Reference an S3 bucket and object key as the artifacts for a model:

 import software.amazon.awscdk.services.s3.*;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 
 Bucket bucket = new Bucket(this, "MyBucket");
 ModelData modelData = ModelData.fromBucket(bucket, "path/to/artifact/file.tar.gz");

Model Hosting

Amazon SageMaker provides model hosting services for model deployment, including an HTTPS endpoint where your machine learning model is available to provide inferences.

Endpoint Configuration

The EndpointConfig construct defines an endpoint configuration that can be used to provision one or more endpoints. In this configuration, you identify one or more models to deploy and the resources that you want Amazon SageMaker to provision. You define one or more production variants, each of which identifies a model and describes the resources to provision for it. If you are hosting multiple models, you also assign each variant a weight that specifies how much traffic to allocate to its model. For example, suppose that you want to host two models, A and B, and you assign traffic weight 2 to model A and 1 to model B. Amazon SageMaker then distributes two-thirds of the traffic to model A and one-third to model B:

 import java.util.List;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 Model modelA;
 Model modelB;
 
 
 EndpointConfig endpointConfig = EndpointConfig.Builder.create(this, "EndpointConfig")
         .instanceProductionVariants(List.of(
                 // Weight 2 of 3: receives two-thirds of the traffic.
                 InstanceProductionVariantProps.builder()
                         .model(modelA)
                         .variantName("variantA")
                         .initialVariantWeight(2)
                         .build(),
                 // Weight 1 of 3: receives one-third of the traffic.
                 InstanceProductionVariantProps.builder()
                         .model(modelB)
                         .variantName("variantB")
                         .initialVariantWeight(1)
                         .build()))
         .build();
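
The traffic split described above is each variant's weight divided by the sum of all weights. The following plain-Java sketch works through that arithmetic for the two-model example. It does not use the CDK, and the class name VariantTrafficShare and the map keys are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: converts variant weights into the fraction of traffic each receives. */
final class VariantTrafficShare {

    /** Returns weight / (sum of all weights) for every variant, preserving iteration order. */
    static Map<String, Double> shares(Map<String, Double> weights) {
        double total = weights.values().stream().mapToDouble(Double::doubleValue).sum();
        if (total <= 0) {
            throw new IllegalArgumentException("The sum of the variant weights must be positive");
        }
        Map<String, Double> result = new LinkedHashMap<>();
        weights.forEach((name, weight) -> result.put(name, weight / total));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("A", 2.0); // model A, weight 2
        weights.put("B", 1.0); // model B, weight 1
        // A receives 2/3 of the traffic and B receives 1/3.
        shares(weights).forEach((name, share) -> System.out.println(name + " -> " + share));
    }
}
```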

Endpoint

When you create an endpoint from an EndpointConfig, Amazon SageMaker launches the ML compute instances and deploys the model or models as specified in the configuration. To get inferences from the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. For more information, see the InvokeEndpoint API reference. Defining an endpoint requires, at minimum, the associated endpoint configuration:

 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 EndpointConfig endpointConfig;
 
 
 Endpoint endpoint = Endpoint.Builder.create(this, "Endpoint").endpointConfig(endpointConfig).build();
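
For orientation, the InvokeEndpoint operation is an HTTPS POST to a path of the form /endpoints/<name>/invocations on the SageMaker Runtime host for the region. The sketch below only assembles that URL in plain Java, assuming the standard commercial-region host name pattern (other partitions use a different domain suffix). The class name, region, and endpoint name are hypothetical, and a real request must also be signed with AWS credentials, which the AWS SDK normally handles.

```java
import java.net.URI;

/** Hypothetical helper: builds the SageMaker Runtime URL used by the InvokeEndpoint operation. */
final class InvokeEndpointUri {

    /** Assumes the standard commercial-region host name: runtime.sagemaker.<region>.amazonaws.com. */
    static URI forEndpoint(String region, String endpointName) {
        return URI.create("https://runtime.sagemaker." + region
                + ".amazonaws.com/endpoints/" + endpointName + "/invocations");
    }

    public static void main(String[] args) {
        // Region and endpoint name are placeholders for illustration only.
        System.out.println(forEndpoint("us-east-1", "my-endpoint"));
    }
}
```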

AutoScaling

To enable autoscaling on the production variant, use the autoScaleInstanceCount method:

 import java.util.List;
 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 Model model;
 
 
 String variantName = "my-variant";
 EndpointConfig endpointConfig = EndpointConfig.Builder.create(this, "EndpointConfig")
         .instanceProductionVariants(List.of(InstanceProductionVariantProps.builder()
                 .model(model)
                 .variantName(variantName)
                 .build()))
         .build();
 
 Endpoint endpoint = Endpoint.Builder.create(this, "Endpoint").endpointConfig(endpointConfig).build();
 IEndpointInstanceProductionVariant productionVariant = endpoint.findInstanceProductionVariant(variantName);
 ScalableInstanceCount instanceCount = productionVariant.autoScaleInstanceCount(EnableScalingProps.builder()
         .maxCapacity(3)
         .build());
 instanceCount.scaleOnInvocations("LimitRPS", InvocationsScalingProps.builder()
         .maxRequestsPerSecond(30)
         .build());

For load testing guidance on determining the maximum requests per second per instance, see the Amazon SageMaker documentation on load testing your auto scaling configuration.
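
The AWS load testing guidance derives the target value for the per-instance invocations metric as the maximum requests per second, multiplied by a safety factor, multiplied by 60 seconds. The following plain-Java sketch works through that arithmetic for the 30 requests per second used above. The safety factor of 0.5 is an assumption made for illustration; check the InvocationsScalingProps documentation for the value the construct actually applies. The class and method names are hypothetical.

```java
/** Hypothetical helper: derives an invocations-per-instance-per-minute scaling target. */
final class InvocationsTarget {

    /**
     * Target = maxRequestsPerSecond * safetyFactor * 60.
     * The safety factor leaves headroom below the load-tested maximum.
     */
    static double perInstancePerMinute(double maxRequestsPerSecond, double safetyFactor) {
        if (maxRequestsPerSecond <= 0.0) {
            throw new IllegalArgumentException("maxRequestsPerSecond must be positive");
        }
        if (safetyFactor <= 0.0 || safetyFactor > 1.0) {
            throw new IllegalArgumentException("safetyFactor must be in the range (0.0, 1.0]");
        }
        return maxRequestsPerSecond * safetyFactor * 60.0;
    }

    public static void main(String[] args) {
        // 30 requests/second with an assumed safety factor of 0.5.
        System.out.println(perInstancePerMinute(30, 0.5)); // prints 900.0
    }
}
```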

Metrics

To monitor CloudWatch metrics for a production variant, use one or more of the metric convenience methods:

 import software.amazon.awscdk.services.sagemaker.alpha.*;
 
 EndpointConfig endpointConfig;
 
 
 Endpoint endpoint = Endpoint.Builder.create(this, "Endpoint").endpointConfig(endpointConfig).build();
 IEndpointInstanceProductionVariant productionVariant = endpoint.findInstanceProductionVariant("my-variant");
 productionVariant.metricModelLatency().createAlarm(this, "ModelLatencyAlarm", CreateAlarmOptions.builder()
         .threshold(100000)
         .evaluationPeriods(3)
         .build());
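
CloudWatch reports the SageMaker ModelLatency metric in microseconds, so the threshold of 100000 above corresponds to 100 milliseconds. The following plain-Java sketch shows one way to derive such a threshold from a readable duration instead of hard-coding the number. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: expresses a latency budget in the microseconds used by ModelLatency. */
final class LatencyThreshold {

    /** Converts a duration to whole microseconds. */
    static long toMicros(Duration latency) {
        return TimeUnit.NANOSECONDS.toMicros(latency.toNanos());
    }

    public static void main(String[] args) {
        // A 100 ms latency budget expressed as an alarm threshold.
        System.out.println(toMicros(Duration.ofMillis(100))); // prints 100000
    }
}
```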

Related Packages

Package                                          Description
software.amazon.awscdk.services.sagemaker        Amazon SageMaker Construct Library

Class                                            Description
$Module
AcceleratorType                                  (experimental) Supported Elastic Inference (EI) instance types for SageMaker instance-based production variants.
ContainerDefinition                              (experimental) Describes the container, as part of model definition.
ContainerDefinition.Builder                      A builder for ContainerDefinition
ContainerDefinition.Jsii$Proxy                   An implementation for ContainerDefinition
ContainerImage                                   (experimental) Constructs for types of container images.
ContainerImageConfig                             (experimental) The configuration for creating a container image.
ContainerImageConfig.Builder                     A builder for ContainerImageConfig
ContainerImageConfig.Jsii$Proxy                  An implementation for ContainerImageConfig
Endpoint                                         (experimental) Defines a SageMaker endpoint.
Endpoint.Builder                                 (experimental) A fluent builder for Endpoint.
EndpointAttributes                               (experimental) Represents an Endpoint resource defined outside this stack.
EndpointAttributes.Builder                       A builder for EndpointAttributes
EndpointAttributes.Jsii$Proxy                    An implementation for EndpointAttributes
EndpointConfig                                   (experimental) Defines a SageMaker EndpointConfig.
EndpointConfig.Builder                           (experimental) A fluent builder for EndpointConfig.
EndpointConfigProps                              (experimental) Construction properties for a SageMaker EndpointConfig.
EndpointConfigProps.Builder                      A builder for EndpointConfigProps
EndpointConfigProps.Jsii$Proxy                   An implementation for EndpointConfigProps
EndpointProps                                    (experimental) Construction properties for a SageMaker Endpoint.
EndpointProps.Builder                            A builder for EndpointProps
EndpointProps.Jsii$Proxy                         An implementation for EndpointProps
IEndpoint                                        (experimental) The Interface for a SageMaker Endpoint resource.
IEndpoint.Jsii$Default                           Internal default implementation for IEndpoint.
IEndpoint.Jsii$Proxy                             A proxy class which represents a concrete javascript instance of this type.
IEndpointConfig                                  (experimental) The interface for a SageMaker EndpointConfig resource.
IEndpointConfig.Jsii$Default                     Internal default implementation for IEndpointConfig.
IEndpointConfig.Jsii$Proxy                       A proxy class which represents a concrete javascript instance of this type.
IEndpointInstanceProductionVariant               (experimental) Represents an instance production variant that has been associated with an endpoint.
IEndpointInstanceProductionVariant.Jsii$Default  Internal default implementation for IEndpointInstanceProductionVariant.
IEndpointInstanceProductionVariant.Jsii$Proxy    A proxy class which represents a concrete javascript instance of this type.
IModel                                           (experimental) Interface that defines a Model resource.
IModel.Jsii$Default                              Internal default implementation for IModel.
IModel.Jsii$Proxy                                A proxy class which represents a concrete javascript instance of this type.
InstanceProductionVariantProps                   (experimental) Construction properties for an instance production variant.
InstanceProductionVariantProps.Builder           A builder for InstanceProductionVariantProps
InstanceProductionVariantProps.Jsii$Proxy        An implementation for InstanceProductionVariantProps
InstanceType                                     (experimental) Supported instance types for SageMaker instance-based production variants.
InvocationHttpResponseCode                       (experimental) HTTP response codes for Endpoint invocations.
InvocationsScalingProps                          (experimental) Properties for enabling SageMaker Endpoint utilization tracking.
InvocationsScalingProps.Builder                  A builder for InvocationsScalingProps
InvocationsScalingProps.Jsii$Proxy               An implementation for InvocationsScalingProps
Model                                            (experimental) Defines a SageMaker Model.
Model.Builder                                    (experimental) A fluent builder for Model.
ModelAttributes                                  (experimental) Represents a Model resource defined outside this stack.
ModelAttributes.Builder                          A builder for ModelAttributes
ModelAttributes.Jsii$Proxy                       An implementation for ModelAttributes
ModelData                                        (experimental) Model data represents the source of model artifacts, which will ultimately be loaded from an S3 location.
ModelDataConfig                                  (experimental) The configuration needed to reference model artifacts.
ModelDataConfig.Builder                          A builder for ModelDataConfig
ModelDataConfig.Jsii$Proxy                       An implementation for ModelDataConfig
ModelProps                                       (experimental) Construction properties for a SageMaker Model.
ModelProps.Builder                               A builder for ModelProps
ModelProps.Jsii$Proxy                            An implementation for ModelProps
ScalableInstanceCount                            (experimental) A scalable sagemaker endpoint attribute.
ScalableInstanceCount.Builder                    (experimental) A fluent builder for ScalableInstanceCount.
ScalableInstanceCountProps                       (experimental) The properties of a scalable attribute representing task count.
ScalableInstanceCountProps.Builder               A builder for ScalableInstanceCountProps
ScalableInstanceCountProps.Jsii$Proxy            An implementation for ScalableInstanceCountProps
