Namespace Amazon.CDK.AWS.Sagemaker.Alpha
Amazon SageMaker Construct Library
The APIs of higher-level constructs in this module are experimental and under active development. They are subject to non-backward compatible changes or removal in any future version. They are not subject to the Semantic Versioning model (https://semver.org/), and breaking changes will be announced in the release notes. This means that while you may use them, you may need to update your source code when upgrading to a newer version of this package.
Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action. Your models get to production faster with much less effort and lower cost.
Model
To create a machine learning model with Amazon SageMaker, use the Model construct. This construct includes properties that can be configured to define model components, including the model inference code as a Docker image and an optional set of separate model data artifacts. See the AWS documentation to learn more about SageMaker models.
Single Container Model
If a single container is sufficient for your inference use case, you can define a single-container model:
using Amazon.CDK.AWS.Sagemaker.Alpha;
using static System.IO.Path;

var image = ContainerImage.FromAsset(Join("path", "to", "Dockerfile", "directory"));
var modelData = ModelData.FromAsset(Join("path", "to", "artifact", "file.tar.gz"));

var model = new Model(this, "PrimaryContainerModel", new ModelProps {
    Containers = new [] { new ContainerDefinition {
        Image = image,
        ModelData = modelData
    } }
});
Inference Pipeline Model
An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of multiple containers that process requests for inferences on data. See the AWS documentation to learn more about SageMaker inference pipelines. To define an inference pipeline, you can provide additional containers for your model:
using Amazon.CDK.AWS.Sagemaker.Alpha;

ContainerImage image1;
ModelData modelData1;
ContainerImage image2;
ModelData modelData2;
ContainerImage image3;
ModelData modelData3;

var model = new Model(this, "InferencePipelineModel", new ModelProps {
    Containers = new [] {
        new ContainerDefinition { Image = image1, ModelData = modelData1 },
        new ContainerDefinition { Image = image2, ModelData = modelData2 },
        new ContainerDefinition { Image = image3, ModelData = modelData3 }
    }
});
Model Properties
Network Isolation
If you enable network isolation, the containers can't make any outbound network calls, even to other AWS services such as Amazon S3. Additionally, no AWS credentials are made available to the container runtime environment.
To enable network isolation, set the NetworkIsolation property to true:
using Amazon.CDK.AWS.Sagemaker.Alpha;

ContainerImage image;
ModelData modelData;

var model = new Model(this, "ContainerModel", new ModelProps {
    Containers = new [] { new ContainerDefinition {
        Image = image,
        ModelData = modelData
    } },
    NetworkIsolation = true
});
Container Images
Inference code can be stored in Amazon Elastic Container Registry (Amazon ECR) and is specified via the ContainerDefinition's Image property, which accepts a class that extends the ContainerImage abstract base class.
Asset Image
Reference a local directory containing a Dockerfile:
using Amazon.CDK.AWS.Sagemaker.Alpha;
using static System.IO.Path;

var image = ContainerImage.FromAsset(Join("path", "to", "Dockerfile", "directory"));
ECR Image
Reference an image available within ECR:
using Amazon.CDK.AWS.ECR;
using Amazon.CDK.AWS.Sagemaker.Alpha;
var repository = Repository.FromRepositoryName(this, "Repository", "repo");
var image = ContainerImage.FromEcrRepository(repository, "tag");
DLC Image
Reference an AWS Deep Learning Containers (DLC) image:
using Amazon.CDK.AWS.Sagemaker.Alpha;
var repositoryName = "huggingface-pytorch-training";
var tag = "1.13.1-transformers4.26.0-gpu-py39-cu117-ubuntu20.04";
var image = ContainerImage.FromDlc(repositoryName, tag);
Model Artifacts
If you choose to decouple your model artifacts from your inference code (as is natural given the different rates of change between inference code and model artifacts), the artifacts can be specified via the ModelData property, which accepts a class that extends the ModelData abstract base class. The default is to have no model artifacts associated with a model.
Asset Model Data
Reference local model data:
using Amazon.CDK.AWS.Sagemaker.Alpha;
using static System.IO.Path;

var modelData = ModelData.FromAsset(Join("path", "to", "artifact", "file.tar.gz"));
S3 Model Data
Reference an S3 bucket and object key as the artifacts for a model:
using Amazon.CDK.AWS.S3;
using Amazon.CDK.AWS.Sagemaker.Alpha;
var bucket = new Bucket(this, "MyBucket");
var modelData = ModelData.FromBucket(bucket, "path/to/artifact/file.tar.gz");
Model Hosting
Amazon SageMaker provides model hosting services for model deployment, exposing an HTTPS endpoint where your machine learning model is available to provide inferences.
Endpoint Configuration
By using the EndpointConfig construct, you can define an endpoint configuration that can be used to provision one or more endpoints. In this configuration, you identify one or more models to deploy and the resources that you want Amazon SageMaker to provision. You define one or more production variants, each of which identifies a model and describes the resources that you want Amazon SageMaker to provision for it. If you are hosting multiple models, you also assign a variant weight to specify how much traffic to allocate to each model; each variant receives a share of traffic proportional to its weight divided by the sum of all variant weights. For example, suppose that you want to host two models, A and B, and you assign traffic weight 2 for model A and 1 for model B. Amazon SageMaker then distributes two-thirds of the traffic to model A and one-third to model B:
using Amazon.CDK.AWS.Sagemaker.Alpha;

Model modelA;
Model modelB;

var endpointConfig = new EndpointConfig(this, "EndpointConfig", new EndpointConfigProps {
    InstanceProductionVariants = new [] { new InstanceProductionVariantProps {
        Model = modelA,
        VariantName = "variantA",
        InitialVariantWeight = 2
    }, new InstanceProductionVariantProps {
        Model = modelB,
        VariantName = "variantB",
        InitialVariantWeight = 1
    } }
});
Endpoint
When you create an endpoint from an EndpointConfig, Amazon SageMaker launches the ML compute instances and deploys the model or models as specified in the configuration. To get inferences from the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. For more information about the API, see the InvokeEndpoint API. Defining an endpoint requires at minimum the associated endpoint configuration:
using Amazon.CDK.AWS.Sagemaker.Alpha;
EndpointConfig endpointConfig;
var endpoint = new Endpoint(this, "Endpoint", new EndpointProps { EndpointConfig = endpointConfig });
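Once the endpoint is deployed, client applications call it through the SageMaker Runtime API rather than through this construct library. As a minimal sketch, using the AWS SDK for .NET's AmazonSageMakerRuntimeClient (from the AWSSDK.SageMakerRuntime package) inside an async method; the endpoint name and CSV payload below are placeholders for your deployed endpoint and the input format your containers expect:
using System.IO;
using System.Text;
using Amazon.SageMakerRuntime;
using Amazon.SageMakerRuntime.Model;

// Placeholder endpoint name and payload; substitute your deployed
// endpoint's name and the content type your inference code accepts.
var client = new AmazonSageMakerRuntimeClient();
var response = await client.InvokeEndpointAsync(new InvokeEndpointRequest {
    EndpointName = "my-endpoint",
    ContentType = "text/csv",
    Body = new MemoryStream(Encoding.UTF8.GetBytes("1.0,2.0,3.0"))
});
using var reader = new StreamReader(response.Body);
var prediction = reader.ReadToEnd();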
AutoScaling
To enable autoscaling on the production variant, use the AutoScaleInstanceCount method:
using Amazon.CDK.AWS.Sagemaker.Alpha;

Model model;

var variantName = "my-variant";
var endpointConfig = new EndpointConfig(this, "EndpointConfig", new EndpointConfigProps {
    InstanceProductionVariants = new [] { new InstanceProductionVariantProps {
        Model = model,
        VariantName = variantName
    } }
});

var endpoint = new Endpoint(this, "Endpoint", new EndpointProps { EndpointConfig = endpointConfig });
var productionVariant = endpoint.FindInstanceProductionVariant(variantName);
var instanceCount = productionVariant.AutoScaleInstanceCount(new EnableScalingProps {
    MaxCapacity = 3
});
instanceCount.ScaleOnInvocations("LimitRPS", new InvocationsScalingProps {
    MaxRequestsPerSecond = 30
});
For load testing guidance on determining the maximum requests per second per instance, please see the SageMaker load testing documentation.
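As a rough, hypothetical illustration of how MaxRequestsPerSecond relates to the target-tracking value behind ScaleOnInvocations (which tracks invocations per instance per minute), assuming a safety factor of 0.5 is applied; that factor is an assumption for illustration, not a documented constant of this library:
// Back-of-the-envelope sketch only; the 0.5 safety factor is an
// assumption for illustration, not a value taken from this library.
var maxRequestsPerSecond = 30;
var safetyFactor = 0.5;
// The tracked metric counts invocations per instance per minute, hence * 60.
var targetInvocationsPerInstancePerMinute = safetyFactor * maxRequestsPerSecond * 60; // = 900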
Metrics
To monitor CloudWatch metrics for a production variant, use one or more of the metric convenience methods:
using Amazon.CDK.AWS.CloudWatch;
using Amazon.CDK.AWS.Sagemaker.Alpha;

EndpointConfig endpointConfig;

var endpoint = new Endpoint(this, "Endpoint", new EndpointProps { EndpointConfig = endpointConfig });
var productionVariant = endpoint.FindInstanceProductionVariant("my-variant");
productionVariant.MetricModelLatency().CreateAlarm(this, "ModelLatencyAlarm", new CreateAlarmOptions {
    Threshold = 100000,
    EvaluationPeriods = 3
});
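The same convenience metrics can be charted as well as alarmed on. A minimal sketch placing the variant's traffic and latency on a CloudWatch dashboard, assuming MetricInvocations() is available alongside MetricModelLatency():
using Amazon.CDK.AWS.CloudWatch;
using Amazon.CDK.AWS.Sagemaker.Alpha;

IEndpointInstanceProductionVariant productionVariant;

// MetricInvocations() is assumed here to be one of the convenience methods.
var dashboard = new Dashboard(this, "EndpointDashboard");
dashboard.AddWidgets(new GraphWidget(new GraphWidgetProps {
    Title = "my-variant traffic and latency",
    Left = new [] { productionVariant.MetricInvocations() },
    Right = new [] { productionVariant.MetricModelLatency() }
}));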
Classes
AcceleratorType | (experimental) Supported Elastic Inference (EI) instance types for SageMaker instance-based production variants.
ContainerDefinition | (experimental) Describes the container, as part of model definition.
ContainerImage | (experimental) Constructs for types of container images.
ContainerImageConfig | (experimental) The configuration for creating a container image.
Endpoint | (experimental) Defines a SageMaker endpoint.
EndpointAttributes | (experimental) Represents an Endpoint resource defined outside this stack.
EndpointConfig | (experimental) Defines a SageMaker EndpointConfig.
EndpointConfigProps | (experimental) Construction properties for a SageMaker EndpointConfig.
EndpointProps | (experimental) Construction properties for a SageMaker Endpoint.
InstanceProductionVariantProps | (experimental) Construction properties for an instance production variant.
InstanceType | (experimental) Supported instance types for SageMaker instance-based production variants.
InvocationHttpResponseCode | (experimental) HTTP response codes for Endpoint invocations.
InvocationsScalingProps | (experimental) Properties for enabling SageMaker Endpoint utilization tracking.
Model | (experimental) Defines a SageMaker Model.
ModelAttributes | (experimental) Represents a Model resource defined outside this stack.
ModelData | (experimental) Model data represents the source of model artifacts, which will ultimately be loaded from an S3 location.
ModelDataConfig | (experimental) The configuration needed to reference model artifacts.
ModelProps | (experimental) Construction properties for a SageMaker Model.
ScalableInstanceCount | (experimental) A scalable SageMaker endpoint attribute.
ScalableInstanceCountProps | (experimental) The properties of a scalable attribute representing task count.
Interfaces
IContainerDefinition | (experimental) Describes the container, as part of model definition.
IContainerImageConfig | (experimental) The configuration for creating a container image.
IEndpoint | (experimental) The interface for a SageMaker Endpoint resource.
IEndpointAttributes | (experimental) Represents an Endpoint resource defined outside this stack.
IEndpointConfig | (experimental) The interface for a SageMaker EndpointConfig resource.
IEndpointConfigProps | (experimental) Construction properties for a SageMaker EndpointConfig.
IEndpointInstanceProductionVariant | (experimental) Represents an instance production variant that has been associated with an endpoint.
IEndpointProps | (experimental) Construction properties for a SageMaker Endpoint.
IInstanceProductionVariantProps | (experimental) Construction properties for an instance production variant.
IInvocationsScalingProps | (experimental) Properties for enabling SageMaker Endpoint utilization tracking.
IModel | (experimental) Interface that defines a Model resource.
IModelAttributes | (experimental) Represents a Model resource defined outside this stack.
IModelDataConfig | (experimental) The configuration needed to reference model artifacts.
IModelProps | (experimental) Construction properties for a SageMaker Model.
IScalableInstanceCountProps | (experimental) The properties of a scalable attribute representing task count.