Table Of Contents


User Guide

First time using the AWS CLI? See the User Guide for help getting started.

[ aws . sagemaker ]



Creates an endpoint configuration that Amazon SageMaker hosting services uses to deploy models. In the configuration, you identify one or more models, created using the CreateModel API, to deploy and the resources that you want Amazon SageMaker to provision. Then you call the CreateEndpoint API.


Use this API only if you want to use Amazon SageMaker hosting services to deploy models into production.

In the request, you define one or more ProductionVariant s, each of which identifies a model. Each ProductionVariant parameter also describes the resources that you want Amazon SageMaker to provision. This includes the number and type of ML compute instances to deploy.

If you are hosting multiple models, you also assign a VariantWeight to specify how much traffic you want to allocate to each model. For example, suppose that you want to host two models, A and B, and you assign traffic weight 2 for model A and 1 for model B. Amazon SageMaker distributes two-thirds of the traffic to Model A, and one-third to model B.

See also: AWS API Documentation

See 'aws help' for descriptions of global parameters.


--endpoint-config-name <value>
--production-variants <value>
[--tags <value>]
[--kms-key-id <value>]
[--cli-input-json <value>]
[--generate-cli-skeleton <value>]


--endpoint-config-name (string)

The name of the endpoint configuration. You specify this name in a CreateEndpoint request.

--production-variants (list)

An list of ProductionVariant objects, one for each model that you want to host at this endpoint.

Shorthand Syntax:

VariantName=string,ModelName=string,InitialInstanceCount=integer,InstanceType=string,InitialVariantWeight=float,AcceleratorType=string ...

JSON Syntax:

    "VariantName": "string",
    "ModelName": "string",
    "InitialInstanceCount": integer,
    "InstanceType": "ml.t2.medium"|"ml.t2.large"|"ml.t2.xlarge"|"ml.t2.2xlarge"|"ml.m4.xlarge"|"ml.m4.2xlarge"|"ml.m4.4xlarge"|"ml.m4.10xlarge"|"ml.m4.16xlarge"|"ml.m5.large"|"ml.m5.xlarge"|"ml.m5.2xlarge"|"ml.m5.4xlarge"|"ml.m5.12xlarge"|"ml.m5.24xlarge"|"ml.c4.large"|"ml.c4.xlarge"|"ml.c4.2xlarge"|"ml.c4.4xlarge"|"ml.c4.8xlarge"|"ml.p2.xlarge"|"ml.p2.8xlarge"|"ml.p2.16xlarge"|"ml.p3.2xlarge"|"ml.p3.8xlarge"|"ml.p3.16xlarge"|"ml.c5.large"|"ml.c5.xlarge"|"ml.c5.2xlarge"|"ml.c5.4xlarge"|"ml.c5.9xlarge"|"ml.c5.18xlarge",
    "InitialVariantWeight": float,
    "AcceleratorType": "ml.eia1.medium"|"ml.eia1.large"|"ml.eia1.xlarge"

--tags (list)

A list of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide .

Shorthand Syntax:

Key=string,Value=string ...

JSON Syntax:

    "Key": "string",
    "Value": "string"

--kms-key-id (string)

The Amazon Resource Name (ARN) of a AWS Key Management Service key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance that hosts the endpoint.

--cli-input-json (string) Performs service operation based on the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, the CLI values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally.

--generate-cli-skeleton (string) Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command.

See 'aws help' for descriptions of global parameters.


EndpointConfigArn -> (string)

The Amazon Resource Name (ARN) of the endpoint configuration.