ProductionVariantSummary - Amazon SageMaker

ProductionVariantSummary

Describes the weight and capacities of a production variant associated with an endpoint. If you sent a request to the UpdateEndpointWeightsAndCapacities API and the endpoint status is Updating, the desired values differ from the current values until the update completes.
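As a rough illustration of the desired/current distinction, the sketch below compares the paired fields on a dictionary shaped like a ProductionVariantSummary. The helper name pending_changes and the sample values are hypothetical and not part of the API.

```python
from typing import Any, Dict, Tuple

# (current, desired) field pairs documented on ProductionVariantSummary.
_PAIRS = (
    ("CurrentWeight", "DesiredWeight"),
    ("CurrentInstanceCount", "DesiredInstanceCount"),
    ("CurrentServerlessConfig", "DesiredServerlessConfig"),
)


def pending_changes(summary: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {current_field: (current, desired)} for each pair that differs.

    Hypothetical helper. Every field except VariantName is optional, so a
    pair is skipped unless both of its fields are present.
    """
    changes: Dict[str, Tuple[Any, Any]] = {}
    for current_key, desired_key in _PAIRS:
        if current_key in summary and desired_key in summary:
            current, desired = summary[current_key], summary[desired_key]
            if current != desired:
                changes[current_key] = (current, desired)
    return changes


# Made-up sample: a weight update is in flight, the instance count is stable.
sample = {
    "VariantName": "variant-a",
    "CurrentWeight": 1.0,
    "DesiredWeight": 3.0,
    "CurrentInstanceCount": 2,
    "DesiredInstanceCount": 2,
}
print(pending_changes(sample))
```

An empty result suggests that no weight or capacity change is pending for the variant, at least for the fields that were present.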

Contents

VariantName

The name of the variant.

Type: String

Length Constraints: Maximum length of 63.

Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}

Required: Yes
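The constraints above can be checked client-side before calling the API. This is a minimal sketch, assuming the documented pattern is meant to match the whole name (the pattern as written has no trailing anchor, so the use of fullmatch is an assumption). The function name is_valid_variant_name is hypothetical.

```python
import re

# Pattern and maximum length as documented for VariantName.
_VARIANT_NAME = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")
_MAX_LENGTH = 63


def is_valid_variant_name(name: str) -> bool:
    """Check a candidate name against the documented length and pattern.

    Assumption: the whole string must match, hence fullmatch. The length
    check is separate because runs of hyphens let the pattern alone match
    strings longer than 63 characters.
    """
    return len(name) <= _MAX_LENGTH and _VARIANT_NAME.fullmatch(name) is not None


for candidate in ("variant-1", "-variant", "variant-", "a" * 64):
    print(repr(candidate), is_valid_variant_name(candidate))
```

Under this reading, names must start and end with an alphanumeric character and may contain hyphens only in between.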

CurrentInstanceCount

The number of instances associated with the variant.

Type: Integer

Valid Range: Minimum value of 0.

Required: No

CurrentServerlessConfig

The serverless configuration for the endpoint.

Type: ProductionVariantServerlessConfig object

Required: No

CurrentWeight

The weight associated with the variant.

Type: Float

Valid Range: Minimum value of 0.

Required: No
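SageMaker distributes traffic across an endpoint's variants in proportion to each variant's weight relative to the sum of all weights. The sketch below computes those shares from a list of summary-shaped dictionaries. The helper name traffic_shares is hypothetical, and treating a missing CurrentWeight as 0 is an assumption made for the example.

```python
from typing import Any, Dict, Iterable, Mapping


def traffic_shares(variants: Iterable[Mapping[str, Any]]) -> Dict[str, float]:
    """Map each VariantName to its fraction of traffic (weight / total weight).

    Hypothetical helper. A missing CurrentWeight is treated as 0.0, which is
    an assumption of this sketch rather than documented behavior.
    """
    weights = {v["VariantName"]: float(v.get("CurrentWeight", 0.0)) for v in variants}
    total = sum(weights.values())
    if total == 0:
        # Avoid division by zero when every weight is 0 or absent.
        return {name: 0.0 for name in weights}
    return {name: weight / total for name, weight in weights.items()}


# Made-up sample: two variants weighted 1 and 3.
shares = traffic_shares([
    {"VariantName": "variant-a", "CurrentWeight": 1.0},
    {"VariantName": "variant-b", "CurrentWeight": 3.0},
])
print(shares)
```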

DeployedImages

An array of DeployedImage objects that specify the Amazon Elastic Container Registry (Amazon ECR) paths of the inference images deployed on instances of this ProductionVariant.

Type: Array of DeployedImage objects

Required: No

DesiredInstanceCount

The number of instances requested in the UpdateEndpointWeightsAndCapacities request.

Type: Integer

Valid Range: Minimum value of 0.

Required: No

DesiredServerlessConfig

The serverless configuration requested for the endpoint update.

Type: ProductionVariantServerlessConfig object

Required: No

DesiredWeight

The requested weight, as specified in the UpdateEndpointWeightsAndCapacities request.

Type: Float

Valid Range: Minimum value of 0.

Required: No

ManagedInstanceScaling

Settings that control the range in the number of instances that the endpoint provisions as it scales up or down to accommodate traffic.

Type: ProductionVariantManagedInstanceScaling object

Required: No

RoutingConfig

Settings that control how the endpoint routes incoming traffic to the instances that the endpoint hosts.

Type: ProductionVariantRoutingConfig object

Required: No

VariantStatus

The status of the endpoint variant, which describes the current deployment stage or operational status.

Type: Array of ProductionVariantStatus objects

Array Members: Minimum number of 0 items. Maximum number of 5 items.

Required: No
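Because VariantStatus is an array, a variant can report more than one status entry at a time. The sketch below flattens those entries into readable lines. It assumes each ProductionVariantStatus entry carries a Status field and an optional StatusMessage field. The helper name status_lines and the sample values are hypothetical.

```python
from typing import Any, List, Mapping


def status_lines(summary: Mapping[str, Any]) -> List[str]:
    """Render each VariantStatus entry as 'Status' or 'Status: StatusMessage'.

    Hypothetical helper. VariantStatus is optional, so a summary without it
    yields an empty list.
    """
    lines = []
    for entry in summary.get("VariantStatus", []):
        status = entry.get("Status", "<unknown>")
        message = entry.get("StatusMessage")
        lines.append(f"{status}: {message}" if message else status)
    return lines


# Made-up sample values, for illustration only.
sample_summary = {
    "VariantName": "variant-a",
    "VariantStatus": [
        {"Status": "Updating", "StatusMessage": "Scaling to 4 instances"},
        {"Status": "Baking"},
    ],
}
print(status_lines(sample_summary))
```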
