InstanceProductionVariantProps

class aws_cdk.aws_sagemaker_alpha.InstanceProductionVariantProps(*, model, variant_name, accelerator_type=None, initial_instance_count=None, initial_variant_weight=None, instance_type=None)

Bases: object

(experimental) Construction properties for an instance production variant.

Parameters:

model (IModel) – (experimental) The model to host.
variant_name (str) – (experimental) Name of the production variant.
accelerator_type (Optional[AcceleratorType]) – (experimental) The size of the Elastic Inference (EI) instance to use for the production variant. EI instances provide on-demand GPU computing for inference. Default: none
initial_instance_count (Union[int, float, None]) – (experimental) Number of instances to launch initially. Default: 1
initial_variant_weight (Union[int, float, None]) – (experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants. Default: 1.0
instance_type (Optional[InstanceType]) – (experimental) Instance type of the production variant. Default: InstanceType.T2_MEDIUM

Stability:

experimental

Example:

# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_sagemaker_alpha as sagemaker_alpha

# accelerator_type: sagemaker_alpha.AcceleratorType
# instance_type: sagemaker_alpha.InstanceType
# model: sagemaker_alpha.Model

instance_production_variant_props = sagemaker_alpha.InstanceProductionVariantProps(
    model=model,
    variant_name="variantName",

    # the properties below are optional
    accelerator_type=accelerator_type,
    initial_instance_count=123,
    initial_variant_weight=123,
    instance_type=instance_type
)

Attributes

accelerator_type

(experimental) The size of the Elastic Inference (EI) instance to use for the production variant.

EI instances provide on-demand GPU computing for inference.

Default:

none

Stability:

experimental

initial_instance_count

(experimental) Number of instances to launch initially.

Default: 1
Stability: experimental

initial_variant_weight

(experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.

The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants.

Default: 1.0
Stability: experimental
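
The ratio described above can be illustrated with a small, standalone calculation. This is only a sketch of the arithmetic, not part of the CDK API: the helper traffic_shares and the variant names are hypothetical, and the weights are arbitrary example values.

```python
# Illustrative only: traffic_shares is a hypothetical helper, not a CDK function.
# It applies the rule stated above: a variant's share of traffic is its weight
# divided by the sum of the weights of all production variants.
def traffic_shares(weights):
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("the sum of variant weights must be positive")
    return {name: weight / total for name, weight in weights.items()}


# Two hypothetical variants with initial_variant_weight values of 3.0 and 1.0.
shares = traffic_shares({"variant-a": 3.0, "variant-b": 1.0})
print(shares)
```

With these example weights, variant-a would initially receive three quarters of the traffic and variant-b one quarter. If every variant keeps the default weight of 1.0, traffic is split evenly.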

instance_type

(experimental) Instance type of the production variant.

Default: InstanceType.T2_MEDIUM
Stability: experimental

model

(experimental) The model to host.

Stability: experimental

variant_name

(experimental) Name of the production variant.

Stability: experimental