InstanceProductionVariantProps

class aws_cdk.aws_sagemaker_alpha.InstanceProductionVariantProps(*, model, variant_name, accelerator_type=None, initial_instance_count=None, initial_variant_weight=None, instance_type=None)

Bases: object

(experimental) Construction properties for an instance production variant.

Parameters:
  • model (IModel) – (experimental) The model to host.

  • variant_name (str) – (experimental) Name of the production variant.

  • accelerator_type (Optional[AcceleratorType]) – (experimental) The size of the Elastic Inference (EI) instance to use for the production variant. EI instances provide on-demand GPU computing for inference. Default: none

  • initial_instance_count (Union[int, float, None]) – (experimental) Number of instances to launch initially. Default: 1

  • initial_variant_weight (Union[int, float, None]) – (experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants. Default: 1.0

  • instance_type (Optional[InstanceType]) – (experimental) Instance type of the production variant. Default: InstanceType.T2_MEDIUM

Stability:

experimental

Example:

# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_sagemaker_alpha as sagemaker_alpha

# accelerator_type: sagemaker_alpha.AcceleratorType
# instance_type: sagemaker_alpha.InstanceType
# model: sagemaker_alpha.Model

instance_production_variant_props = sagemaker_alpha.InstanceProductionVariantProps(
    model=model,
    variant_name="variantName",

    # the properties below are optional
    accelerator_type=accelerator_type,
    initial_instance_count=123,
    initial_variant_weight=123,
    instance_type=instance_type
)

Attributes

accelerator_type

(experimental) The size of the Elastic Inference (EI) instance to use for the production variant.

EI instances provide on-demand GPU computing for inference.

Default:

none

Stability:

experimental

initial_instance_count

(experimental) Number of instances to launch initially.

Default:

1

Stability:

experimental

initial_variant_weight

(experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.

The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants.

Default:

1.0

Stability:

experimental
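
The weight-to-traffic ratio described for initial_variant_weight can be illustrated with a small calculation. The sketch below is plain Python and does not call the CDK; the helper name traffic_shares and the variant names are hypothetical and chosen only for illustration. It assumes each variant's share is its weight divided by the sum of all weights, as stated above.

```python
def traffic_shares(weights):
    """Return each variant's expected fraction of traffic.

    `weights` maps a variant name to its initial_variant_weight.
    This is an illustrative helper, not part of aws_cdk.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("variant weights must not be negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one variant weight must be positive")
    # Share of a variant = its weight / sum of all variant weights.
    return {name: w / total for name, w in weights.items()}


# Hypothetical endpoint configuration with three production variants.
shares = traffic_shares({"variantA": 2.0, "variantB": 1.0, "variantC": 1.0})
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

With weights 2.0, 1.0, and 1.0, variantA would receive half of the traffic and the other two variants a quarter each. Leaving every variant at the default weight of 1.0 splits traffic evenly.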

instance_type

(experimental) Instance type of the production variant.

Default:

InstanceType.T2_MEDIUM

Stability:

experimental

model

(experimental) The model to host.

Stability:

experimental

variant_name

(experimental) Name of the production variant.

Stability:

experimental