ScalableInstanceCount

class aws_cdk.aws_sagemaker_alpha.ScalableInstanceCount(scope, id, *, dimension, resource_id, role, service_namespace, max_capacity, min_capacity=None)

Bases: BaseScalableAttribute

(experimental) A scalable SageMaker endpoint attribute.

Stability:

experimental

ExampleMetadata:

infused

Example:

import aws_cdk.aws_sagemaker_alpha as sagemaker

# model: sagemaker.Model


variant_name = "my-variant"
endpoint_config = sagemaker.EndpointConfig(self, "EndpointConfig",
    instance_production_variants=[
        sagemaker.InstanceProductionVariantProps(
            model=model,
            variant_name=variant_name
        )
    ]
)

endpoint = sagemaker.Endpoint(self, "Endpoint", endpoint_config=endpoint_config)
production_variant = endpoint.find_instance_production_variant(variant_name)
instance_count = production_variant.auto_scale_instance_count(
    max_capacity=3
)
instance_count.scale_on_invocations("LimitRPS",
    max_requests_per_second=30
)

(experimental) Constructs a new instance of the ScalableInstanceCount class.

Parameters:
  • scope (Construct) –

  • id (str) –

  • dimension (str) – Scalable dimension of the attribute.

  • resource_id (str) – Resource ID of the attribute.

  • role (IRole) – Role to use for scaling.

  • service_namespace (ServiceNamespace) – Service namespace of the scalable attribute.

  • max_capacity (Union[int, float]) – Maximum capacity to scale to.

  • min_capacity (Union[int, float, None]) – Minimum capacity to scale to. Default: 1

Stability:

experimental

Methods

scale_on_invocations(id, *, max_requests_per_second, safety_factor=None, disable_scale_in=None, policy_name=None, scale_in_cooldown=None, scale_out_cooldown=None)

(experimental) Scales in or out to achieve a target requests per second per instance.

Parameters:
  • id (str) –

  • max_requests_per_second (Union[int, float]) – (experimental) Max RPS per instance used for calculating the target SageMaker variant invocation per instance. More documentation available here: https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html

  • safety_factor (Union[int, float, None]) – (experimental) Safety factor for calculating the target SageMaker variant invocation per instance. More documentation available here: https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html Default: 0.5

  • disable_scale_in (Optional[bool]) – Indicates whether scale in by the target tracking policy is disabled. If the value is true, scale in is disabled and the target tracking policy won’t remove capacity from the scalable resource. Otherwise, scale in is enabled and the target tracking policy can remove capacity from the scalable resource. Default: false

  • policy_name (Optional[str]) – A name for the scaling policy. Default: - Automatically generated name.

  • scale_in_cooldown (Optional[Duration]) – Period after a scale-in activity completes before another scale-in activity can start. Default: Duration.seconds(300) for the following scalable targets: ECS services, Spot Fleet requests, EMR clusters, AppStream 2.0 fleets, Aurora DB clusters, Amazon SageMaker endpoint variants, and custom resources. Duration.seconds(0) for all other scalable targets: DynamoDB tables, DynamoDB global secondary indexes, Amazon Comprehend document classification endpoints, and Lambda provisioned concurrency.

  • scale_out_cooldown (Optional[Duration]) – Period after a scale-out activity completes before another scale-out activity can start. Default: Duration.seconds(300) for the following scalable targets: ECS services, Spot Fleet requests, EMR clusters, AppStream 2.0 fleets, Aurora DB clusters, Amazon SageMaker endpoint variants, and custom resources. Duration.seconds(0) for all other scalable targets: DynamoDB tables, DynamoDB global secondary indexes, Amazon Comprehend document classification endpoints, and Lambda provisioned concurrency.

Stability:

experimental

Return type:

None
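
To illustrate how max_requests_per_second and safety_factor combine, the load-testing guide linked above gives the per-instance target as max RPS × safety factor × 60, expressed in invocations per instance per minute. The helper below is a hypothetical sketch of that arithmetic only. The function name and the (0, 1] range check are assumptions for illustration and are not part of this construct's API.

```python
def invocations_per_instance_target(max_requests_per_second, safety_factor=0.5):
    """Estimate the per-minute invocation target for one instance.

    Hypothetical helper: mirrors the formula from the SageMaker
    load-testing guide, not the construct's actual implementation.
    """
    # Assumed validation: a safety factor outside (0, 1] makes no sense here.
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be greater than 0 and at most 1")
    # The scaling metric is measured per minute, so convert from per second.
    return max_requests_per_second * safety_factor * 60


# With the values from the example at the top of this page
# (max_requests_per_second=30 and the default safety_factor of 0.5):
target = invocations_per_instance_target(30)
print(target)
```

Under these assumptions, a lower safety_factor yields a lower target, so scale-out would begin at a smaller share of the load-tested maximum.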

to_string()

Returns a string representation of this construct.

Return type:

str

Attributes

node

The tree node.

Static Methods

classmethod is_construct(x)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid instanceof and use this type-testing method instead.

Parameters:

x (Any) – Any object.

Return type:

bool

Returns:

true if x is an object created from a class which extends Construct.