CfnCluster
- class aws_cdk.aws_sagemaker.CfnCluster(scope, id, *, cluster_name=None, instance_groups=None, node_provisioning_mode=None, node_recovery=None, orchestrator=None, restricted_instance_groups=None, tags=None, vpc_config=None)
Bases:
CfnResource
Creates a SageMaker HyperPod cluster.
SageMaker HyperPod is a capability of SageMaker for creating and managing persistent clusters for developing large machine learning models, such as large language models (LLMs) and diffusion models. To learn more, see Amazon SageMaker HyperPod in the Amazon SageMaker Developer Guide.
- See:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-sagemaker-cluster.html
- CloudformationResource:
AWS::SageMaker::Cluster
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import CfnTag
from aws_cdk import aws_sagemaker as sagemaker

cfn_cluster = sagemaker.CfnCluster(self, "MyCfnCluster",
    cluster_name="clusterName",
    instance_groups=[sagemaker.CfnCluster.ClusterInstanceGroupProperty(
        execution_role="executionRole",
        instance_count=123,
        instance_group_name="instanceGroupName",
        instance_type="instanceType",
        life_cycle_config=sagemaker.CfnCluster.ClusterLifeCycleConfigProperty(
            on_create="onCreate",
            source_s3_uri="sourceS3Uri"
        ),

        # the properties below are optional
        current_count=123,
        image_id="imageId",
        instance_storage_configs=[sagemaker.CfnCluster.ClusterInstanceStorageConfigProperty(
            ebs_volume_config=sagemaker.CfnCluster.ClusterEbsVolumeConfigProperty(
                volume_size_in_gb=123
            )
        )],
        on_start_deep_health_checks=["onStartDeepHealthChecks"],
        override_vpc_config=sagemaker.CfnCluster.VpcConfigProperty(
            security_group_ids=["securityGroupIds"],
            subnets=["subnets"]
        ),
        scheduled_update_config=sagemaker.CfnCluster.ScheduledUpdateConfigProperty(
            schedule_expression="scheduleExpression",

            # the properties below are optional
            deployment_config=sagemaker.CfnCluster.DeploymentConfigProperty(
                auto_rollback_configuration=[sagemaker.CfnCluster.AlarmDetailsProperty(
                    alarm_name="alarmName"
                )],
                rolling_update_policy=sagemaker.CfnCluster.RollingUpdatePolicyProperty(
                    maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
                        type="type",
                        value=123
                    ),

                    # the properties below are optional
                    rollback_maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
                        type="type",
                        value=123
                    )
                ),
                wait_interval_in_seconds=123
            )
        ),
        threads_per_core=123,
        training_plan_arn="trainingPlanArn"
    )],
    node_provisioning_mode="nodeProvisioningMode",
    node_recovery="nodeRecovery",
    orchestrator=sagemaker.CfnCluster.OrchestratorProperty(
        eks=sagemaker.CfnCluster.ClusterOrchestratorEksConfigProperty(
            cluster_arn="clusterArn"
        )
    ),
    restricted_instance_groups=[sagemaker.CfnCluster.ClusterRestrictedInstanceGroupProperty(
        environment_config=sagemaker.CfnCluster.EnvironmentConfigProperty(
            f_sx_lustre_config=sagemaker.CfnCluster.FSxLustreConfigProperty(
                per_unit_storage_throughput=123,
                size_in_gi_b=123
            )
        ),
        execution_role="executionRole",
        instance_count=123,
        instance_group_name="instanceGroupName",
        instance_type="instanceType",

        # the properties below are optional
        current_count=123,
        instance_storage_configs=[sagemaker.CfnCluster.ClusterInstanceStorageConfigProperty(
            ebs_volume_config=sagemaker.CfnCluster.ClusterEbsVolumeConfigProperty(
                volume_size_in_gb=123
            )
        )],
        on_start_deep_health_checks=["onStartDeepHealthChecks"],
        override_vpc_config=sagemaker.CfnCluster.VpcConfigProperty(
            security_group_ids=["securityGroupIds"],
            subnets=["subnets"]
        ),
        threads_per_core=123,
        training_plan_arn="trainingPlanArn"
    )],
    tags=[CfnTag(
        key="key",
        value="value"
    )],
    vpc_config=sagemaker.CfnCluster.VpcConfigProperty(
        security_group_ids=["securityGroupIds"],
        subnets=["subnets"]
    )
)
- Parameters:
  - scope (Construct) – Scope in which this resource is defined.
  - id (str) – Construct identifier for this resource (unique in its scope).
  - cluster_name (Optional[str]) – The name of the SageMaker HyperPod cluster.
  - instance_groups (Union[IResolvable, Sequence[Union[IResolvable, ClusterInstanceGroupProperty, Dict[str, Any]]], None]) – The instance groups of the SageMaker HyperPod cluster. To delete an instance group, remove it from the array.
  - node_provisioning_mode (Optional[str]) – Determines the scaling strategy for the SageMaker HyperPod cluster. When set to 'Continuous', enables continuous scaling, which dynamically manages node provisioning. If the parameter is omitted, the cluster uses the standard scaling approach of previous releases.
  - node_recovery (Optional[str]) – Specifies whether to enable or disable the automatic node recovery feature of SageMaker HyperPod. Available values are Automatic for enabling and None for disabling.
  - orchestrator (Union[IResolvable, OrchestratorProperty, Dict[str, Any], None]) – The orchestrator type for the SageMaker HyperPod cluster. Currently, 'eks' is the only available option.
  - restricted_instance_groups (Union[IResolvable, Sequence[Union[IResolvable, ClusterRestrictedInstanceGroupProperty, Dict[str, Any]]], None]) – The restricted instance groups of the SageMaker HyperPod cluster.
  - tags (Optional[Sequence[Union[CfnTag, Dict[str, Any]]]]) – A tag object that consists of a key and an optional value, used to manage metadata for SageMaker AWS resources. You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags. For more information on adding metadata to your AWS resources with tagging, see Tagging AWS resources. For advice on best practices for managing AWS resources with tagging, see Tagging Best Practices: Implement an Effective AWS Resource Tagging Strategy.
  - vpc_config (Union[IResolvable, VpcConfigProperty, Dict[str, Any], None]) – Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.
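An easy mistake with the parameters above is to confuse the string "None", the documented value that disables node recovery, with Python's None, which simply omits the parameter. The following is a minimal sketch of validating the two string parameters before passing them on to CfnCluster. The helper name build_cluster_kwargs and the two constant sets are hypothetical and not part of the CDK; the allowed values are only those named in the parameter descriptions above.

```python
from typing import Any, Dict, Optional

# Values taken from the parameter descriptions above (assumed to be exhaustive).
NODE_RECOVERY_VALUES = {"Automatic", "None"}
NODE_PROVISIONING_MODES = {"Continuous"}


def build_cluster_kwargs(
    cluster_name: str,
    node_recovery: Optional[str] = None,
    node_provisioning_mode: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return keyword arguments suitable for CfnCluster."""
    kwargs: Dict[str, Any] = {"cluster_name": cluster_name}
    if node_recovery is not None:
        if node_recovery not in NODE_RECOVERY_VALUES:
            raise ValueError(f"node_recovery must be one of {sorted(NODE_RECOVERY_VALUES)}")
        kwargs["node_recovery"] = node_recovery
    if node_provisioning_mode is not None:
        if node_provisioning_mode not in NODE_PROVISIONING_MODES:
            raise ValueError(f"node_provisioning_mode must be one of {sorted(NODE_PROVISIONING_MODES)}")
        kwargs["node_provisioning_mode"] = node_provisioning_mode
    return kwargs


# The string "None" explicitly disables recovery; Python None leaves it unset.
disabled = build_cluster_kwargs("demo", node_recovery="None")
unset = build_cluster_kwargs("demo")
```

The resulting dictionary could then be unpacked into the constructor, for example sagemaker.CfnCluster(self, "MyCfnCluster", **disabled).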
Methods
- add_deletion_override(path)
Syntactic sugar for addOverride(path, undefined).
- Parameters:
  - path (str) – The path of the value to delete.
- Return type:
None
- add_dependency(target)
Indicates that this resource depends on another resource and cannot be provisioned unless the other resource has been successfully provisioned.
This can be used for resources across stacks (or nested stack) boundaries and the dependency will automatically be transferred to the relevant scope.
- Parameters:
  - target (CfnResource)
- Return type:
None
- add_depends_on(target)
(deprecated) Indicates that this resource depends on another resource and cannot be provisioned unless the other resource has been successfully provisioned.
- Parameters:
  - target (CfnResource)
- Deprecated:
use addDependency
- Stability:
deprecated
- Return type:
None
- add_metadata(key, value)
Add a value to the CloudFormation Resource Metadata.
- Parameters:
  - key (str)
  - value (Any)
- See:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/metadata-section-structure.html
- Return type:
None
Note that this is a different set of metadata from CDK node metadata; this metadata ends up in the stack template under the resource, whereas CDK node metadata ends up in the Cloud Assembly.
- add_override(path, value)
Adds an override to the synthesized CloudFormation resource.
To add a property override, either use addPropertyOverride or prefix path with "Properties." (e.g. Properties.TopicName).
If the override is nested, separate each nested level using a dot (.) in the path parameter. If there is an array as part of the nesting, specify the index in the path.
To include a literal . in the property name, prefix it with a \. In most programming languages you will need to write this as "\\." because the \ itself will need to be escaped.
For example:
cfn_resource.add_override("Properties.GlobalSecondaryIndexes.0.Projection.NonKeyAttributes", ["myattribute"])
cfn_resource.add_override("Properties.GlobalSecondaryIndexes.1.ProjectionType", "INCLUDE")
would add the overrides:
"Properties": {
  "GlobalSecondaryIndexes": [
    {
      "Projection": {
        "NonKeyAttributes": [ "myattribute" ]
        ...
      }
      ...
    },
    {
      "ProjectionType": "INCLUDE"
      ...
    },
  ]
  ...
}
The value argument to addOverride will not be processed or translated in any way. Pass raw JSON values in here with the correct capitalization for CloudFormation. If you pass CDK classes or structs, they will be rendered with lowercased key names, and CloudFormation will reject the template.
- Parameters:
  - path (str) – The path of the property. You can use dot notation to override values in complex types. Any intermediate keys will be created as needed.
  - value (Any) – The value. Could be primitive or complex.
- Return type:
None
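To make the path rules above concrete, here is a minimal, self-contained sketch of how a dot-separated path with escaped dots and array indexes could be applied to a plain dictionary. It is an illustration only, not the CDK implementation; split_path and apply_override are hypothetical helper names, and the sample template uses AWS::SageMaker::Cluster property names purely as example data.

```python
import re
from typing import Any, List


def split_path(path: str) -> List[str]:
    """Split on dots that are not escaped with a backslash, then unescape."""
    parts = re.split(r"(?<!\\)\.", path)
    return [part.replace("\\.", ".") for part in parts]


def apply_override(resource: dict, path: str, value: Any) -> None:
    """Set `value` at `path`, creating intermediate dictionaries as needed."""
    keys = split_path(path)
    current: Any = resource
    for key in keys[:-1]:
        if isinstance(current, list):
            # A numeric path segment indexes into an existing array.
            current = current[int(key)]
        else:
            current = current.setdefault(key, {})
    last = keys[-1]
    if isinstance(current, list):
        current[int(last)] = value
    else:
        current[last] = value


resource = {"Properties": {"InstanceGroups": [{"InstanceCount": 1}]}}
apply_override(resource, "Properties.InstanceGroups.0.InstanceCount", 4)
apply_override(resource, "Properties.NodeRecovery", "Automatic")
# "\\." in Python source is a literal backslash followed by a dot.
apply_override(resource, "Metadata.my\\.key", "x")
```

Note that the values are written exactly as given, which mirrors the warning above: keys must already use CloudFormation capitalization.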
- add_property_deletion_override(property_path)
Adds an override that deletes the value of a property from the resource definition.
- Parameters:
  - property_path (str) – The path to the property.
- Return type:
None
- add_property_override(property_path, value)
Adds an override to a resource property.
Syntactic sugar for addOverride("Properties.<...>", value).
- Parameters:
  - property_path (str) – The path of the property.
  - value (Any) – The value.
- Return type:
None
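The four override methods described so far reduce to a single raw override call. The sketch below records what each convenience method would pass along, following the "syntactic sugar" descriptions above. OverrideRecorder and DELETE are hypothetical names used only for illustration; treating add_property_deletion_override as a property override with an undefined value is an assumption based on its description.

```python
from typing import Any, List, Tuple

# Sentinel standing in for JavaScript's `undefined` (assumption for this sketch).
DELETE = object()


class OverrideRecorder:
    """Hypothetical stand-in that records the raw overrides each call produces."""

    def __init__(self) -> None:
        self.overrides: List[Tuple[str, Any]] = []

    def add_override(self, path: str, value: Any) -> None:
        self.overrides.append((path, value))

    def add_deletion_override(self, path: str) -> None:
        # Sugar for add_override(path, undefined).
        self.add_override(path, DELETE)

    def add_property_override(self, property_path: str, value: Any) -> None:
        # Sugar for add_override("Properties.<...>", value).
        self.add_override(f"Properties.{property_path}", value)

    def add_property_deletion_override(self, property_path: str) -> None:
        # Assumed to combine the two shortcuts above.
        self.add_property_override(property_path, DELETE)


recorder = OverrideRecorder()
recorder.add_property_override("NodeRecovery", "Automatic")
recorder.add_property_deletion_override("ClusterName")
```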
- apply_removal_policy(policy=None, *, apply_to_update_replace_policy=None, default=None)
Sets the deletion policy of the resource based on the removal policy specified.
The Removal Policy controls what happens to this resource when it stops being managed by CloudFormation, either because you’ve removed it from the CDK application or because you’ve made a change that requires the resource to be replaced.
The resource can be deleted (RemovalPolicy.DESTROY), or left in your AWS account for data recovery and cleanup later (RemovalPolicy.RETAIN). In some cases, a snapshot can be taken of the resource prior to deletion (RemovalPolicy.SNAPSHOT). A list of resources that support this policy can be found in the AWS CloudFormation documentation for the DeletionPolicy attribute.
- Parameters:
  - policy (Optional[RemovalPolicy])
  - apply_to_update_replace_policy (Optional[bool]) – Apply the same deletion policy to the resource's "UpdateReplacePolicy". Default: true
  - default (Optional[RemovalPolicy]) – The default policy to apply in case the removal policy is not defined. Default: the default value is resource specific. To determine the default value for a resource, consult that resource's documentation.
- See:
- Return type:
None
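As a rough illustration of the behavior described above, the sketch below maps a removal policy to the CloudFormation resource attributes DeletionPolicy and UpdateReplacePolicy. The RemovalPolicy enum here is a local stand-in covering only the three policies named above, not the CDK class, and resource_policies is a hypothetical function. The fallback of RETAIN is an assumption chosen for the example, since the real default is resource specific.

```python
from enum import Enum
from typing import Dict, Optional


class RemovalPolicy(Enum):
    """Local stand-in for the three policies named above (not the CDK enum)."""
    DESTROY = "Delete"
    RETAIN = "Retain"
    SNAPSHOT = "Snapshot"


def resource_policies(
    policy: Optional[RemovalPolicy] = None,
    *,
    apply_to_update_replace_policy: bool = True,
    default: RemovalPolicy = RemovalPolicy.RETAIN,  # assumption for this sketch
) -> Dict[str, str]:
    """Return the CloudFormation attributes the chosen policy would produce."""
    chosen = policy if policy is not None else default
    attributes = {"DeletionPolicy": chosen.value}
    if apply_to_update_replace_policy:
        # By default the same policy also covers replacement during updates.
        attributes["UpdateReplacePolicy"] = chosen.value
    return attributes


destroy = resource_policies(RemovalPolicy.DESTROY)
snapshot_only_on_delete = resource_policies(
    RemovalPolicy.SNAPSHOT, apply_to_update_replace_policy=False
)
```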
- get_att(attribute_name, type_hint=None)
Returns a token for a runtime attribute of this resource.
Ideally, use generated attribute accessors (e.g. resource.arn), but this can be used for future compatibility in case there is no generated attribute.
- Parameters:
  - attribute_name (str) – The name of the attribute.
  - type_hint (Optional[ResolutionTypeHint])
- Return type:
- get_metadata(key)
Retrieve a value from the CloudFormation Resource Metadata.
- Parameters:
  - key (str)
- See:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/metadata-section-structure.html
- Return type:
Any
Note that this is a different set of metadata from CDK node metadata; this metadata ends up in the stack template under the resource, whereas CDK node metadata ends up in the Cloud Assembly.
- inspect(inspector)
Examines the CloudFormation resource and discloses attributes.
- Parameters:
  - inspector (TreeInspector) – Tree inspector to collect and process attributes.
- Return type:
None
- obtain_dependencies()
Retrieves an array of resources this resource depends on.
This assembles dependencies on resources across stacks (including nested stacks) automatically.
- Return type:
List[Union[Stack, CfnResource]]
- obtain_resource_dependencies()
Get a shallow copy of dependencies between this resource and other resources in the same stack.
- Return type:
List[CfnResource]
- override_logical_id(new_logical_id)
Overrides the auto-generated logical ID with a specific ID.
- Parameters:
  - new_logical_id (str) – The new logical ID to use for this stack element.
- Return type:
None
- remove_dependency(target)
Indicates that this resource no longer depends on another resource.
This can be used for resources across stacks (including nested stacks) and the dependency will automatically be removed from the relevant scope.
- Parameters:
  - target (CfnResource)
- Return type:
None
- replace_dependency(target, new_target)
Replaces one dependency with another.
- Parameters:
  - target (CfnResource) – The dependency to replace.
  - new_target (CfnResource) – The new dependency to add.
- Return type:
None
- to_string()
Returns a string representation of this construct.
- Return type:
str
- Returns:
a string representation of this resource
Attributes
- CFN_RESOURCE_TYPE_NAME = 'AWS::SageMaker::Cluster'
- attr_cluster_arn
The Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.
- CloudformationAttribute:
ClusterArn
- attr_cluster_status
The status of the SageMaker HyperPod cluster.
- CloudformationAttribute:
ClusterStatus
- attr_creation_time
The time when the SageMaker HyperPod cluster is created.
- CloudformationAttribute:
CreationTime
- attr_failure_message
The failure message of the SageMaker HyperPod cluster.
- CloudformationAttribute:
FailureMessage
- cdk_tag_manager
Tag Manager which manages the tags for this resource.
- cfn_options
Options for this resource, such as condition, update policy etc.
- cfn_resource_type
AWS resource type.
- cluster_name
The name of the SageMaker HyperPod cluster.
- cluster_ref
A reference to a Cluster resource.
- creation_stack
- Returns:
the stack trace of the point where this Resource was created from, sourced from the metadata entry typed aws:cdk:logicalId, and with the bottom-most node internal entries filtered.
- instance_groups
The instance groups of the SageMaker HyperPod cluster.
- logical_id
The logical ID for this CloudFormation stack element.
The logical ID of the element is calculated from the path of the resource node in the construct tree.
To override this value, use overrideLogicalId(newLogicalId).
- Returns:
the logical ID as a stringified token. This value will only get resolved during synthesis.
- node
The tree node.
- node_provisioning_mode
Determines the scaling strategy for the SageMaker HyperPod cluster.
- node_recovery
Specifies whether to enable or disable the automatic node recovery feature of SageMaker HyperPod.
- orchestrator
The orchestrator type for the SageMaker HyperPod cluster.
- ref
Return a string that will be resolved to a CloudFormation { Ref } for this element.
If, by any chance, the intrinsic reference of a resource is not a string, you could coerce it to an IResolvable through Lazy.any({ produce: resource.ref }).
- restricted_instance_groups
The restricted instance groups of the SageMaker HyperPod cluster.
- stack
The stack in which this element is defined.
CfnElements must be defined within a stack scope (directly or indirectly).
- tags
A tag object that consists of a key and an optional value, used to manage metadata for SageMaker AWS resources.
- vpc_config
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to.
Static Methods
- classmethod is_cfn_element(x)
Returns true if a construct is a stack element (i.e. part of the synthesized CloudFormation template).
Uses duck-typing instead of instanceof to allow stack elements from different versions of this library to be included in the same stack.
- Parameters:
  - x (Any)
- Return type:
bool
- Returns:
The construct as a stack element or undefined if it is not a stack element.
- classmethod is_cfn_resource(x)
Check whether the given object is a CfnResource.
- Parameters:
  - x (Any)
- Return type:
bool
- classmethod is_construct(x)
Checks if x is a construct.
Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.
Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and to use this type-testing method instead.
- Parameters:
  - x (Any) – Any object.
- Return type:
bool
- Returns:
true if x is an object created from a class which extends Construct.
AlarmDetailsProperty
- class CfnCluster.AlarmDetailsProperty(*, alarm_name)
Bases:
object
The details of the alarm to monitor during the AMI update.
- Parameters:
  - alarm_name (str) – The name of the alarm.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

alarm_details_property = sagemaker.CfnCluster.AlarmDetailsProperty(
    alarm_name="alarmName"
)
Attributes
CapacitySizeConfigProperty
- class CfnCluster.CapacitySizeConfigProperty(*, type, value)
Bases:
object
The configuration of the size measurements of the AMI update.
Using this configuration, you can specify whether SageMaker should update your instance group by an amount or percentage of instances.
- Parameters:
  - type (str) – Specifies whether SageMaker should process the update by amount or percentage of instances.
  - value (Union[int, float]) – Specifies the amount or percentage of instances SageMaker updates at a time.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

capacity_size_config_property = sagemaker.CfnCluster.CapacitySizeConfigProperty(
    type="type",
    value=123
)
Attributes
- type
Specifies whether SageMaker should process the update by amount or percentage of instances.
- value
Specifies the amount or percentage of instances SageMaker updates at a time.
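To illustrate how type and value interact, the sketch below computes how many instances a single update batch would cover for an instance group of a given size. Everything here is an assumption for illustration: instances_per_batch is a hypothetical helper, the literals "INSTANCE_COUNT" and "CAPACITY_PERCENTAGE" are assumed names for the "amount" and "percentage" modes and should be checked against the CloudFormation reference, and the rounding rule (round up, at least one instance) is a choice made for the example, not documented SageMaker behavior.

```python
import math


def instances_per_batch(size_type: str, value: float, instance_count: int) -> int:
    """Hypothetical helper: number of instances one update batch would cover."""
    if size_type == "INSTANCE_COUNT":  # assumed literal for "amount"
        batch = int(value)
    elif size_type == "CAPACITY_PERCENTAGE":  # assumed literal for "percentage"
        batch = math.ceil(instance_count * value / 100)
    else:
        raise ValueError(f"unknown type: {size_type}")
    # Clamp to a sensible range: at least one instance, at most the whole group.
    return max(1, min(batch, instance_count))


by_amount = instances_per_batch("INSTANCE_COUNT", 2, 8)
by_percentage = instances_per_batch("CAPACITY_PERCENTAGE", 25, 8)
```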
ClusterEbsVolumeConfigProperty
- class CfnCluster.ClusterEbsVolumeConfigProperty(*, volume_size_in_gb=None)
Bases:
object
Defines the configuration for attaching an additional Amazon Elastic Block Store (EBS) volume to each instance of the SageMaker HyperPod cluster instance group.
To learn more, see SageMaker HyperPod release notes: June 20, 2024 .
- Parameters:
  - volume_size_in_gb (Union[int, float, None]) – The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

cluster_ebs_volume_config_property = sagemaker.CfnCluster.ClusterEbsVolumeConfigProperty(
    volume_size_in_gb=123
)
Attributes
- volume_size_in_gb
The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group.
The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
ClusterInstanceGroupProperty
- class CfnCluster.ClusterInstanceGroupProperty(*, execution_role, instance_count, instance_group_name, instance_type, life_cycle_config, current_count=None, image_id=None, instance_storage_configs=None, on_start_deep_health_checks=None, override_vpc_config=None, scheduled_update_config=None, threads_per_core=None, training_plan_arn=None)
Bases:
object
The configuration information of the instance group within the HyperPod cluster.
- Parameters:
  - execution_role (str) – The execution role for the instance group to assume.
  - instance_count (Union[int, float]) – The number of instances in an instance group of the SageMaker HyperPod cluster.
  - instance_group_name (str) – The name of the instance group of a SageMaker HyperPod cluster.
  - instance_type (str) – The instance type of the instance group of a SageMaker HyperPod cluster.
  - life_cycle_config (Union[IResolvable, ClusterLifeCycleConfigProperty, Dict[str, Any]]) – The lifecycle configuration for a SageMaker HyperPod cluster.
  - current_count (Union[int, float, None]) – The number of instances that are currently in the instance group of a SageMaker HyperPod cluster.
  - image_id (Optional[str]) – The ID of the AMI used to launch EC2 instances: either a HyperPodPublicAmiId or a CustomAmiId.
  - instance_storage_configs (Union[IResolvable, Sequence[Union[IResolvable, ClusterInstanceStorageConfigProperty, Dict[str, Any]]], None]) – The configurations of additional storage specified to the instance group where the instance (node) is launched.
  - on_start_deep_health_checks (Optional[Sequence[str]]) – A flag indicating whether deep health checks should be performed when the HyperPod cluster instance group is created or updated. Deep health checks are comprehensive, invasive tests that validate the health of the underlying hardware and infrastructure components.
  - override_vpc_config (Union[IResolvable, VpcConfigProperty, Dict[str, Any], None]) – The customized Amazon VPC configuration at the instance group level that overrides the default Amazon VPC configuration of the SageMaker HyperPod cluster.
  - scheduled_update_config (Union[IResolvable, ScheduledUpdateConfigProperty, Dict[str, Any], None]) – The configuration object of the schedule that SageMaker follows when updating the AMI.
  - threads_per_core (Union[int, float, None]) – The number of threads per CPU core you specified under CreateCluster.
  - training_plan_arn (Optional[str]) – The Amazon Resource Name (ARN) of the training plan to use for this cluster instance group. For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

cluster_instance_group_property = sagemaker.CfnCluster.ClusterInstanceGroupProperty(
    execution_role="executionRole",
    instance_count=123,
    instance_group_name="instanceGroupName",
    instance_type="instanceType",
    life_cycle_config=sagemaker.CfnCluster.ClusterLifeCycleConfigProperty(
        on_create="onCreate",
        source_s3_uri="sourceS3Uri"
    ),

    # the properties below are optional
    current_count=123,
    image_id="imageId",
    instance_storage_configs=[sagemaker.CfnCluster.ClusterInstanceStorageConfigProperty(
        ebs_volume_config=sagemaker.CfnCluster.ClusterEbsVolumeConfigProperty(
            volume_size_in_gb=123
        )
    )],
    on_start_deep_health_checks=["onStartDeepHealthChecks"],
    override_vpc_config=sagemaker.CfnCluster.VpcConfigProperty(
        security_group_ids=["securityGroupIds"],
        subnets=["subnets"]
    ),
    scheduled_update_config=sagemaker.CfnCluster.ScheduledUpdateConfigProperty(
        schedule_expression="scheduleExpression",

        # the properties below are optional
        deployment_config=sagemaker.CfnCluster.DeploymentConfigProperty(
            auto_rollback_configuration=[sagemaker.CfnCluster.AlarmDetailsProperty(
                alarm_name="alarmName"
            )],
            rolling_update_policy=sagemaker.CfnCluster.RollingUpdatePolicyProperty(
                maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
                    type="type",
                    value=123
                ),

                # the properties below are optional
                rollback_maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
                    type="type",
                    value=123
                )
            ),
            wait_interval_in_seconds=123
        )
    ),
    threads_per_core=123,
    training_plan_arn="trainingPlanArn"
)
Attributes
- current_count
The number of instances that are currently in the instance group of a SageMaker HyperPod cluster.
- execution_role
The execution role for the instance group to assume.
- image_id
The ID of the AMI used to launch EC2 instances: either a HyperPodPublicAmiId or a CustomAmiId.
- instance_count
The number of instances in an instance group of the SageMaker HyperPod cluster.
- instance_group_name
The name of the instance group of a SageMaker HyperPod cluster.
- instance_storage_configs
The configurations of additional storage specified to the instance group where the instance (node) is launched.
- instance_type
The instance type of the instance group of a SageMaker HyperPod cluster.
- life_cycle_config
The lifecycle configuration for a SageMaker HyperPod cluster.
- on_start_deep_health_checks
A flag indicating whether deep health checks should be performed when the HyperPod cluster instance group is created or updated.
Deep health checks are comprehensive, invasive tests that validate the health of the underlying hardware and infrastructure components.
- override_vpc_config
The customized Amazon VPC configuration at the instance group level that overrides the default Amazon VPC configuration of the SageMaker HyperPod cluster.
- scheduled_update_config
The configuration object of the schedule that SageMaker follows when updating the AMI.
- threads_per_core
The number of threads per CPU core you specified under CreateCluster.
- training_plan_arn
The Amazon Resource Name (ARN) of the training plan to use for this cluster instance group.
For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
ClusterInstanceStorageConfigProperty
- class CfnCluster.ClusterInstanceStorageConfigProperty(*, ebs_volume_config=None)
Bases:
object
Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group.
To learn more, see SageMaker HyperPod release notes: June 20, 2024 .
- Parameters:
  - ebs_volume_config (Union[IResolvable, ClusterEbsVolumeConfigProperty, Dict[str, Any], None]) – Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

cluster_instance_storage_config_property = sagemaker.CfnCluster.ClusterInstanceStorageConfigProperty(
    ebs_volume_config=sagemaker.CfnCluster.ClusterEbsVolumeConfigProperty(
        volume_size_in_gb=123
    )
)
Attributes
- ebs_volume_config
Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group.
The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
ClusterLifeCycleConfigProperty
- class CfnCluster.ClusterLifeCycleConfigProperty(*, on_create, source_s3_uri)
Bases:
object
The lifecycle configuration for a SageMaker HyperPod cluster.
- Parameters:
  - on_create (str) – The file name of the entrypoint script of lifecycle scripts under SourceS3Uri. This entrypoint script runs during cluster creation.
  - source_s3_uri (str) – An Amazon S3 bucket path where your lifecycle scripts are stored. Make sure that the S3 bucket path starts with s3://sagemaker-. The IAM role for SageMaker HyperPod has the managed AmazonSageMakerClusterInstanceRolePolicy (https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol-cluster.html) attached, which allows access to S3 buckets with the specific prefix sagemaker-.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

cluster_life_cycle_config_property = sagemaker.CfnCluster.ClusterLifeCycleConfigProperty(
    on_create="onCreate",
    source_s3_uri="sourceS3Uri"
)
Attributes
- on_create
The file name of the entrypoint script of lifecycle scripts under SourceS3Uri.
This entrypoint script runs during cluster creation.
- source_s3_uri
An Amazon S3 bucket path where your lifecycle scripts are stored.
Make sure that the S3 bucket path starts with s3://sagemaker-. The IAM role for SageMaker HyperPod has the managed AmazonSageMakerClusterInstanceRolePolicy (https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol-cluster.html) attached, which allows access to S3 buckets with the specific prefix sagemaker-.
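The prefix requirement above is easy to check before synthesis. The following minimal sketch validates the bucket path and derives where the entrypoint script would be expected to live. The function lifecycle_entrypoint_uri, the bucket name, and the script name are hypothetical, and joining the two values with a slash is an assumption based on the description of on_create as a file name under SourceS3Uri.

```python
REQUIRED_PREFIX = "s3://sagemaker-"


def lifecycle_entrypoint_uri(source_s3_uri: str, on_create: str) -> str:
    """Hypothetical helper: validate the bucket prefix and locate the entrypoint script."""
    if not source_s3_uri.startswith(REQUIRED_PREFIX):
        raise ValueError(f"source_s3_uri must start with {REQUIRED_PREFIX!r}")
    # Avoid a doubled slash when the path already ends with one.
    return f"{source_s3_uri.rstrip('/')}/{on_create}"


uri = lifecycle_entrypoint_uri("s3://sagemaker-example-bucket/lifecycle/", "on_create.sh")
```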
ClusterOrchestratorEksConfigProperty
- class CfnCluster.ClusterOrchestratorEksConfigProperty(*, cluster_arn)
Bases:
object
The configuration for the Amazon EKS cluster that is used as the orchestrator for the SageMaker HyperPod cluster.
This includes the Amazon Resource Name (ARN) of the EKS cluster.
- Parameters:
  - cluster_arn (str) – The Amazon Resource Name (ARN) of the Amazon EKS cluster used as the orchestrator for the SageMaker HyperPod cluster.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

cluster_orchestrator_eks_config_property = sagemaker.CfnCluster.ClusterOrchestratorEksConfigProperty(
    cluster_arn="clusterArn"
)
Attributes
- cluster_arn
The Amazon Resource Name (ARN) of the Amazon EKS cluster used as the orchestrator for the SageMaker HyperPod cluster.
ClusterRestrictedInstanceGroupProperty
- class CfnCluster.ClusterRestrictedInstanceGroupProperty(*, environment_config, execution_role, instance_count, instance_group_name, instance_type, current_count=None, instance_storage_configs=None, on_start_deep_health_checks=None, override_vpc_config=None, threads_per_core=None, training_plan_arn=None)
Bases:
object
Details of a restricted instance group in a SageMaker HyperPod cluster.
- Parameters:
environment_config (Union[IResolvable, EnvironmentConfigProperty, Dict[str, Any]]) – The configuration for the restricted instance groups (RIG) environment.
execution_role (str) – The execution role for the instance group to assume.
instance_count (Union[int, float]) – The number of instances you specified to add to the restricted instance group of a SageMaker HyperPod cluster.
instance_group_name (str) – The name of the instance group of a SageMaker HyperPod cluster.
instance_type (str) – The instance type of the instance group of a SageMaker HyperPod cluster.
current_count (Union[int, float, None]) – The number of instances that are currently in the restricted instance group of a SageMaker HyperPod cluster.
instance_storage_configs (Union[IResolvable, Sequence[Union[IResolvable, ClusterInstanceStorageConfigProperty, Dict[str, Any]]], None]) – The instance storage configuration for the instance group.
on_start_deep_health_checks (Optional[Sequence[str]]) – Nodes undergo advanced stress tests to detect and replace faulty instances, based on the type of deep health check(s) passed in.
override_vpc_config (Union[IResolvable, VpcConfigProperty, Dict[str, Any], None]) – Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC.
threads_per_core (Union[int, float, None]) – The number you specified to ThreadsPerCore in CreateCluster for enabling or disabling multithreading. For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading.
training_plan_arn (Optional[str]) – The Amazon Resource Name (ARN) of the training plan to use for this cluster restricted instance group. For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

cluster_restricted_instance_group_property = sagemaker.CfnCluster.ClusterRestrictedInstanceGroupProperty(
    environment_config=sagemaker.CfnCluster.EnvironmentConfigProperty(
        f_sx_lustre_config=sagemaker.CfnCluster.FSxLustreConfigProperty(
            per_unit_storage_throughput=123,
            size_in_gi_b=123
        )
    ),
    execution_role="executionRole",
    instance_count=123,
    instance_group_name="instanceGroupName",
    instance_type="instanceType",

    # the properties below are optional
    current_count=123,
    instance_storage_configs=[sagemaker.CfnCluster.ClusterInstanceStorageConfigProperty(
        ebs_volume_config=sagemaker.CfnCluster.ClusterEbsVolumeConfigProperty(
            volume_size_in_gb=123
        )
    )],
    on_start_deep_health_checks=["onStartDeepHealthChecks"],
    override_vpc_config=sagemaker.CfnCluster.VpcConfigProperty(
        security_group_ids=["securityGroupIds"],
        subnets=["subnets"]
    ),
    threads_per_core=123,
    training_plan_arn="trainingPlanArn"
)
Attributes
- current_count
The number of instances that are currently in the restricted instance group of a SageMaker HyperPod cluster.
- environment_config
The configuration for the restricted instance groups (RIG) environment.
- execution_role
The execution role for the instance group to assume.
- instance_count
The number of instances you specified to add to the restricted instance group of a SageMaker HyperPod cluster.
- instance_group_name
The name of the instance group of a SageMaker HyperPod cluster.
- instance_storage_configs
The instance storage configuration for the instance group.
- instance_type
The instance type of the instance group of a SageMaker HyperPod cluster.
- on_start_deep_health_checks
Nodes undergo advanced stress tests to detect and replace faulty instances, based on the type of deep health check(s) passed in.
- override_vpc_config
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to.
You can control access to and from your resources by configuring a VPC.
- threads_per_core
The number you specified to ThreadsPerCore in CreateCluster for enabling or disabling multithreading.
For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading.
- training_plan_arn
The Amazon Resource Name (ARN) of the training plan to use for this cluster restricted instance group.
For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
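As a rough illustration of the required and optional parameters listed above, the sketch below assembles keyword arguments for ClusterRestrictedInstanceGroupProperty in plain Python. The helper name restricted_group_kwargs and every sample value are hypothetical; only the parameter names and the meaning of 1 and 2 for threads_per_core come from this reference. The nested dictionary uses the same snake_case names as the generated example, which is an assumption about how dictionaries are interpreted.

```python
from typing import Any, Dict, Optional


def restricted_group_kwargs(
    *,
    environment_config: Dict[str, Any],
    execution_role: str,
    instance_count: int,
    instance_group_name: str,
    instance_type: str,
    multithreading: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for the property class."""
    kwargs: Dict[str, Any] = {
        "environment_config": environment_config,
        "execution_role": execution_role,
        "instance_count": instance_count,
        "instance_group_name": instance_group_name,
        "instance_type": instance_type,
    }
    if multithreading is not None:
        # Per the description above: 1 disables multithreading, 2 enables it.
        kwargs["threads_per_core"] = 2 if multithreading else 1
    return kwargs


kwargs = restricted_group_kwargs(
    environment_config={
        "f_sx_lustre_config": {
            "per_unit_storage_throughput": 125,  # placeholder value
            "size_in_gi_b": 1200,  # placeholder value
        }
    },
    execution_role="arn:aws:iam::111122223333:role/ExampleRole",  # placeholder
    instance_count=4,
    instance_group_name="restricted-group",  # placeholder
    instance_type="ml.p5.48xlarge",  # placeholder
    multithreading=False,
)
print(kwargs["threads_per_core"])
```

Inside a CDK app, the resulting dictionary could be unpacked as sagemaker.CfnCluster.ClusterRestrictedInstanceGroupProperty(**kwargs). Optional parameters that are left out are simply omitted from the call.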
DeploymentConfigProperty
- class CfnCluster.DeploymentConfigProperty(*, auto_rollback_configuration=None, rolling_update_policy=None, wait_interval_in_seconds=None)
Bases:
object
The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations.
- Parameters:
auto_rollback_configuration (Union[IResolvable, Sequence[Union[IResolvable, AlarmDetailsProperty, Dict[str, Any]]], None]) – Automatic rollback configuration for handling endpoint deployment failures and recovery.
rolling_update_policy (Union[IResolvable, RollingUpdatePolicyProperty, Dict[str, Any], None]) – Specifies a rolling deployment strategy for updating a SageMaker endpoint.
wait_interval_in_seconds (Union[int, float, None]) – The duration in seconds that SageMaker waits before updating more instances in the cluster.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

deployment_config_property = sagemaker.CfnCluster.DeploymentConfigProperty(
    auto_rollback_configuration=[sagemaker.CfnCluster.AlarmDetailsProperty(
        alarm_name="alarmName"
    )],
    rolling_update_policy=sagemaker.CfnCluster.RollingUpdatePolicyProperty(
        maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
            type="type",
            value=123
        ),

        # the properties below are optional
        rollback_maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
            type="type",
            value=123
        )
    ),
    wait_interval_in_seconds=123
)
Attributes
- auto_rollback_configuration
Automatic rollback configuration for handling endpoint deployment failures and recovery.
- rolling_update_policy
Specifies a rolling deployment strategy for updating a SageMaker endpoint.
- wait_interval_in_seconds
The duration in seconds that SageMaker waits before updating more instances in the cluster.
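To get a feel for how wait_interval_in_seconds interacts with the batch size, the sketch below estimates the minimum time spent waiting during a rolling update. The model (sequential batches with one wait between consecutive batches) and the function name are assumptions made for illustration; the reference above does not spell out the exact scheduling behavior.

```python
import math


def minimum_total_wait_seconds(
    instance_count: int, batch_size: int, wait_interval_in_seconds: int
) -> int:
    """Lower bound on time spent waiting between batches (assumed model)."""
    if instance_count <= 0 or batch_size <= 0:
        raise ValueError("instance_count and batch_size must be positive")
    batches = math.ceil(instance_count / batch_size)
    # Assumption: one wait interval between consecutive batches,
    # and no wait after the last batch.
    return (batches - 1) * wait_interval_in_seconds


# 16 instances in batches of 4 gives 4 batches and 3 waits of 300 seconds.
print(minimum_total_wait_seconds(16, 4, 300))  # 900
```

The estimate ignores the time each batch takes to update, so the real duration would be longer.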
EnvironmentConfigProperty
- class CfnCluster.EnvironmentConfigProperty(*, f_sx_lustre_config=None)
Bases:
object
The configuration for the restricted instance groups (RIG) environment.
- Parameters:
f_sx_lustre_config (Union[IResolvable, FSxLustreConfigProperty, Dict[str, Any], None]) – Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

environment_config_property = sagemaker.CfnCluster.EnvironmentConfigProperty(
    f_sx_lustre_config=sagemaker.CfnCluster.FSxLustreConfigProperty(
        per_unit_storage_throughput=123,
        size_in_gi_b=123
    )
)
Attributes
- f_sx_lustre_config
Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.
FSxLustreConfigProperty
- class CfnCluster.FSxLustreConfigProperty(*, per_unit_storage_throughput, size_in_gi_b)
Bases:
object
Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.
- Parameters:
per_unit_storage_throughput (Union[int, float]) – The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.
size_in_gi_b (Union[int, float]) – The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

f_sx_lustre_config_property = sagemaker.CfnCluster.FSxLustreConfigProperty(
    per_unit_storage_throughput=123,
    size_in_gi_b=123
)
Attributes
- per_unit_storage_throughput
The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.
- size_in_gi_b
The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).
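Because throughput is expressed per TiB while storage is expressed in GiB, the two values combine through a unit conversion (1 TiB = 1024 GiB). The sketch below works through that arithmetic. The function name and the sample values are illustrative assumptions, and this reference does not list which throughput or size values the service accepts.

```python
GIB_PER_TIB = 1024


def aggregate_throughput_mb_per_s(
    per_unit_storage_throughput: float, size_in_gib: float
) -> float:
    """Total throughput implied by a per-TiB rate and a capacity in GiB."""
    size_in_tib = size_in_gib / GIB_PER_TIB
    return per_unit_storage_throughput * size_in_tib


# 250 MB/s per TiB on a 2048 GiB (2 TiB) file system.
print(aggregate_throughput_mb_per_s(250, 2048))  # 500.0
```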
OrchestratorProperty
- class CfnCluster.OrchestratorProperty(*, eks)
Bases:
object
The orchestrator for a SageMaker HyperPod cluster.
- Parameters:
eks (Union[IResolvable, ClusterOrchestratorEksConfigProperty, Dict[str, Any]]) – The configuration of the Amazon EKS orchestrator cluster for the SageMaker HyperPod cluster.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

orchestrator_property = sagemaker.CfnCluster.OrchestratorProperty(
    eks=sagemaker.CfnCluster.ClusterOrchestratorEksConfigProperty(
        cluster_arn="clusterArn"
    )
)
Attributes
- eks
The configuration of the Amazon EKS orchestrator cluster for the SageMaker HyperPod cluster.
RollingUpdatePolicyProperty
- class CfnCluster.RollingUpdatePolicyProperty(*, maximum_batch_size, rollback_maximum_batch_size=None)
Bases:
object
Specifies a rolling deployment strategy for updating a SageMaker endpoint.
- Parameters:
maximum_batch_size (Union[IResolvable, CapacitySizeConfigProperty, Dict[str, Any]]) – Batch size for each rolling step to provision capacity and turn on traffic on the new endpoint fleet, and terminate capacity on the old endpoint fleet. Value must be between 5% and 50% of the variant’s total instance count.
rollback_maximum_batch_size (Union[IResolvable, CapacitySizeConfigProperty, Dict[str, Any], None]) – Batch size for rollback to the old endpoint fleet. Each rolling step provisions capacity and turns on traffic on the old endpoint fleet, and terminates capacity on the new endpoint fleet. If this field is absent, the default value is 100% of total capacity, which brings up the whole capacity of the old fleet at once during rollback.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

rolling_update_policy_property = sagemaker.CfnCluster.RollingUpdatePolicyProperty(
    maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
        type="type",
        value=123
    ),

    # the properties below are optional
    rollback_maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
        type="type",
        value=123
    )
)
Attributes
- maximum_batch_size
Batch size for each rolling step to provision capacity and turn on traffic on the new endpoint fleet, and terminate capacity on the old endpoint fleet.
Value must be between 5% and 50% of the variant’s total instance count.
- rollback_maximum_batch_size
Batch size for rollback to the old endpoint fleet.
Each rolling step provisions capacity and turns on traffic on the old endpoint fleet, and terminates capacity on the new endpoint fleet. If this field is absent, the default value is 100% of total capacity, which brings up the whole capacity of the old fleet at once during rollback.
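The 5% to 50% constraint on maximum_batch_size can be turned into concrete instance counts. The sketch below does so for a given fleet size. The function name and the rounding rule (round the lower bound up and the upper bound down) are assumptions chosen so that both results stay inside the documented range; the reference does not say how the service rounds.

```python
from typing import Tuple


def batch_size_bounds(total_instances: int) -> Tuple[int, int]:
    """Return (smallest, largest) instance counts within 5% to 50% of the total."""
    if total_instances <= 0:
        raise ValueError("total_instances must be positive")
    # Assumption: round the 5% bound up and the 50% bound down.
    lowest = -(-total_instances * 5 // 100)  # ceiling division
    highest = total_instances * 50 // 100  # floor division
    return lowest, highest


print(batch_size_bounds(40))  # (2, 20)
```

For very small fleets the lower bound can exceed the upper bound (a single instance yields (1, 0)), which under this rounding means no whole-instance batch size fits the range.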
ScheduledUpdateConfigProperty
- class CfnCluster.ScheduledUpdateConfigProperty(*, schedule_expression, deployment_config=None)
Bases:
object
The configuration object of the schedule that SageMaker follows when updating the AMI.
- Parameters:
schedule_expression (str) – A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
deployment_config (Union[IResolvable, DeploymentConfigProperty, Dict[str, Any], None]) – The configuration to use when updating the AMI versions.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

scheduled_update_config_property = sagemaker.CfnCluster.ScheduledUpdateConfigProperty(
    schedule_expression="scheduleExpression",

    # the properties below are optional
    deployment_config=sagemaker.CfnCluster.DeploymentConfigProperty(
        auto_rollback_configuration=[sagemaker.CfnCluster.AlarmDetailsProperty(
            alarm_name="alarmName"
        )],
        rolling_update_policy=sagemaker.CfnCluster.RollingUpdatePolicyProperty(
            maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
                type="type",
                value=123
            ),

            # the properties below are optional
            rollback_maximum_batch_size=sagemaker.CfnCluster.CapacitySizeConfigProperty(
                type="type",
                value=123
            )
        ),
        wait_interval_in_seconds=123
    )
)
Attributes
- deployment_config
The configuration to use when updating the AMI versions.
- schedule_expression
A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
VpcConfigProperty
- class CfnCluster.VpcConfigProperty(*, security_group_ids, subnets)
Bases:
object
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to.
You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC .
- Parameters:
security_group_ids (Sequence[str]) – The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
subnets (Sequence[str]) – The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones .
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

vpc_config_property = sagemaker.CfnCluster.VpcConfigProperty(
    security_group_ids=["securityGroupIds"],
    subnets=["subnets"]
)
Attributes
- security_group_ids
The VPC security group IDs, in the form sg-xxxxxxxx.
Specify the security groups for the VPC that is specified in the Subnets field.
- subnets
The ID of the subnets in the VPC to which you want to connect your training job or model.
For information about the availability of specific instance types, see Supported Instance Types and Availability Zones .
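Since security_group_ids expects values in the form sg-xxxxxxxx, a quick format check before synthesis can catch values such as a subnet ID pasted into the wrong list. The sketch below is one way to do that in plain Python. The helper name is hypothetical, and the accepted lengths (8 or 17 hexadecimal characters after the prefix) are an assumption that goes beyond the form shown in this reference.

```python
import re
from typing import List, Sequence

# Assumption: the part after "sg-" is 8 or 17 lowercase hexadecimal characters.
SECURITY_GROUP_ID = re.compile(r"^sg-(?:[0-9a-f]{8}|[0-9a-f]{17})$")


def invalid_security_group_ids(security_group_ids: Sequence[str]) -> List[str]:
    """Return the entries that do not look like security group IDs."""
    return [sg for sg in security_group_ids if not SECURITY_GROUP_ID.match(sg)]


candidates = ["sg-0123abcd", "sg-0123456789abcdef0", "subnet-0123abcd"]
print(invalid_security_group_ids(candidates))  # ['subnet-0123abcd']
```

The check covers the format only. It does not confirm that a security group exists or that it belongs to the same VPC as the subnets.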