SageMakerCreateTrainingJobProps

class aws_cdk.aws_stepfunctions_tasks.SageMakerCreateTrainingJobProps(*, comment=None, heartbeat=None, input_path=None, integration_pattern=None, output_path=None, result_path=None, result_selector=None, timeout=None, algorithm_specification, input_data_config, output_data_config, training_job_name, enable_network_isolation=None, environment=None, hyperparameters=None, resource_config=None, role=None, stopping_condition=None, tags=None, vpc_config=None)

Bases: TaskStateBaseProps

Properties for creating an Amazon SageMaker training job.

Parameters:
  • comment (Optional[str]) – An optional description for this state. Default: - No comment

  • heartbeat (Optional[Duration]) – Timeout for the heartbeat. Default: - None

  • input_path (Optional[str]) – JSONPath expression to select part of the state to be the input to this state. May also be the special value JsonPath.DISCARD, which will cause the effective input to be the empty object {}. Default: - The entire task input (JSON path ‘$’)

  • integration_pattern (Optional[IntegrationPattern]) – AWS Step Functions integrates with services directly in the Amazon States Language. You can control these AWS services using service integration patterns. Default: - IntegrationPattern.REQUEST_RESPONSE for most tasks. IntegrationPattern.RUN_JOB for the following exceptions: BatchSubmitJob, EmrAddStep, EmrCreateCluster, EmrTerminateCluster, and EmrContainersStartJobRun.

  • output_path (Optional[str]) – JSONPath expression to select a portion of the state output to pass to the next state. May also be the special value JsonPath.DISCARD, which will cause the effective output to be the empty object {}. Default: - The entire JSON node determined by the state input, the task result, and resultPath is passed to the next state (JSON path ‘$’)

  • result_path (Optional[str]) – JSONPath expression to indicate where to inject the state’s output. May also be the special value JsonPath.DISCARD, which will cause the state’s input to become its output. Default: - Replaces the entire input with the result (JSON path ‘$’)

  • result_selector (Optional[Mapping[str, Any]]) – The JSON that will replace the state’s raw result and become the effective result before ResultPath is applied. You can use ResultSelector to create a payload with values that are static or selected from the state’s raw result. Default: - None

  • timeout (Optional[Duration]) – Timeout for this task state. Default: - None

  • algorithm_specification (Union[AlgorithmSpecification, Dict[str, Any]]) – Identifies the training algorithm to use.

  • input_data_config (Sequence[Union[Channel, Dict[str, Any]]]) – Describes the various datasets (e.g. train, validation, test) and the Amazon S3 location where they are stored.

  • output_data_config (Union[OutputDataConfig, Dict[str, Any]]) – Identifies the Amazon S3 location where you want Amazon SageMaker to save the results of model training.

  • training_job_name (str) – Training Job Name.

  • enable_network_isolation (Optional[bool]) – Isolates the training container. No inbound or outbound network calls can be made to or from the training container. Default: false

  • environment (Optional[Mapping[str, str]]) – Environment variables to set in the Docker container. Default: - No environment variables

  • hyperparameters (Optional[Mapping[str, Any]]) – Algorithm-specific parameters that influence the quality of the model. Set hyperparameters before you start the learning process. For a list of hyperparameters provided by Amazon SageMaker, see https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html. Default: - No hyperparameters

  • resource_config (Union[ResourceConfig, Dict[str, Any], None]) – Specifies the resources, ML compute instances, and ML storage volumes to deploy for model training. Default: - 1 instance of EC2 M4.XLarge with 10GB volume

  • role (Optional[IRole]) – Role for the Training Job. The role must be granted all necessary permissions for the SageMaker training job to be able to operate. See https://docs.aws.amazon.com/fr_fr/sagemaker/latest/dg/sagemaker-roles.html#sagemaker-roles-createtrainingjob-perms Default: - a role will be created.

  • stopping_condition (Union[StoppingCondition, Dict[str, Any], None]) – Sets a time limit for training. Default: - max runtime of 1 hour

  • tags (Optional[Mapping[str, str]]) – Tags to be applied to the training job. Default: - No tags

  • vpc_config (Union[VpcConfig, Dict[str, Any], None]) – Specifies the VPC that you want your training job to connect to. Default: - No VPC

Example:

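# Imports assume AWS CDK v2 (aws-cdk-lib); in CDK v1, Duration and Size come from aws_cdk.core.
from aws_cdk import Duration, Size
from aws_cdk import aws_ec2 as ec2
from aws_cdk import aws_s3 as s3
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks

# Run inside a construct or stack, so that `self` is the scope for the new task.
tasks.SageMakerCreateTrainingJob(self, "TrainSagemaker",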
    training_job_name=sfn.JsonPath.string_at("$.JobName"),
    algorithm_specification=tasks.AlgorithmSpecification(
        algorithm_name="BlazingText",
        training_input_mode=tasks.InputMode.FILE
    ),
    input_data_config=[tasks.Channel(
        channel_name="train",
        data_source=tasks.DataSource(
            s3_data_source=tasks.S3DataSource(
                s3_data_type=tasks.S3DataType.S3_PREFIX,
                s3_location=tasks.S3Location.from_json_expression("$.S3Bucket")
            )
        )
    )],
    output_data_config=tasks.OutputDataConfig(
        s3_output_location=tasks.S3Location.from_bucket(s3.Bucket.from_bucket_name(self, "Bucket", "mybucket"), "myoutputpath")
    ),
    resource_config=tasks.ResourceConfig(
        instance_count=1,
        instance_type=ec2.InstanceType(sfn.JsonPath.string_at("$.InstanceType")),
        volume_size=Size.gibibytes(50)
    ),  # optional: default is 1 instance of EC2 `M4.XLarge` with `10GB` volume
    stopping_condition=tasks.StoppingCondition(
        max_runtime=Duration.hours(2)
    )
)
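
The example above reads three values from the state input through JSONPath: `$.JobName`, `$.S3Bucket`, and `$.InstanceType`. The following is a minimal sketch of a matching execution input, using only the standard library. The helper name `build_execution_input` and all concrete values are hypothetical. The name check assumes the SageMaker `TrainingJobName` constraint of 1 to 63 alphanumeric characters or hyphens, and the S3 value is assumed to be a full S3 URI.

```python
import json
import re

# Assumed TrainingJobName constraint: alphanumeric characters and hyphens,
# starting and ending with an alphanumeric character, at most 63 characters.
_JOB_NAME = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}$")


def build_execution_input(job_name, s3_uri, instance_type):
    """Return a dict with the keys the example task reads from the state input."""
    if len(job_name) > 63 or not _JOB_NAME.match(job_name):
        raise ValueError(f"invalid training job name: {job_name!r}")
    return {
        "JobName": job_name,            # read via $.JobName
        "S3Bucket": s3_uri,             # read via $.S3Bucket (assumed to be an S3 URI)
        "InstanceType": instance_type,  # read via $.InstanceType
    }


# Hypothetical values, for illustration only.
payload = build_execution_input(
    "blazingtext-2024-01-01", "s3://my-input-bucket/train/", "ml.m4.xlarge"
)
print(json.dumps(payload, indent=2))
```

The printed JSON could be passed as the input when starting an execution of a state machine that contains the task.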

Attributes

algorithm_specification

Identifies the training algorithm to use.

comment

An optional description for this state.

Default:
  • No comment

enable_network_isolation

Isolates the training container.

No inbound or outbound network calls can be made to or from the training container.

Default:
  • false

environment

Environment variables to set in the Docker container.

Default:
  • No environment variables

heartbeat

Timeout for the heartbeat.

Default:
  • None

hyperparameters

Algorithm-specific parameters that influence the quality of the model.

Set hyperparameters before you start the learning process. For a list of hyperparameters provided by Amazon SageMaker, see the link below.

Default:
  • No hyperparameters

See:

https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html
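
The SageMaker CreateTrainingJob API takes hyperparameters as a map of string to string, whereas this property accepts arbitrary values. Whether the construct converts values for you is not stated here, so the sketch below converts them explicitly before they are passed in. The helper name `to_string_hyperparameters` and the parameter names are hypothetical; check the algorithm's own documentation for the names and formats it expects.

```python
import json


def to_string_hyperparameters(params):
    """Convert every value to a string; strings pass through unchanged."""
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in params.items()
    }


# Illustrative values only.
hyperparameters = to_string_hyperparameters(
    {"mode": "supervised", "epochs": 10, "learning_rate": 0.05}
)
print(hyperparameters)
```

The resulting dictionary can be supplied as the `hyperparameters` argument.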

input_data_config

Describes the various datasets (e.g. train, validation, test) and the Amazon S3 location where they are stored.

input_path

JSONPath expression to select part of the state to be the input to this state.

May also be the special value JsonPath.DISCARD, which will cause the effective input to be the empty object {}.

Default:
  • The entire task input (JSON path ‘$’)

integration_pattern

AWS Step Functions integrates with services directly in the Amazon States Language.

You can control these AWS services using service integration patterns.

Default:
  • IntegrationPattern.REQUEST_RESPONSE for most tasks. IntegrationPattern.RUN_JOB for the following exceptions: BatchSubmitJob, EmrAddStep, EmrCreateCluster, EmrTerminateCluster, and EmrContainersStartJobRun.

See:

https://docs.aws.amazon.com/step-functions/latest/dg/connect-to-resource.html#connect-wait-token
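
In the Amazon States Language, the integration pattern shows up as a suffix on the task's `Resource` ARN. The sketch below illustrates that mapping in plain Python. The function name `task_resource_arn` is hypothetical, and which patterns a particular task accepts is not covered here.

```python
# Suffix appended to the Resource ARN for each service integration pattern.
_PATTERN_SUFFIX = {
    "REQUEST_RESPONSE": "",                       # call the API and continue
    "RUN_JOB": ".sync",                           # wait for the job to finish
    "WAIT_FOR_TASK_TOKEN": ".waitForTaskToken",   # wait for a callback
}


def task_resource_arn(service, api, pattern, partition="aws"):
    """Build the Resource ARN of an optimized service integration."""
    return f"arn:{partition}:states:::{service}:{api}{_PATTERN_SUFFIX[pattern]}"


request_response_arn = task_resource_arn("sagemaker", "createTrainingJob", "REQUEST_RESPONSE")
run_job_arn = task_resource_arn("sagemaker", "createTrainingJob", "RUN_JOB")
print(request_response_arn)
print(run_job_arn)
```

With `RUN_JOB`, the state completes only when the training job itself completes, which is usually what a pipeline that consumes the trained model needs.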

output_data_config

Identifies the Amazon S3 location where you want Amazon SageMaker to save the results of model training.

output_path

JSONPath expression to select a portion of the state output to pass to the next state.

May also be the special value JsonPath.DISCARD, which will cause the effective output to be the empty object {}.

Default:
  • The entire JSON node determined by the state input, the task result, and resultPath is passed to the next state (JSON path ‘$’)

resource_config

Specifies the resources, ML compute instances, and ML storage volumes to deploy for model training.

Default:
  • 1 instance of EC2 M4.XLarge with 10GB volume

result_path

JSONPath expression to indicate where to inject the state’s output.

May also be the special value JsonPath.DISCARD, which will cause the state’s input to become its output.

Default:
  • Replaces the entire input with the result (JSON path ‘$’)
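
How input_path, result_path, and output_path interact can be sketched with a simplified model in plain Python. This is an illustration under stated assumptions, not the Step Functions implementation: it supports only `$` and dotted paths such as `$.a.b`, represents JsonPath.DISCARD as `None`, and uses a hypothetical `fake_training_task` with a made-up ARN in place of the real service call.

```python
import copy

DISCARD = None  # stand-in for JsonPath.DISCARD in this sketch


def _keys(path):
    """Split '$' or '$.a.b' into a list of keys."""
    if path == "$":
        return []
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path!r}")
    return path[2:].split(".")


def _select(doc, path):
    """InputPath / OutputPath: pick a node, or {} when discarded."""
    if path is DISCARD:
        return {}
    node = doc
    for key in _keys(path):
        node = node[key]
    return node


def _inject(doc, path, value):
    """ResultPath: place the result into a copy of the state input."""
    if path is DISCARD:
        return doc              # the result is dropped, the input passes through
    keys = _keys(path)
    if not keys:
        return value            # '$' replaces the entire input
    out = copy.deepcopy(doc)
    node = out
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return out


def run_state(state_input, task, input_path="$", result_path="$", output_path="$"):
    effective_input = _select(state_input, input_path)
    result = task(effective_input)
    combined = _inject(state_input, result_path, result)
    return _select(combined, output_path)


def fake_training_task(effective_input):
    # Hypothetical result; the account ID and region are placeholders.
    return {"TrainingJobArn": "arn:aws:sagemaker:us-east-1:123456789012:training-job/"
            + effective_input["JobName"]}


state_input = {"JobName": "job-1", "meta": {"owner": "ml-team"}}
default_output = run_state(state_input, fake_training_task)
merged_output = run_state(state_input, fake_training_task, result_path="$.training")
passthrough = run_state(state_input, fake_training_task, result_path=DISCARD)
print(default_output)
print(merged_output)
print(passthrough)
```

With the defaults, the result replaces the input entirely. Setting result_path to a sub-path keeps the original input and adds the result beside it, which is often the more convenient choice when later states still need fields such as the job name.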

result_selector

The JSON that will replace the state’s raw result and become the effective result before ResultPath is applied.

You can use ResultSelector to create a payload with values that are static or selected from the state’s raw result.

Default:
  • None

See:

https://docs.aws.amazon.com/step-functions/latest/dg/input-output-inputpath-params.html#input-output-resultselector
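
In a ResultSelector, keys that end in `.$` take a JSONPath that is resolved against the raw result, and all other keys are copied as static values. The sketch below models that rule in plain Python for simple dotted paths only. The function names and the shape of `raw_result` are hypothetical.

```python
def _lookup(doc, path):
    """Resolve '$' or a dotted path such as '$.a.b' against doc."""
    node = doc
    if path != "$":
        for key in path[2:].split("."):
            node = node[key]
    return node


def apply_result_selector(selector, raw_result):
    """Build the effective result from a selector and the raw task result."""
    effective = {}
    for key, value in selector.items():
        if key.endswith(".$"):
            effective[key[:-2]] = _lookup(raw_result, value)    # selected value
        elif isinstance(value, dict):
            effective[key] = apply_result_selector(value, raw_result)
        else:
            effective[key] = value                              # static value
    return effective


# Hypothetical raw result, for illustration only.
raw_result = {
    "TrainingJobArn": "arn:aws:sagemaker:us-east-1:123456789012:training-job/job-1",
    "SdkHttpMetadata": {"HttpStatusCode": 200},
}
selector = {
    "source": "train-step",
    "arn.$": "$.TrainingJobArn",
    "http": {"status.$": "$.SdkHttpMetadata.HttpStatusCode"},
}
effective_result = apply_result_selector(selector, raw_result)
print(effective_result)
```

A dictionary shaped like `selector` is what the `result_selector` argument expects.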

role

Role for the Training Job.

The role must be granted all necessary permissions for the SageMaker training job to be able to operate.

See https://docs.aws.amazon.com/fr_fr/sagemaker/latest/dg/sagemaker-roles.html#sagemaker-roles-createtrainingjob-perms

Default:
  • a role will be created.

stopping_condition

Sets a time limit for training.

Default:
  • max runtime of 1 hour
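
The documented defaults for resource_config and stopping_condition can be written out in the shape of the CreateTrainingJob request fields `ResourceConfig` and `StoppingCondition`. The sketch below is plain Python with a hypothetical helper, `render_limits`. Mapping "EC2 M4.XLarge" to the SageMaker instance type name `ml.m4.xlarge` is an assumption.

```python
from datetime import timedelta


def render_limits(instance_count=1, instance_type="ml.m4.xlarge",
                  volume_gib=10, max_runtime=timedelta(hours=1)):
    """Express resource and runtime limits as CreateTrainingJob request fields."""
    return {
        "ResourceConfig": {
            "InstanceCount": instance_count,
            "InstanceType": instance_type,    # assumed 'ml.'-prefixed name
            "VolumeSizeInGB": volume_gib,
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": int(max_runtime.total_seconds()),
        },
    }


defaults = render_limits()
# Values matching the example earlier on this page: 50 GiB volume, 2 hour limit.
example_limits = render_limits(volume_gib=50, max_runtime=timedelta(hours=2))
print(defaults)
print(example_limits)
```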

tags

Tags to be applied to the training job.

Default:
  • No tags
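
The SageMaker API represents tags as a list of Key/Value objects, while this property takes a plain mapping. Below is a minimal sketch of that conversion in plain Python. How the construct performs it internally is not stated here, and `to_tag_list` and the tag values are hypothetical.

```python
def to_tag_list(tags):
    """Convert {'name': 'value'} into [{'Key': 'name', 'Value': 'value'}, ...]."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]


# Illustrative tags only.
tag_list = to_tag_list({"project": "text-classification", "stage": "dev"})
print(tag_list)
```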

timeout

Timeout for this task state.

Default:
  • None

training_job_name

Training Job Name.

vpc_config

Specifies the VPC that you want your training job to connect to.

Default:
  • No VPC