S3DataSource

class aws_cdk.aws_stepfunctions_tasks.S3DataSource(*, s3_location, attribute_names=None, s3_data_distribution_type=None, s3_data_type=None)

Bases: object

S3 location of the channel data.

Parameters:
  • s3_location (S3Location) – S3 URI of the channel data.

  • attribute_names (Optional[Sequence[str]]) – One or more attribute names to use from the specified augmented manifest file. Default: - No attribute names

  • s3_data_distribution_type (Optional[S3DataDistributionType]) – S3 Data Distribution Type. Default: - None

  • s3_data_type (Optional[S3DataType]) – S3 Data Type. Default: S3_PREFIX

See:

https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_S3DataSource.html
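
As a rough illustration of how the parameters above line up with the SageMaker API structure linked here, the following plain-Python sketch assembles an S3DataSource dictionary by hand. The helper build_s3_data_source is hypothetical and not part of the CDK. The key names (S3DataType, S3Uri, S3DataDistributionType, AttributeNames) follow the SageMaker API reference, and the output the construct actually renders may differ.

```python
from typing import Any, Dict, Optional, Sequence


def build_s3_data_source(
    s3_uri: str,
    s3_data_type: str = "S3Prefix",
    s3_data_distribution_type: Optional[str] = None,
    attribute_names: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a SageMaker-style S3DataSource dict."""
    # The data type and the URI are always present.
    source: Dict[str, Any] = {"S3DataType": s3_data_type, "S3Uri": s3_uri}
    # Optional fields are only emitted when a value was supplied.
    if s3_data_distribution_type is not None:
        source["S3DataDistributionType"] = s3_data_distribution_type
    if attribute_names:
        source["AttributeNames"] = list(attribute_names)
    return source


# Only the location is given, so the data type falls back to S3Prefix.
minimal = build_s3_data_source("s3://mybucket/train/")
```

In this sketch, omitted optional parameters simply do not appear in the result, mirroring the "No attribute names" and "None" defaults listed above.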

Example:

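# Imports assume AWS CDK v2 (aws-cdk-lib).
from aws_cdk import (
    Duration,
    Size,
    aws_ec2 as ec2,
    aws_s3 as s3,
    aws_stepfunctions as sfn,
    aws_stepfunctions_tasks as tasks,
)

tasks.SageMakerCreateTrainingJob(self, "TrainSagemaker",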
    training_job_name=sfn.JsonPath.string_at("$.JobName"),
    algorithm_specification=tasks.AlgorithmSpecification(
        algorithm_name="BlazingText",
        training_input_mode=tasks.InputMode.FILE
    ),
    input_data_config=[tasks.Channel(
        channel_name="train",
        data_source=tasks.DataSource(
            s3_data_source=tasks.S3DataSource(
                s3_data_type=tasks.S3DataType.S3_PREFIX,
                s3_location=tasks.S3Location.from_json_expression("$.S3Bucket")
            )
        )
    )],
    output_data_config=tasks.OutputDataConfig(
        s3_output_location=tasks.S3Location.from_bucket(s3.Bucket.from_bucket_name(self, "Bucket", "mybucket"), "myoutputpath")
    ),
    resource_config=tasks.ResourceConfig(
        instance_count=1,
        instance_type=ec2.InstanceType(sfn.JsonPath.string_at("$.InstanceType")),
        volume_size=Size.gibibytes(50)
    ),  # optional: default is 1 instance of EC2 `M4.XLarge` with `10GB` volume
    stopping_condition=tasks.StoppingCondition(
        max_runtime=Duration.hours(2)
    )
)

Attributes

attribute_names

One or more attribute names to use from the specified augmented manifest file.

Default:
  • No attribute names
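
To make the role of attribute_names more concrete, here is a minimal sketch that assumes an augmented manifest is a JSON Lines file, with one JSON object per line. The manifest content, the attribute names "source-ref" and "label", and the helper select_attributes are all made up for illustration. The sketch only shows the idea of picking named attributes out of each line, not how SageMaker itself reads the file.

```python
import json
from typing import Dict, List, Sequence

# Made-up augmented manifest content: one JSON object per line.
MANIFEST_TEXT = """\
{"source-ref": "s3://mybucket/img/1.jpg", "label": 0, "label-metadata": {"confidence": 0.9}}
{"source-ref": "s3://mybucket/img/2.jpg", "label": 1, "label-metadata": {"confidence": 0.8}}
"""


def select_attributes(
    manifest_text: str, attribute_names: Sequence[str]
) -> List[Dict[str, object]]:
    """Hypothetical helper: keep only the named attributes of each line."""
    records = []
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        entry = json.loads(line)
        records.append({name: entry[name] for name in attribute_names})
    return records


# Attributes not named here (such as "label-metadata") are left out.
selected = select_attributes(MANIFEST_TEXT, ["source-ref", "label"])
```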

s3_data_distribution_type

S3 Data Distribution Type.

Default:
  • None

s3_data_type

S3 Data Type.

Default:
  • S3_PREFIX
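
With the S3_PREFIX default, the location is treated as a key prefix. When a manifest file is used instead, the location points at a file that lists the objects. The sketch below assumes the manifest layout described in the SageMaker API reference: a JSON array whose first element carries a "prefix" and whose remaining elements are keys relative to it. The bucket, file names, and the helper resolve_manifest are hypothetical.

```python
import json
from typing import List

# Made-up manifest: a prefix entry followed by keys relative to that prefix.
MANIFEST_JSON = """
[
  {"prefix": "s3://mybucket/train/"},
  "part-0001.csv",
  "part-0002.csv"
]
"""


def resolve_manifest(manifest_json: str) -> List[str]:
    """Hypothetical helper: expand a manifest into full S3 URIs."""
    entries = json.loads(manifest_json)
    prefix = entries[0]["prefix"]
    # Every remaining entry is joined onto the shared prefix.
    return [prefix + key for key in entries[1:]]


uris = resolve_manifest(MANIFEST_JSON)
```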

s3_location

S3 URI of the channel data.