ProcessingS3Input - Amazon SageMaker

ProcessingS3Input

Configuration for downloading input data from Amazon S3 into the processing container.

Contents

S3DataType

Whether you use an S3Prefix or a ManifestFile for the data type. If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for the processing job. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for the processing job.

Type: String

Valid Values: ManifestFile | S3Prefix

Required: Yes
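
The two data types can be illustrated with a minimal sketch. The bucket and key names below are hypothetical, the dictionaries show only the S3DataType and S3Uri fields, and check_data_type is an illustrative helper, not part of any AWS SDK.

```python
# Valid values as listed for S3DataType.
VALID_S3_DATA_TYPES = {"ManifestFile", "S3Prefix"}

# S3Prefix: S3Uri is a key name prefix; every object under it is used.
# (Hypothetical bucket and prefix.)
prefix_input = {
    "S3DataType": "S3Prefix",
    "S3Uri": "s3://example-bucket/raw/2024/",
}

# ManifestFile: S3Uri points at a single manifest object that lists the keys.
# (Hypothetical bucket and key.)
manifest_input = {
    "S3DataType": "ManifestFile",
    "S3Uri": "s3://example-bucket/manifests/input.manifest",
}


def check_data_type(s3_input: dict) -> str:
    """Return the S3DataType value, or raise if it is missing or invalid."""
    value = s3_input.get("S3DataType")
    if value not in VALID_S3_DATA_TYPES:
        raise ValueError(f"S3DataType must be one of {sorted(VALID_S3_DATA_TYPES)}")
    return value
```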

S3Uri

The URI of the Amazon S3 prefix from which Amazon SageMaker downloads the data required to run a processing job.

Type: String

Length Constraints: Maximum length of 1024.

Pattern: ^(https|s3)://([^/]+)/?(.*)$

Required: Yes
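
As a rough client-side check, the length constraint and pattern above can be applied before submitting a request. This is a sketch only: parse_s3_uri is a hypothetical helper, and the example URIs are made up. Note that the second capture group is a bucket name for s3:// URIs but a host name for https:// URIs.

```python
import re

# Pattern and length limit as documented for S3Uri.
S3_URI_PATTERN = re.compile(r"^(https|s3)://([^/]+)/?(.*)$")
MAX_S3_URI_LENGTH = 1024


def parse_s3_uri(uri: str) -> tuple:
    """Split a URI into (scheme, bucket_or_host, remainder) or raise ValueError."""
    if len(uri) > MAX_S3_URI_LENGTH:
        raise ValueError(f"S3Uri is longer than {MAX_S3_URI_LENGTH} characters")
    match = S3_URI_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"S3Uri does not match the documented pattern: {uri!r}")
    return match.groups()
```

For example, parse_s3_uri("s3://example-bucket/raw/data.csv") splits the URI into the scheme s3, the bucket example-bucket, and the key raw/data.csv.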

LocalPath

The local path in your container where you want Amazon SageMaker to write the input data. LocalPath is an absolute path to the input data and must begin with /opt/ml/processing/. LocalPath is required when AppManaged is False (the default).

Type: String

Length Constraints: Maximum length of 256.

Pattern: .*

Required: No
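
The LocalPath rules above can be expressed as a small pre-flight check. This is a sketch under stated assumptions: check_local_path is a hypothetical helper, and the app_managed argument stands in for the AppManaged setting mentioned in the description.

```python
REQUIRED_PREFIX = "/opt/ml/processing/"
MAX_LOCAL_PATH_LENGTH = 256


def check_local_path(local_path, app_managed=False):
    """Validate LocalPath against the documented rules; return it unchanged."""
    if local_path is None:
        # LocalPath may be omitted only when AppManaged is True.
        if not app_managed:
            raise ValueError("LocalPath is required when AppManaged is False")
        return None
    if len(local_path) > MAX_LOCAL_PATH_LENGTH:
        raise ValueError(f"LocalPath is longer than {MAX_LOCAL_PATH_LENGTH} characters")
    if not local_path.startswith(REQUIRED_PREFIX):
        raise ValueError(f"LocalPath must begin with {REQUIRED_PREFIX}")
    return local_path
```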

S3CompressionType

Whether to GZIP-decompress the data in Amazon S3 as it is streamed into the processing container. Gzip can only be used when Pipe mode is specified as the S3InputMode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your container without using the EBS volume.

Type: String

Valid Values: None | Gzip

Required: No
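
Because Gzip is only valid together with Pipe mode, a combined check on the two fields can catch a mismatch early. The helper below is a hypothetical sketch; treating an omitted S3CompressionType as None is an assumption of this example, not something stated above.

```python
VALID_COMPRESSION_TYPES = {"None", "Gzip"}


def check_compression(s3_input: dict) -> None:
    """Raise ValueError if S3CompressionType conflicts with S3InputMode."""
    # Assumption: an omitted S3CompressionType behaves like "None".
    compression = s3_input.get("S3CompressionType", "None")
    mode = s3_input.get("S3InputMode")
    if compression not in VALID_COMPRESSION_TYPES:
        raise ValueError(f"S3CompressionType must be one of {sorted(VALID_COMPRESSION_TYPES)}")
    if compression == "Gzip" and mode != "Pipe":
        raise ValueError("Gzip requires S3InputMode to be specified as Pipe")
```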

S3DataDistributionType

Whether to replicate the data from Amazon S3 on every processing instance (FullyReplicated) or to shard the data by Amazon S3 key, downloading one shard of the data to each processing instance (ShardedByS3Key).

Type: String

Valid Values: FullyReplicated | ShardedByS3Key

Required: No
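
The difference between the two values can be pictured with a toy simulation. This is an illustration only: the round-robin assignment over sorted keys is an assumption made for the example, and the description above does not specify how Amazon SageMaker actually assigns keys to instances. The function name and object keys are hypothetical.

```python
def distribute(keys, instance_count, distribution_type):
    """Return one list of object keys per instance (illustrative only)."""
    if distribution_type == "FullyReplicated":
        # Every instance receives the complete set of objects.
        return [list(keys) for _ in range(instance_count)]
    if distribution_type == "ShardedByS3Key":
        # Assumed scheme: round-robin over the sorted keys, one shard per instance.
        ordered = sorted(keys)
        return [ordered[i::instance_count] for i in range(instance_count)]
    raise ValueError("expected FullyReplicated or ShardedByS3Key")


keys = ["raw/a.csv", "raw/b.csv", "raw/c.csv", "raw/d.csv", "raw/e.csv"]
replicated = distribute(keys, 2, "FullyReplicated")
sharded = distribute(keys, 2, "ShardedByS3Key")
```

With five objects and two instances, FullyReplicated gives each instance all five objects, while the sharded variant in this sketch gives the instances three and two objects respectively.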

S3InputMode

Whether to use File or Pipe input mode. In File mode, Amazon SageMaker copies the data from the input source onto the local ML storage volume before starting your processing container. This is the most commonly used input mode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your processing container into named pipes without using the ML storage volume.

Type: String

Valid Values: Pipe | File

Required: No
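
Putting the fields together, a ProcessingS3Input structure for File mode might be assembled as in the sketch below. The builder function, bucket name, and paths are hypothetical; the resulting dictionary is intended as the kind of value that would be supplied as the S3 input of a processing job request, and optional fields are simply left out when not set.

```python
import json

VALID_INPUT_MODES = {"Pipe", "File"}


def build_processing_s3_input(s3_uri, s3_data_type, local_path=None,
                              input_mode=None, distribution=None,
                              compression=None):
    """Assemble a ProcessingS3Input-shaped dict, omitting unset optional fields."""
    if input_mode is not None and input_mode not in VALID_INPUT_MODES:
        raise ValueError(f"S3InputMode must be one of {sorted(VALID_INPUT_MODES)}")
    s3_input = {"S3Uri": s3_uri, "S3DataType": s3_data_type}
    optional = {
        "LocalPath": local_path,
        "S3InputMode": input_mode,
        "S3DataDistributionType": distribution,
        "S3CompressionType": compression,
    }
    s3_input.update({k: v for k, v in optional.items() if v is not None})
    return s3_input


# File mode: data is copied under LocalPath before the container starts.
file_mode_input = build_processing_s3_input(
    "s3://example-bucket/raw/",
    "S3Prefix",
    local_path="/opt/ml/processing/input",
    input_mode="File",
)
print(json.dumps(file_mode_input, indent=2))
```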

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: