interface DatasetDefinitionProperty
Language | Type name |
---|---|
.NET | Amazon.CDK.AWS.Sagemaker.CfnProcessingJob.DatasetDefinitionProperty |
Go | github.com/aws/aws-cdk-go/awscdk/v2/awssagemaker#CfnProcessingJob_DatasetDefinitionProperty |
Java | software.amazon.awscdk.services.sagemaker.CfnProcessingJob.DatasetDefinitionProperty |
Python | aws_cdk.aws_sagemaker.CfnProcessingJob.DatasetDefinitionProperty |
TypeScript | aws-cdk-lib » aws_sagemaker » CfnProcessingJob » DatasetDefinitionProperty |
Configuration for Dataset Definition inputs.
The Dataset Definition input must specify exactly one of the AthenaDatasetDefinition or RedshiftDatasetDefinition types.
Example
// The code below shows an example of how to instantiate this type.
// The values are placeholders you should change.
import { aws_sagemaker as sagemaker } from 'aws-cdk-lib';
const datasetDefinitionProperty: sagemaker.CfnProcessingJob.DatasetDefinitionProperty = {
  athenaDatasetDefinition: {
    catalog: 'catalog',
    database: 'database',
    outputFormat: 'outputFormat',
    outputS3Uri: 'outputS3Uri',
    queryString: 'queryString',

    // the properties below are optional
    kmsKeyId: 'kmsKeyId',
    outputCompression: 'outputCompression',
    workGroup: 'workGroup',
  },
  dataDistributionType: 'dataDistributionType',
  inputMode: 'inputMode',
  localPath: 'localPath',
  redshiftDatasetDefinition: {
    clusterId: 'clusterId',
    clusterRoleArn: 'clusterRoleArn',
    database: 'database',
    dbUser: 'dbUser',
    outputFormat: 'outputFormat',
    outputS3Uri: 'outputS3Uri',
    queryString: 'queryString',

    // the properties below are optional
    kmsKeyId: 'kmsKeyId',
    outputCompression: 'outputCompression',
  },
};
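The generated example above fills in both athenaDatasetDefinition and redshiftDatasetDefinition to show every field, while the description asks for exactly one of them. A minimal sketch of an Athena-only definition with a small guard could look like the following. DatasetDefinitionLike and hasExactlyOneSource are hypothetical names used for illustration and are not part of aws-cdk-lib; in a CDK app the object would be typed as sagemaker.CfnProcessingJob.DatasetDefinitionProperty.

```typescript
// Local stand-in for the shape documented on this page (hypothetical type).
// In a CDK app, use sagemaker.CfnProcessingJob.DatasetDefinitionProperty instead.
interface DatasetDefinitionLike {
  athenaDatasetDefinition?: Record<string, string>;
  redshiftDatasetDefinition?: Record<string, string>;
  dataDistributionType?: string;
  inputMode?: string;
  localPath?: string;
}

// Returns true only when exactly one of the two source definitions is present.
function hasExactlyOneSource(def: DatasetDefinitionLike): boolean {
  const sources = [def.athenaDatasetDefinition, def.redshiftDatasetDefinition];
  return sources.filter((source) => source !== undefined).length === 1;
}

// Athena-only definition; every value is a placeholder you should change.
const athenaOnly: DatasetDefinitionLike = {
  athenaDatasetDefinition: {
    catalog: 'catalog',
    database: 'database',
    outputFormat: 'outputFormat',
    outputS3Uri: 'outputS3Uri',
    queryString: 'queryString',
  },
  inputMode: 'File',
};

if (!hasExactlyOneSource(athenaOnly)) {
  throw new Error('Specify exactly one of athenaDatasetDefinition or redshiftDatasetDefinition');
}
```

Running a guard like this before passing the object to the construct surfaces the mistake during synthesis rather than at deployment.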
Properties
Name | Type | Description |
---|---|---|
athenaDatasetDefinition? | IResolvable \| AthenaDatasetDefinitionProperty | Configuration for Athena Dataset Definition input. |
dataDistributionType? | string | Whether the generated dataset is FullyReplicated or ShardedByS3Key (default). |
inputMode? | string | Whether to use File or Pipe input mode. |
localPath? | string | The local path where you want Amazon SageMaker to download the Dataset Definition inputs to run a processing job. |
redshiftDatasetDefinition? | IResolvable \| RedshiftDatasetDefinitionProperty | Configuration for Redshift Dataset Definition input. |
athenaDatasetDefinition?
Type: IResolvable | AthenaDatasetDefinitionProperty (optional)
Configuration for Athena Dataset Definition input.
dataDistributionType?
Type: string (optional)
Whether the generated dataset is FullyReplicated or ShardedByS3Key (default).
inputMode?
Type: string (optional)
Whether to use File or Pipe input mode.
In File (default) mode, Amazon SageMaker copies the data from the input source onto the local Amazon Elastic Block Store (Amazon EBS) volumes before starting your training algorithm. This is the most commonly used input mode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your algorithm without using the EBS volume.
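Both dataDistributionType and inputMode are typed as plain strings, so a misspelled value would not be caught by the compiler. One possible approach, sketched here under the assumption that the values named on this page are the only ones accepted, is to narrow them with literal union types. The type and function names are hypothetical and not part of aws-cdk-lib.

```typescript
// Values taken from the descriptions on this page; assumed to be exhaustive.
const DATA_DISTRIBUTION_TYPES = ['FullyReplicated', 'ShardedByS3Key'] as const;
const INPUT_MODES = ['File', 'Pipe'] as const;

// Hypothetical helper types, not part of aws-cdk-lib.
type DataDistributionType = (typeof DATA_DISTRIBUTION_TYPES)[number];
type InputMode = (typeof INPUT_MODES)[number];

// Type guards for values that arrive as untyped strings, for example from a config file.
function isDataDistributionType(value: string): value is DataDistributionType {
  return (DATA_DISTRIBUTION_TYPES as readonly string[]).indexOf(value) !== -1;
}

function isInputMode(value: string): value is InputMode {
  return (INPUT_MODES as readonly string[]).indexOf(value) !== -1;
}

// Narrowed values remain assignable to the string-typed properties.
const chosenInputMode: InputMode = 'Pipe';
const chosenDistribution: DataDistributionType = 'ShardedByS3Key';
```

Because the literal types are subtypes of string, chosenInputMode and chosenDistribution can be assigned directly to inputMode and dataDistributionType.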
localPath?
Type: string (optional)
The local path where you want Amazon SageMaker to download the Dataset Definition inputs to run a processing job.
LocalPath is an absolute path to the input data. This is a required parameter when AppManaged is False (default).
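The two rules for localPath can be checked before synthesis with a small helper. This is only a sketch: checkLocalPath is a hypothetical function, its appManaged parameter mirrors the AppManaged flag mentioned above, and it assumes POSIX-style container paths, so "absolute" is taken to mean a leading "/".

```typescript
// Returns a description of the problem, or undefined when localPath looks valid.
// Assumption: paths are POSIX-style, so an absolute path starts with "/".
function checkLocalPath(
  localPath: string | undefined,
  appManaged: boolean = false,
): string | undefined {
  if (localPath === undefined || localPath === '') {
    // localPath may be omitted only when the input is app-managed.
    return appManaged ? undefined : 'localPath is required when AppManaged is false';
  }
  if (localPath.charAt(0) !== '/') {
    return 'localPath must be an absolute path';
  }
  return undefined;
}

// Placeholder path for illustration only.
const problem = checkLocalPath('/placeholder/input/path');
if (problem !== undefined) {
  throw new Error(problem);
}
```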
redshiftDatasetDefinition?
Type: IResolvable | RedshiftDatasetDefinitionProperty (optional)
Configuration for Redshift Dataset Definition input.