Class: Aws::SageMaker::Types::DatasetDefinition

Inherits:
Struct
  • Object
Defined in:
gems/aws-sdk-sagemaker/lib/aws-sdk-sagemaker/types.rb

Overview

Note:

When making an API call, you may pass DatasetDefinition data as a hash:

{
  athena_dataset_definition: {
    catalog: "AthenaCatalog", # required
    database: "AthenaDatabase", # required
    query_string: "AthenaQueryString", # required
    work_group: "AthenaWorkGroup",
    output_s3_uri: "S3Uri", # required
    kms_key_id: "KmsKeyId",
    output_format: "PARQUET", # required, accepts PARQUET, ORC, AVRO, JSON, TEXTFILE
    output_compression: "GZIP", # accepts GZIP, SNAPPY, ZLIB
  },
  redshift_dataset_definition: {
    cluster_id: "RedshiftClusterId", # required
    database: "RedshiftDatabase", # required
    db_user: "RedshiftUserName", # required
    query_string: "RedshiftQueryString", # required
    cluster_role_arn: "RoleArn", # required
    output_s3_uri: "S3Uri", # required
    kms_key_id: "KmsKeyId",
    output_format: "PARQUET", # required, accepts PARQUET, CSV
    output_compression: "None", # accepts None, GZIP, BZIP2, ZSTD, SNAPPY
  },
  local_path: "ProcessingLocalPath",
  data_distribution_type: "FullyReplicated", # accepts FullyReplicated, ShardedByS3Key
  input_mode: "Pipe", # accepts Pipe, File
}

Configuration for Dataset Definition inputs. A Dataset Definition input must specify exactly one of AthenaDatasetDefinition or RedshiftDatasetDefinition.
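The exactly-one-of rule can be checked on the client before a request is built. The following is a minimal sketch in plain Ruby with no SDK dependency. `validate_dataset_definition!` and `SOURCE_KEYS` are hypothetical names, not part of aws-sdk-sagemaker, and the catalog, database, query, bucket, and path values are placeholders.

```ruby
# Hypothetical helper (not part of aws-sdk-sagemaker): checks that a
# DatasetDefinition hash names exactly one of the two source types.
SOURCE_KEYS = %i[athena_dataset_definition redshift_dataset_definition].freeze

def validate_dataset_definition!(definition)
  present = SOURCE_KEYS.select { |key| definition[key] }
  return definition if present.size == 1

  raise ArgumentError,
        "expected exactly one of #{SOURCE_KEYS.join(', ')}, got #{present.size}"
end

# Placeholder values for illustration only.
athena_input = {
  athena_dataset_definition: {
    catalog: "AwsDataCatalog",
    database: "example_db",
    query_string: "SELECT * FROM example_table",
    output_s3_uri: "s3://example-bucket/athena-output/",
    output_format: "PARQUET"
  },
  local_path: "/opt/ml/processing/input"
}

validate_dataset_definition!(athena_input) # returns the hash unchanged
```

A hash that sets both source keys, or neither, raises ArgumentError in this sketch.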

Constant Summary

SENSITIVE = []

Instance Attribute Summary

Instance Attribute Details

#athena_dataset_definition ⇒ Types::AthenaDatasetDefinition

Configuration for Athena Dataset Definition input.

Returns:

  • (Types::AthenaDatasetDefinition)



# File 'gems/aws-sdk-sagemaker/lib/aws-sdk-sagemaker/types.rb', line 10426

class DatasetDefinition < Struct.new(
  :athena_dataset_definition,
  :redshift_dataset_definition,
  :local_path,
  :data_distribution_type,
  :input_mode)
  SENSITIVE = []
  include Aws::Structure
end
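
The source above defines the type as a Struct with five members. The following sketch shows the accessors that a Struct of this shape provides. `DatasetDefinitionSketch` is a hypothetical stand-in, not the SDK class: it leaves out `Aws::Structure`, and `keyword_init: true` is an assumption used here to approximate hash-style construction. The path is a placeholder.

```ruby
# Stand-in for illustration only; the real class also includes Aws::Structure.
DatasetDefinitionSketch = Struct.new(
  :athena_dataset_definition,
  :redshift_dataset_definition,
  :local_path,
  :data_distribution_type,
  :input_mode,
  keyword_init: true
)

definition = DatasetDefinitionSketch.new(
  local_path: "/opt/ml/processing/input", # placeholder path
  input_mode: "File"
)

definition.input_mode                 # reader generated by Struct
definition[:local_path]               # members are also reachable by name
definition.athena_dataset_definition  # unset members are nil
definition.to_h.compact               # only the members that were set
```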

#data_distribution_type ⇒ String

Whether the generated dataset is FullyReplicated or ShardedByS3Key (default).

Returns:

  • (String)



#input_mode ⇒ String

Whether to use File or Pipe input mode. In File (default) mode, Amazon SageMaker copies the data from the input source onto the local Amazon Elastic Block Store (Amazon EBS) volumes before starting your training algorithm. This is the most commonly used input mode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your algorithm without using the EBS volume.

Returns:

  • (String)
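
Both `input_mode` and `data_distribution_type` are plain strings restricted to the values listed in the hash example in the Overview. The following sketch checks them before a request is sent. `enum_errors` and the two constants are hypothetical names, not SDK API.

```ruby
# Accepted values as listed in the hash example in the Overview.
INPUT_MODES = %w[Pipe File].freeze
DATA_DISTRIBUTION_TYPES = %w[FullyReplicated ShardedByS3Key].freeze

# Hypothetical helper: returns a list of problems found in the two enum fields.
# A missing (nil) value is allowed because neither field is marked required.
def enum_errors(definition)
  checks = {
    input_mode: INPUT_MODES,
    data_distribution_type: DATA_DISTRIBUTION_TYPES
  }
  checks.map do |key, allowed|
    value = definition[key]
    next if value.nil? || allowed.include?(value)

    "#{key} must be one of #{allowed.join(', ')} (got #{value.inspect})"
  end.compact
end

enum_errors(input_mode: "File", data_distribution_type: "ShardedByS3Key") # => []
enum_errors(input_mode: "file") # one error: this sketch compares case-sensitively
```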



#local_path ⇒ String

The local path where you want Amazon SageMaker to download the Dataset Definition inputs to run a processing job. LocalPath is an absolute path to the input data. This is a required parameter when AppManaged is False (default).

Returns:

  • (String)
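
Because `local_path` must be an absolute path, a quick local check can catch a relative path early. The following sketch uses `Pathname` from the Ruby standard library. `absolute_local_path?` is a hypothetical helper and the paths are placeholders.

```ruby
require "pathname"

# Hypothetical helper: true when the given local_path is an absolute path.
def absolute_local_path?(local_path)
  !local_path.nil? && Pathname.new(local_path).absolute?
end

absolute_local_path?("/opt/ml/processing/input") # => true
absolute_local_path?("processing/input")         # => false
```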



#redshift_dataset_definition ⇒ Types::RedshiftDatasetDefinition

Configuration for Redshift Dataset Definition input.

Returns:

  • (Types::RedshiftDatasetDefinition)


