Class: Aws::Comprehend::Types::InputDataConfig

Inherits:
Struct
  • Object
Defined in:
gems/aws-sdk-comprehend/lib/aws-sdk-comprehend/types.rb

Overview

Note:

When making an API call, you may pass InputDataConfig data as a hash:

{
  s3_uri: "S3Uri", # required
  input_format: "ONE_DOC_PER_FILE", # accepts ONE_DOC_PER_FILE, ONE_DOC_PER_LINE
  document_reader_config: {
    document_read_action: "TEXTRACT_DETECT_DOCUMENT_TEXT", # required, accepts TEXTRACT_DETECT_DOCUMENT_TEXT, TEXTRACT_ANALYZE_DOCUMENT
    document_read_mode: "SERVICE_DEFAULT", # accepts SERVICE_DEFAULT, FORCE_DOCUMENT_READ_ACTION
    feature_types: ["TABLES"], # accepts TABLES, FORMS
  },
}

The input properties for an inference job.
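As a minimal sketch of assembling this hash before an API call, the helper below checks the required s3_uri and the accepted input_format values listed above. The name build_input_data_config is hypothetical and not part of the SDK, and the URI is a placeholder. The code is plain Ruby and makes no call to Amazon Comprehend.

```ruby
# Hypothetical helper, not part of aws-sdk-comprehend: builds the
# InputDataConfig hash and checks it against the values documented above.
def build_input_data_config(s3_uri:, input_format: nil)
  accepted_formats = %w[ONE_DOC_PER_FILE ONE_DOC_PER_LINE]

  raise ArgumentError, "s3_uri is required" if s3_uri.nil? || s3_uri.empty?
  if input_format && !accepted_formats.include?(input_format)
    raise ArgumentError, "input_format must be one of #{accepted_formats.join(', ')}"
  end

  config = { s3_uri: s3_uri }
  # input_format is optional, so it is only set when given.
  config[:input_format] = input_format if input_format
  config
end

config = build_input_data_config(
  s3_uri: "s3://bucketName/prefix", # placeholder URI
  input_format: "ONE_DOC_PER_LINE"
)
```

The resulting hash could then be passed as the input_data_config argument of a job-starting call such as start_entities_detection_job.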

Constant Summary

SENSITIVE =
[]

Instance Attribute Summary

Instance Attribute Details

#document_reader_config ⇒ Types::DocumentReaderConfig

The document reader config field applies only to the InputDataConfig of StartEntitiesDetectionJob.

Use DocumentReaderConfig to specify how you want your inference documents read. Currently it applies to PDF documents in StartEntitiesDetectionJob custom inference.
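A minimal sketch of a PDF-oriented configuration, using only the values shown in the hash example in the Overview. The helper reader_config_valid? is hypothetical, written for illustration rather than taken from the SDK, and the S3 URI is a placeholder.

```ruby
# Hypothetical check, not an SDK method. Accepted values are copied from
# the hash example in the Overview.
def reader_config_valid?(cfg)
  read_actions  = %w[TEXTRACT_DETECT_DOCUMENT_TEXT TEXTRACT_ANALYZE_DOCUMENT]
  read_modes    = %w[SERVICE_DEFAULT FORCE_DOCUMENT_READ_ACTION]
  feature_types = %w[TABLES FORMS]

  # document_read_action is the only key marked as required.
  return false unless read_actions.include?(cfg[:document_read_action])
  return false if cfg.key?(:document_read_mode) && !read_modes.include?(cfg[:document_read_mode])

  # Every listed feature type must be one of the accepted values.
  (cfg.fetch(:feature_types, []) - feature_types).empty?
end

input_data_config = {
  s3_uri: "s3://bucketName/pdfs/", # placeholder URI
  input_format: "ONE_DOC_PER_FILE",
  document_reader_config: {
    document_read_action: "TEXTRACT_ANALYZE_DOCUMENT",
    document_read_mode: "FORCE_DOCUMENT_READ_ACTION",
    feature_types: ["TABLES", "FORMS"]
  }
}

valid = reader_config_valid?(input_data_config[:document_reader_config])
```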



# File 'gems/aws-sdk-comprehend/lib/aws-sdk-comprehend/types.rb', line 3700

class InputDataConfig < Struct.new(
  :s3_uri,
  :input_format,
  :document_reader_config)
  SENSITIVE = []
  include Aws::Structure
end

#input_format ⇒ String

Specifies how the text in an input file should be processed:

  • ONE_DOC_PER_FILE - Each file is considered a separate document. Use this option when you are processing large documents, such as newspaper articles or scientific papers.

  • ONE_DOC_PER_LINE - Each line in a file is considered a separate document. Use this option when you are processing many short documents, such as text messages.

Returns:

  • (String)
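The difference between the two formats can be sketched in plain Ruby. The function split_documents is a hypothetical illustration of the documented behavior, not how the service is implemented, and the file names and contents are made up.

```ruby
# files maps a file name to its text content.
def split_documents(files, input_format)
  case input_format
  when "ONE_DOC_PER_FILE"
    # Each file's full text is one document.
    files.values
  when "ONE_DOC_PER_LINE"
    # Each line of each file is one document.
    files.values.flat_map { |text| text.lines(chomp: true) }
  else
    raise ArgumentError, "unknown input_format: #{input_format}"
  end
end

files = {
  "article.txt"  => "A long article.\nIt spans two lines.",
  "messages.txt" => "hi\nsee you at 5"
}

per_file = split_documents(files, "ONE_DOC_PER_FILE") # 2 documents
per_line = split_documents(files, "ONE_DOC_PER_LINE") # 4 documents
```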



#s3_uri ⇒ String

The Amazon S3 URI for the input data. The URI must be in the same region as the API endpoint that you are calling. The URI can point to a single input file or it can provide the prefix for a collection of data files.

For example, if you use the URI S3://bucketName/prefix and the prefix is a single file, Amazon Comprehend uses that file as input. If more than one file begins with the prefix, Amazon Comprehend uses all of them as input.

Returns:

  • (String)
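The prefix rule can be illustrated with a small plain-Ruby sketch. Both parse_s3_uri and matching_keys are hypothetical helpers and the object keys are made up. The sketch assumes selection is a simple "begins with" match, as the description above suggests, and makes no claim about how the service lists objects internally.

```ruby
# Splits "s3://bucket/prefix" into [bucket, prefix]; the scheme match is
# case-insensitive so "S3://" is accepted too.
def parse_s3_uri(uri)
  match = uri.match(%r{\As3://([^/]+)/?(.*)\z}i)
  raise ArgumentError, "not an S3 URI: #{uri}" unless match

  [match[1], match[2]]
end

# Assumption: a key is selected when it begins with the prefix.
def matching_keys(keys, prefix)
  keys.select { |key| key.start_with?(prefix) }
end

bucket, prefix = parse_s3_uri("S3://bucketName/prefix")

# Made-up object keys in the bucket.
keys = ["prefix", "prefix-part2.txt", "prefix/doc.txt", "other/doc.txt"]
selected = matching_keys(keys, prefix)
```

In this sketch, selected holds every key except other/doc.txt, since the other three all begin with the prefix.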

