InputDataConfig - Amazon Comprehend

InputDataConfig

The input properties for an inference job.

Contents

DocumentReaderConfig

The document reader config field applies only to the InputDataConfig of StartEntitiesDetectionJob.

Use DocumentReaderConfig to specify how you want your inference documents read. Currently, it applies to PDF documents in StartEntitiesDetectionJob custom inference.

Type: DocumentReaderConfig object

Required: No
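
As an illustration of how the three fields fit together, the following sketch assembles an InputDataConfig value as a plain Python dict and omits optional fields that are not set. The helper name build_input_data_config, the bucket name, and the prefix are hypothetical. The keys inside DocumentReaderConfig (DocumentReadAction, DocumentReadMode) come from the separate DocumentReaderConfig reference rather than from this page, so check them there before relying on them.

```python
import json


def build_input_data_config(s3_uri, input_format=None, document_reader_config=None):
    """Assemble an InputDataConfig dict, leaving out optional fields that are unset."""
    config = {"S3Uri": s3_uri}  # S3Uri is the only required field
    if input_format is not None:
        config["InputFormat"] = input_format
    if document_reader_config is not None:
        config["DocumentReaderConfig"] = document_reader_config
    return config


# Hypothetical bucket and prefix. The DocumentReaderConfig keys and values are
# assumptions taken from the DocumentReaderConfig reference, not from this page.
pdf_input = build_input_data_config(
    "s3://example-bucket/pdfs/",
    input_format="ONE_DOC_PER_FILE",
    document_reader_config={
        "DocumentReadAction": "TEXTRACT_DETECT_DOCUMENT_TEXT",
        "DocumentReadMode": "SERVICE_DEFAULT",
    },
)
print(json.dumps(pdf_input, indent=2))
```

A dict shaped like this is what an SDK call would typically receive as its InputDataConfig argument.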

InputFormat

Specifies how the text in an input file should be processed:

  • ONE_DOC_PER_FILE - Each file is considered a separate document. Use this option when you are processing large documents, such as newspaper articles or scientific papers.

  • ONE_DOC_PER_LINE - Each line in a file is considered a separate document. Use this option when you are processing many short documents, such as text messages.

Type: String

Valid Values: ONE_DOC_PER_FILE | ONE_DOC_PER_LINE

Required: No
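
The difference between the two InputFormat values can be pictured with a small local sketch. This is not how Amazon Comprehend is implemented; split_documents is a hypothetical helper, and treating Python's line-splitting rules as the line boundary is an assumption.

```python
def split_documents(file_text, input_format):
    """Return the list of documents that one input file would yield."""
    if input_format == "ONE_DOC_PER_FILE":
        # The whole file is a single document.
        return [file_text]
    if input_format == "ONE_DOC_PER_LINE":
        # Each line is its own document (line boundaries per str.splitlines).
        return file_text.splitlines()
    raise ValueError(f"Unsupported InputFormat: {input_format!r}")


sample = "first message\nsecond message\nthird message\n"

print(len(split_documents(sample, "ONE_DOC_PER_FILE")))  # 1
print(len(split_documents(sample, "ONE_DOC_PER_LINE")))  # 3
```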

S3Uri

The Amazon S3 URI for the input data. The URI must be in the same Region as the API endpoint that you are calling. The URI can point to a single input file or it can provide the prefix for a collection of data files.

For example, suppose you use the URI s3://bucketName/prefix. If the prefix is a single file, Amazon Comprehend uses that file as input. If more than one file begins with the prefix, Amazon Comprehend uses all of them as input.
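
The prefix behavior described above amounts to a string match on the start of each object key. A minimal sketch, assuming a hypothetical list of keys in one bucket and a hypothetical helper select_input_keys:

```python
def select_input_keys(object_keys, prefix):
    """Return the keys that begin with the given prefix."""
    return [key for key in object_keys if key.startswith(prefix)]


# Hypothetical object keys in a single bucket.
keys = [
    "reviews/2023.txt",
    "reviews/2024.txt",
    "reviews-archive/old.txt",
    "notes/readme.txt",
]

print(select_input_keys(keys, "reviews/"))          # the two files under reviews/
print(select_input_keys(keys, "reviews"))           # also picks up reviews-archive/old.txt
print(select_input_keys(keys, "notes/readme.txt"))  # a single file
```

Because the match is on the leading characters rather than on a folder boundary, ending the prefix with a slash is one way to avoid pulling in keys such as reviews-archive/old.txt.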

Type: String

Length Constraints: Maximum length of 1024.

Pattern: s3://[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9](/.*)?

Required: Yes
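
The length constraint and pattern listed for S3Uri can be checked on the client before a job is submitted. The following is a sketch, assuming the pattern must match the entire string; is_valid_s3_uri is a hypothetical helper, and the service performs its own validation regardless.

```python
import re

# Pattern and maximum length as listed for S3Uri.
S3_URI_PATTERN = re.compile(r"s3://[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9](/.*)?")
MAX_S3_URI_LENGTH = 1024


def is_valid_s3_uri(uri):
    """Check a URI against the documented length limit and pattern."""
    if len(uri) > MAX_S3_URI_LENGTH:
        return False
    # Assumption: the whole string must match, not just a substring.
    return S3_URI_PATTERN.fullmatch(uri) is not None


print(is_valid_s3_uri("s3://example-bucket/prefix"))  # True
print(is_valid_s3_uri("S3://example-bucket/prefix"))  # False: scheme must be lowercase
print(is_valid_s3_uri("s3://ab"))                     # False: bucket part needs at least 3 characters
```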

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: