DocumentReaderConfig
Provides configuration parameters that control how the service extracts text from input documents such as PDF files.
Contents
- DocumentReadAction
  This enum field accepts two values, both of which apply to PDF documents:
  - TEXTRACT_DETECT_DOCUMENT_TEXT - The service calls DetectDocumentText for each page of a PDF document.
  - TEXTRACT_ANALYZE_DOCUMENT - The service calls AnalyzeDocument for each page of a PDF document.
  Type: String
  Valid Values: TEXTRACT_DETECT_DOCUMENT_TEXT | TEXTRACT_ANALYZE_DOCUMENT
  Required: Yes
- DocumentReadMode
  This enum field accepts two values:
  - SERVICE_DEFAULT - Use the service defaults for document reading. For digital PDFs, this means using an internal parser instead of the Textract APIs.
  - FORCE_DOCUMENT_READ_ACTION - Always use the action specified in DocumentReadAction, including for digital PDFs.
  Type: String
  Valid Values: SERVICE_DEFAULT | FORCE_DOCUMENT_READ_ACTION
  Required: No
- FeatureTypes
  Specifies how the text in an input file should be processed.
  Type: Array of strings
  Array Members: Minimum number of 1 item. Maximum number of 2 items.
  Valid Values: TABLES | FORMS
  Required: No
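
The field constraints listed above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the service or any AWS SDK: the helper name `build_document_reader_config` and the constant names are hypothetical, and the validation rules are only those stated in this section (allowed values, which field is required, and the 1 to 2 item limit on FeatureTypes).

```python
from typing import Dict, List, Optional, Union

# Allowed values, copied from the "Valid Values" entries in this section.
READ_ACTIONS = {"TEXTRACT_DETECT_DOCUMENT_TEXT", "TEXTRACT_ANALYZE_DOCUMENT"}
READ_MODES = {"SERVICE_DEFAULT", "FORCE_DOCUMENT_READ_ACTION"}
FEATURE_TYPES = {"TABLES", "FORMS"}

ConfigValue = Union[str, List[str]]


def build_document_reader_config(
    read_action: str,
    read_mode: Optional[str] = None,
    feature_types: Optional[List[str]] = None,
) -> Dict[str, ConfigValue]:
    """Hypothetical helper: build a DocumentReaderConfig-shaped dict.

    Raises ValueError if a value violates the constraints documented above.
    """
    # DocumentReadAction is the only required field.
    if read_action not in READ_ACTIONS:
        raise ValueError(f"Invalid DocumentReadAction: {read_action!r}")
    config: Dict[str, ConfigValue] = {"DocumentReadAction": read_action}

    # DocumentReadMode is optional; include it only when supplied.
    if read_mode is not None:
        if read_mode not in READ_MODES:
            raise ValueError(f"Invalid DocumentReadMode: {read_mode!r}")
        config["DocumentReadMode"] = read_mode

    # FeatureTypes is optional; when present it must hold 1 or 2 valid items.
    if feature_types is not None:
        if not 1 <= len(feature_types) <= 2:
            raise ValueError("FeatureTypes must contain 1 or 2 items")
        unknown = [f for f in feature_types if f not in FEATURE_TYPES]
        if unknown:
            raise ValueError(f"Invalid FeatureTypes: {unknown!r}")
        config["FeatureTypes"] = list(feature_types)

    return config


if __name__ == "__main__":
    print(
        build_document_reader_config(
            "TEXTRACT_ANALYZE_DOCUMENT",
            read_mode="FORCE_DOCUMENT_READ_ACTION",
            feature_types=["TABLES", "FORMS"],
        )
    )
```

The resulting dict mirrors the field names in this section, so it should be usable wherever a request expects a DocumentReaderConfig structure; how it is nested inside a particular request is outside the scope of this sketch.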