Detecting Text

Amazon Textract provides synchronous and asynchronous operations that return only the text detected in a document. For both sets of operations, the following information is returned in multiple Block objects:

The lines and words of detected text
The relationships between the lines and words of detected text
The page that the detected text appears on
The location of the lines and words of text on the document page

For more information, see Lines and Words of Text.
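As a rough illustration of how the information listed above can be read out of the returned Block objects, the following Python sketch groups each line with its words. It assumes the response field names used by the Textract API (BlockType, Id, Text, Page, Geometry, Relationships). The helper name lines_with_words and the SAMPLE_BLOCKS data are hypothetical and hand-written for this example, not real service output.

```python
def lines_with_words(blocks):
    """Return one record per LINE block with its child WORD texts.

    `blocks` is the "Blocks" list from a text detection response.
    """
    by_id = {block["Id"]: block for block in blocks}
    records = []
    for block in blocks:
        if block["BlockType"] != "LINE":
            continue
        # A LINE block points at its WORD blocks through a CHILD relationship.
        child_ids = [
            child_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
        ]
        records.append({
            # Default to page 1 if the Page field is absent (an assumption
            # made here for single-page input).
            "page": block.get("Page", 1),
            "text": block["Text"],
            "words": [by_id[i]["Text"] for i in child_ids if i in by_id],
            # Location of the line, as ratios of the page width and height.
            "box": block["Geometry"]["BoundingBox"],
        })
    return records


# Hand-made sample shaped like a response; values are illustrative only.
SAMPLE_BLOCKS = [
    {"BlockType": "PAGE", "Id": "p1",
     "Relationships": [{"Type": "CHILD", "Ids": ["l1"]}]},
    {"BlockType": "LINE", "Id": "l1", "Text": "Hello world",
     "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.1,
                                  "Width": 0.3, "Height": 0.05}},
     "Relationships": [{"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"BlockType": "WORD", "Id": "w1", "Text": "Hello"},
    {"BlockType": "WORD", "Id": "w2", "Text": "world"},
]
```

Calling lines_with_words(SAMPLE_BLOCKS) yields a single record whose words list holds the two words of the line, in the order given by the CHILD relationship.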

To detect text synchronously, use the DetectDocumentText API operation, and pass a document file as input. The entire set of results is returned by the operation. For more information and an example, see Processing Documents Synchronously.
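A minimal sketch of the synchronous call, assuming the AWS SDK for Python (boto3) is installed and that credentials and a Region are already configured. The function name detect_text_lines and its client parameter are hypothetical conveniences for this example. The document is passed as raw bytes, and only the LINE blocks are kept.

```python
def detect_text_lines(path, client=None):
    """Send one local document file to DetectDocumentText and return its lines."""
    if client is None:
        # Imported lazily so the function can also be exercised with a stub.
        import boto3
        client = boto3.client("textract")

    with open(path, "rb") as document:
        document_bytes = document.read()

    # The whole result set comes back in this single response.
    response = client.detect_document_text(Document={"Bytes": document_bytes})

    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]
```

For example, detect_text_lines("scan.png") would return the detected lines as a list of strings, where scan.png is a placeholder file name.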

Note

The Amazon Rekognition API operation DetectText is different from DetectDocumentText. You use DetectText to detect text in live scenes, such as posters or road signs.

To detect text asynchronously, use StartDocumentTextDetection to start processing an input document file. To get the results, call GetDocumentTextDetection. The results are returned in one or more responses from GetDocumentTextDetection. For more information and an example, see Processing Documents Asynchronously.
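The asynchronous flow can be sketched in the same way, again assuming boto3 and a document already stored in an Amazon S3 bucket. The function name detect_text_async, the polling interval, and the retry limit are hypothetical choices for this example. Polling keeps the sketch short, and a completion notification would avoid the repeated calls. The loop at the end follows NextToken to collect results that span more than one response.

```python
import time


def detect_text_async(bucket, key, client=None, poll_seconds=5.0, max_polls=120):
    """Start a text detection job for an S3 object and return all Block objects."""
    if client is None:
        # Imported lazily so the function can also be exercised with a stub.
        import boto3
        client = boto3.client("textract")

    job_id = client.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
    )["JobId"]

    # Poll until the job leaves the IN_PROGRESS state.
    for _ in range(max_polls):
        response = client.get_document_text_detection(JobId=job_id)
        if response["JobStatus"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Job {job_id} did not finish in time")

    # This sketch treats every status other than SUCCEEDED as an error.
    if response["JobStatus"] != "SUCCEEDED":
        raise RuntimeError(f"Job {job_id} ended with {response['JobStatus']}")

    # Results can be split across several responses; follow NextToken.
    blocks = list(response.get("Blocks", []))
    while response.get("NextToken"):
        response = client.get_document_text_detection(
            JobId=job_id, NextToken=response["NextToken"]
        )
        blocks.extend(response.get("Blocks", []))
    return blocks
```

The returned list has the same Block structure as the synchronous response, so the same post-processing applies to both.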

