Block - Amazon Comprehend API Reference

Block

Information about each word or line of text in the input document.

For additional information, see Block in the Amazon Textract API reference.

Contents

BlockType

Indicates whether the block represents a line of text or a single word of text.

  • WORD - A word that's detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.

  • LINE - A string of tab-delimited, contiguous words that are detected on a document page.

Type: String

Valid Values: LINE | WORD

Required: No
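
As a minimal sketch of how the BlockType value might be used, the following Python snippet separates LINE blocks from WORD blocks. The sample data is hand-written in the shape described on this page (it is not real service output), and split_by_type is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# Hand-written sample Block objects (illustrative only, not real service output).
SAMPLE_BLOCKS: List[Dict] = [
    {"Id": "line-1", "BlockType": "LINE", "Text": "Hello world", "Page": 1},
    {"Id": "word-1", "BlockType": "WORD", "Text": "Hello", "Page": 1},
    {"Id": "word-2", "BlockType": "WORD", "Text": "world", "Page": 1},
]


def split_by_type(blocks: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (lines, words). BlockType is optional, so blocks without it are skipped."""
    lines = [b for b in blocks if b.get("BlockType") == "LINE"]
    words = [b for b in blocks if b.get("BlockType") == "WORD"]
    return lines, words


lines, words = split_by_type(SAMPLE_BLOCKS)
print(len(lines), len(words))
```

Because every field is marked "Required: No", the sketch reads fields with dict.get rather than indexing directly.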

Geometry

Coordinates of the rectangle or polygon that contains the text.

Type: Geometry object

Required: No
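
The following sketch shows one way the rectangle might be converted to pixel coordinates. It assumes that the Geometry object carries a BoundingBox with Left, Top, Width, and Height fields expressed as ratios of the page dimensions; those field names are not defined on this page, so confirm them against the Geometry reference. The function name and the page size are hypothetical.

```python
from typing import Dict, Optional, Tuple


def bounding_box_to_pixels(
    block: Dict, page_width: int, page_height: int
) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) in pixels, or None if no box is present.

    Assumption: Geometry.BoundingBox holds Left/Top/Width/Height as
    fractions of the page size (0.0 to 1.0).
    """
    box = (block.get("Geometry") or {}).get("BoundingBox")
    if not box:
        return None  # Geometry is optional on a Block
    left = round(box["Left"] * page_width)
    top = round(box["Top"] * page_height)
    right = round((box["Left"] + box["Width"]) * page_width)
    bottom = round((box["Top"] + box["Height"]) * page_height)
    return left, top, right, bottom


# Hand-written sample block (illustrative only).
sample_block = {
    "Id": "word-1",
    "BlockType": "WORD",
    "Geometry": {
        "BoundingBox": {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.125}
    },
}

pixels = bounding_box_to_pixels(sample_block, page_width=800, page_height=1000)
print(pixels)
```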

Id

Unique identifier for the block.

Type: String

Length Constraints: Minimum length of 1.

Required: No

Page

Page number where the block appears.

Type: Integer

Required: No

Relationships

A list of child blocks of the current block. For example, a LINE object has child blocks for each WORD block that's part of the line of text.

Type: Array of RelationshipsListItem objects

Required: No
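
The parent-child link described above can be sketched as a lookup from a LINE block to its WORD blocks. This example assumes that each RelationshipsListItem has an Ids list and a Type of CHILD; those field names come from the RelationshipsListItem type and are not defined on this page. The sample data and the child_words helper are hypothetical.

```python
from typing import Dict, List

# Hand-written sample Block objects (illustrative only).
SAMPLE_BLOCKS: List[Dict] = [
    {
        "Id": "line-1",
        "BlockType": "LINE",
        "Text": "Hello world",
        # Assumed RelationshipsListItem shape: {"Type": "CHILD", "Ids": [...]}
        "Relationships": [{"Type": "CHILD", "Ids": ["word-1", "word-2"]}],
    },
    {"Id": "word-1", "BlockType": "WORD", "Text": "Hello"},
    {"Id": "word-2", "BlockType": "WORD", "Text": "world"},
]


def child_words(line: Dict, blocks_by_id: Dict[str, Dict]) -> List[str]:
    """Return the Text of each WORD block listed as a child of the given LINE."""
    words: List[str] = []
    for relationship in line.get("Relationships") or []:
        if relationship.get("Type") != "CHILD":
            continue
        for child_id in relationship.get("Ids") or []:
            child = blocks_by_id.get(child_id)
            if child and child.get("BlockType") == "WORD":
                words.append(child.get("Text", ""))
    return words


# Index blocks by Id once so each child lookup is a dictionary access.
blocks_by_id = {b["Id"]: b for b in SAMPLE_BLOCKS if "Id" in b}
line_words = child_words(SAMPLE_BLOCKS[0], blocks_by_id)
print(line_words)
```

Child Ids that do not resolve to a known block are skipped rather than raising an error, which keeps the sketch tolerant of partial input.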

Text

The word or line of text extracted from the block.

Type: String

Length Constraints: Minimum length of 1.

Required: No
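
As a rough illustration of how Page and Text might be combined, the following sketch rebuilds the text of each page from its LINE blocks. It assumes that the blocks are already in reading order and treats a missing Page value as page 1; both are assumptions made for this example rather than documented behavior. The text_by_page helper is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hand-written sample Block objects (illustrative only).
SAMPLE_BLOCKS: List[Dict] = [
    {"Id": "l1", "BlockType": "LINE", "Text": "First line", "Page": 1},
    {"Id": "w1", "BlockType": "WORD", "Text": "First", "Page": 1},
    {"Id": "l2", "BlockType": "LINE", "Text": "Second line", "Page": 1},
    {"Id": "l3", "BlockType": "LINE", "Text": "Next page", "Page": 2},
]


def text_by_page(blocks: List[Dict]) -> Dict[int, str]:
    """Join the Text of LINE blocks per page, keeping the input order."""
    pages: Dict[int, List[str]] = defaultdict(list)
    for block in blocks:
        # WORD blocks are skipped because their text already appears in the lines.
        if block.get("BlockType") == "LINE" and block.get("Text"):
            # Assumption: a block without a Page value belongs to page 1.
            pages[block.get("Page", 1)].append(block["Text"])
    return {page: "\n".join(lines) for page, lines in sorted(pages.items())}


page_text = text_by_page(SAMPLE_BLOCKS)
print(page_text)
```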

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: