Real-time analysis for custom entity recognition (API) - Amazon Comprehend

You can use the Amazon Comprehend API to run real-time analysis with a custom model. First, you create an endpoint for the model. After the endpoint is created, you send real-time analysis requests to it.

For information about provisioning endpoint throughput, and the associated costs, see Using Amazon Comprehend endpoints.

Creating an endpoint for custom entity detection

For information about the costs associated with endpoints, see Using Amazon Comprehend endpoints.

Creating an endpoint with the AWS CLI

To create an endpoint by using the AWS CLI, use the create-endpoint command:

$ aws comprehend create-endpoint \
    --desired-inference-units number of inference units \
    --endpoint-name endpoint name \
    --model-arn arn:aws:comprehend:region:account-id:model/example \
    --tags Key=Key,Value=Value

If your command succeeds, Amazon Comprehend responds with the endpoint ARN:

{ "EndpointArn": "Arn" }

For more information about this command, its parameter arguments, and its output, see create-endpoint in the AWS CLI Command Reference.
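Endpoint creation is asynchronous, so the endpoint may not be ready for requests the moment create-endpoint returns. The following is a minimal sketch, not an official procedure, that polls the describe-endpoint command until the endpoint status is IN_SERVICE. The function name wait_for_endpoint, its arguments, and the retry defaults are illustrative assumptions and are not part of the AWS CLI.

```shell
# Hypothetical helper: poll describe-endpoint until the endpoint is IN_SERVICE.
# $1 = endpoint ARN, $2 = max attempts, $3 = seconds between attempts.
wait_for_endpoint() {
  local endpoint_arn="$1"
  local max_attempts="${2:-40}"    # assumed default, tune for your model
  local delay_seconds="${3:-30}"   # assumed default, tune for your model
  local attempt=1
  local status=""

  while [ "$attempt" -le "$max_attempts" ]; do
    # --query extracts only the Status field from EndpointProperties.
    status=$(aws comprehend describe-endpoint \
      --endpoint-arn "$endpoint_arn" \
      --query 'EndpointProperties.Status' \
      --output text) || return 1

    case "$status" in
      IN_SERVICE) echo "$status"; return 0 ;;
      FAILED)     echo "Endpoint creation failed" >&2; return 1 ;;
    esac

    sleep "$delay_seconds"
    attempt=$((attempt + 1))
  done

  echo "Timed out waiting for endpoint (last status: $status)" >&2
  return 1
}
```

One way to use this sketch is to call wait_for_endpoint with the EndpointArn value returned by create-endpoint, and to run detect-entities only after the function returns successfully.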

Running real-time custom entity detection

After you create an endpoint for your custom entity recognizer model, you use the endpoint to run the DetectEntities API operation. You can provide plain text input using either the text parameter or the bytes parameter. All other input types, such as PDF, Word, and image files, must be provided using the bytes parameter.

For image files and PDF files, you can use the DocumentReaderConfig parameter to override the default text extraction actions. For details, see Setting text extraction options.

Detecting entities in text using the AWS CLI

To detect custom entities in text, run the detect-entities command with the input text in the text parameter.

Example: Use the CLI to detect entities in input text

$ aws comprehend detect-entities \
    --endpoint-arn arn \
    --language-code en \
    --text "Andy Jassy is the CEO of Amazon."

If your command succeeds, Amazon Comprehend responds with the analysis. For each entity that Amazon Comprehend detects, it provides the entity type, text, location, and confidence score.
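The response is a JSON document whose Entities list carries the fields described above (Type, Text, BeginOffset, EndOffset, and Score). As a sketch of how you might reduce that response to one line per entity, the following wraps the same command with the AWS CLI --query and --output options. The function name detect_custom_entities and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper: print one tab-separated line per detected entity
# (type, text, begin offset, end offset, confidence score).
# $1 = endpoint ARN, $2 = input text.
detect_custom_entities() {
  local endpoint_arn="$1"
  local text="$2"

  aws comprehend detect-entities \
    --endpoint-arn "$endpoint_arn" \
    --language-code en \
    --text "$text" \
    --query 'Entities[].[Type, Text, BeginOffset, EndOffset, Score]' \
    --output text
}
```

For example, detect_custom_entities "$ENDPOINT_ARN" "Andy Jassy is the CEO of Amazon." would print a row for each entity that the custom model recognizes, where ENDPOINT_ARN is a variable you set to your endpoint ARN.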

Detecting entities in semi-structured documents using the AWS CLI

To detect custom entities in a PDF, Word, or image file, run the detect-entities command with the input file in the bytes parameter.

Example: Use the CLI to detect entities in an image file

This example shows how to pass in the image file using the fileb:// prefix, which tells the AWS CLI to read the file contents as binary so that the image bytes are encoded correctly in the request. For more information, see Binary large objects in the AWS Command Line Interface User Guide.

This example also passes in a JSON file named config.json to set the text extraction options.

$ aws comprehend detect-entities \
    --endpoint-arn arn \
    --language-code en \
    --bytes fileb://image1.jpg \
    --document-reader-config file://config.json

The config.json file contains the following content.

{
    "DocumentReadMode": "FORCE_DOCUMENT_READ_ACTION",
    "DocumentReadAction": "TEXTRACT_DETECT_DOCUMENT_TEXT"
}
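If you run this command against many files, it can help to check that the input file and the configuration file are readable before calling the service. The following is a rough sketch of such a wrapper. The function name detect_entities_in_file, its arguments, and the choice of output fields are assumptions for illustration only.

```shell
# Hypothetical helper: run detect-entities on a document file.
# $1 = endpoint ARN, $2 = input file (PDF, Word, or image),
# $3 = JSON file that holds the DocumentReaderConfig settings.
detect_entities_in_file() {
  local endpoint_arn="$1"
  local input_file="$2"
  local reader_config="$3"
  local f

  # Fail early with a clear message if either file is missing or unreadable.
  for f in "$input_file" "$reader_config"; do
    [ -r "$f" ] || { echo "Cannot read file: $f" >&2; return 1; }
  done

  # fileb:// passes the document as binary; file:// passes the JSON as text.
  aws comprehend detect-entities \
    --endpoint-arn "$endpoint_arn" \
    --language-code en \
    --bytes "fileb://$input_file" \
    --document-reader-config "file://$reader_config" \
    --query 'Entities[].[Type, Text, Score]' \
    --output text
}
```

For example, detect_entities_in_file "$ENDPOINT_ARN" image1.jpg config.json would run the same request as the command above and print one line per detected entity.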

For more information about the command syntax, see DetectEntities in the Amazon Comprehend API Reference.