Amazon Comprehend
Developer Guide

Guidelines and Limits

Keep in mind the following information when using Amazon Comprehend.

Supported Regions

For a list of AWS Regions where Amazon Comprehend is available, see AWS Regions and Endpoints in the Amazon Web Services General Reference.

Throttling

For information about throttling for Amazon Comprehend and to request a limit increase, see Amazon Comprehend Limits in the Amazon Web Services General Reference.

You may be able to avoid throttling by using the batch operations instead of the single transaction operations. For more information, see Multiple Document Operations.

Overall Limits

All operations except asynchronous operations and topic modeling operations have the following limits:

Description                         Limit
Character encoding                  UTF-8
Document size (UTF-8 characters)    5,000 bytes
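
The limit above is stated in bytes, so a document with multi-byte characters can exceed it with fewer than 5,000 characters. The following is a minimal sketch of a pre-flight check, assuming the limit applies to the UTF-8 encoded length of the text. The names utf8_size and fits_sync_limit are hypothetical helpers, not part of any AWS SDK.

```python
# Limit taken from the table above; assumed to apply to the encoded byte length.
MAX_SYNC_DOCUMENT_BYTES = 5000


def utf8_size(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_sync_limit(text: str) -> bool:
    """Return True if the text is within the synchronous document size limit."""
    return utf8_size(text) <= MAX_SYNC_DOCUMENT_BYTES


if __name__ == "__main__":
    ascii_doc = "a" * 5000          # 1 byte per character
    accented_doc = "\u00e9" * 3000  # 2 bytes per character in UTF-8
    print(utf8_size(ascii_doc), fits_sync_limit(ascii_doc))
    print(utf8_size(accented_doc), fits_sync_limit(accented_doc))
```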

Amazon Comprehend may store your content to continuously improve the quality of its analysis models. See the Amazon Comprehend FAQ to learn more. To request that we delete content that may have been stored by Amazon Comprehend, open a case with AWS Support.

Multiple Document Operations

The BatchDetectDominantLanguage, BatchDetectEntities, BatchDetectKeyPhrases, and BatchDetectSentiment operations have the following limits:

Description              Limit
Documents per request    25

If you plan to send more than 20 requests per second, consider using the batch operations. Batch operations let you send more documents in each request, which may result in higher throughput. For example, when you use the DetectDominantLanguage operation, you can send up to 20 documents per second. However, if you use the BatchDetectDominantLanguage operation, you can send up to 250 documents per second, although processing speed may be lower. For more information about throttling limits, see Amazon Comprehend Limits in the Amazon Web Services General Reference. For more information about using the multiple document APIs, see Multiple Document Synchronous Processing.
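
To stay within the 25 documents per request limit, a larger collection has to be split before it is sent to a batch operation. The following is a minimal sketch of that split. The function name batches is a hypothetical helper; the sketch makes no AWS calls.

```python
from typing import Iterator, List

# Limit taken from the table above.
MAX_DOCUMENTS_PER_BATCH = 25


def batches(documents: List[str], size: int = MAX_DOCUMENTS_PER_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of documents, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


if __name__ == "__main__":
    docs = [f"document number {i}" for i in range(60)]
    for batch in batches(docs):
        # Each batch could then be passed to a batch operation such as
        # BatchDetectDominantLanguage (not called in this sketch).
        print(len(batch))
```

Each yielded list is small enough to send as a single batch request. The per-document size limit still applies to every item in the list.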

Asynchronous Operations

Asynchronous batches started with the StartDominantLanguageDetectionJob, StartEntitiesDetectionJob, StartKeyPhrasesDetectionJob, and StartSentimentDetectionJob operations have the following limits:

Description                                                                          Limit
Maximum size (UTF-8 characters) for one document, entity and key phrase detection    100 KB
Maximum size (UTF-8 characters) for one document, language detection                 1 MB
Maximum size (UTF-8 characters) for one document, sentiment detection                5 KB
Total size of all files in batch                                                     5 GB
Maximum number of files, one document per file                                       1,000,000
Maximum number of lines, one document per line                                       1,000,000
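
The per-document limits differ by job type, so it can help to check input before starting a job. The following is a minimal sketch that reports which documents exceed the limit for a given job type. It assumes that 1 KB means 1,024 bytes and that the limits apply to the UTF-8 encoded length; neither is confirmed by the table. The job type keys and the function oversized_documents are hypothetical names.

```python
from typing import Dict, List

# Assumption: binary units. The table does not say whether 1 KB is 1,000 or 1,024 bytes.
KB = 1024
MB = 1024 * KB

# Per-document limits from the table above, keyed by hypothetical job type names.
ASYNC_DOCUMENT_LIMITS: Dict[str, int] = {
    "entities": 100 * KB,
    "key_phrases": 100 * KB,
    "dominant_language": 1 * MB,
    "sentiment": 5 * KB,
}


def oversized_documents(documents: List[str], job_type: str) -> List[int]:
    """Return the indexes of documents whose UTF-8 size exceeds the job type's limit."""
    limit = ASYNC_DOCUMENT_LIMITS[job_type]
    return [
        index
        for index, document in enumerate(documents)
        if len(document.encode("utf-8")) > limit
    ]


if __name__ == "__main__":
    docs = ["a" * (5 * KB), "a" * (5 * KB + 1)]
    print(oversized_documents(docs, "sentiment"))
    print(oversized_documents(docs, "entities"))
```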

You should use the asynchronous operations:

  • To analyze more than 25 documents at a time

  • To analyze documents larger than 5,000 bytes for key phrases and entities

For more information, see Asynchronous Batch Processing.
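
The guidance above can be sketched as a small decision helper. This is an illustration only: it assumes sizes are measured as UTF-8 bytes and applies the 5,000 byte threshold to every analysis type, although the guidance names only key phrases and entities. The function suggest_operation and its return values are hypothetical.

```python
from typing import List

MAX_DOCUMENTS_PER_BATCH = 25    # multiple document operations limit
MAX_SYNC_DOCUMENT_BYTES = 5000  # synchronous document size limit


def suggest_operation(documents: List[str]) -> str:
    """Suggest 'single', 'batch', or 'asynchronous' for a set of documents."""
    if not documents:
        raise ValueError("at least one document is required")
    largest = max(len(document.encode("utf-8")) for document in documents)
    if len(documents) > MAX_DOCUMENTS_PER_BATCH or largest > MAX_SYNC_DOCUMENT_BYTES:
        return "asynchronous"
    if len(documents) > 1:
        return "batch"
    return "single"


if __name__ == "__main__":
    print(suggest_operation(["one short document"]))
    print(suggest_operation(["short document"] * 25))
    print(suggest_operation(["short document"] * 26))
    print(suggest_operation(["a" * 5001]))
```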

Language Detection

The BatchDetectDominantLanguage and DetectDominantLanguage operations, and asynchronous jobs started with the StartDominantLanguageDetectionJob operation, have the following limitations:

  • They don't support phonetic language detection. For example, they will not detect "arigato" as Japanese or "nihao" as Chinese.

  • They may have trouble distinguishing closely related languages, such as Indonesian and Malay, or Bosnian, Croatian, and Serbian.

  • For best results, the input text should be at least 20 characters long.
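
The length recommendation above can be checked before a request is sent. The following is a minimal sketch; ignoring leading and trailing whitespace is an assumption, and is_long_enough is a hypothetical helper name.

```python
# Recommended minimum from the list above.
MIN_RECOMMENDED_CHARS = 20


def is_long_enough(text: str) -> bool:
    """Return True if the text meets the recommended minimum length.

    Assumption: leading and trailing whitespace is not counted.
    """
    return len(text.strip()) >= MIN_RECOMMENDED_CHARS


if __name__ == "__main__":
    print(is_long_enough("arigato"))
    print(is_long_enough("This sentence is long enough."))
```

Text that fails this check can still be submitted; the result may simply be less reliable.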

Topic Modeling

Topic detection jobs created with the StartTopicsDetectionJob operation have the following limits:

Description                                              Limit
Character encoding                                       UTF-8
Maximum number of topics to return                       100
Total size of all files in request                       5 GB
Maximum file size for one file, one document per file    100 MB
Maximum number of files, one document per file           1,000,000
Maximum number of lines, one document per line           1,000,000

For best results, you should include at least 1,000 input documents.
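
The limits and the recommendation above can be checked before a topic detection job is started. The following is a minimal sketch that returns a list of problems found. The function check_topics_job and its parameter names are hypothetical, and the lower bound of one topic is an assumption that the table does not state.

```python
from typing import List

MAX_TOPICS = 100                  # maximum number of topics to return
MAX_DOCUMENTS = 1_000_000         # maximum number of files or lines
RECOMMENDED_MIN_DOCUMENTS = 1000  # recommendation, not a hard limit


def check_topics_job(number_of_topics: int, document_count: int) -> List[str]:
    """Return human-readable problems with the planned job; empty if none are found."""
    problems = []
    # Assumption: at least one topic must be requested.
    if not 1 <= number_of_topics <= MAX_TOPICS:
        problems.append(f"number of topics must be between 1 and {MAX_TOPICS}")
    if document_count > MAX_DOCUMENTS:
        problems.append(f"document count exceeds {MAX_DOCUMENTS}")
    if document_count < RECOMMENDED_MIN_DOCUMENTS:
        problems.append(f"fewer than {RECOMMENDED_MIN_DOCUMENTS} documents may give poor results")
    return problems


if __name__ == "__main__":
    print(check_topics_job(number_of_topics=10, document_count=5000))
    print(check_topics_job(number_of_topics=101, document_count=500))
```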