
Guidelines and quotas

Unless otherwise specified, the Amazon Comprehend quotas are per region. You can request an increase to adjustable quotas if needed for your applications. For information about quotas and to request a quota increase, see AWS Service Quotas.

Supported Regions

Amazon Comprehend is available in the following AWS Regions:

US East (Ohio)
US East (N. Virginia)
US West (Oregon)
Asia Pacific (Mumbai)
Asia Pacific (Seoul)
Asia Pacific (Singapore)
Asia Pacific (Sydney)
Asia Pacific (Tokyo)
Canada (Central)
Europe (Frankfurt)
Europe (Ireland)
Europe (London)
AWS GovCloud (US-West)

By default, Amazon Comprehend provides all API operations in each of the supported regions. For exceptions, see Document processing.

For information about API endpoints, see Amazon Comprehend Regions and Endpoints in the Amazon Web Services General Reference.

To review current quotas in a region, or to request quota increases for adjustable quotas, open the Service Quotas console.

Quotas for built-in models

Amazon Comprehend provides built-in models for you to analyze UTF-8 text documents. Amazon Comprehend provides synchronous and asynchronous operations that use the built-in models.

Topics

Real-time (synchronous) analysis
Asynchronous analysis

Real-time (synchronous) analysis

This section describes quotas related to real-time analysis using the built-in models.

Topics

Single document operations
Multiple document operations
Request throttling for real-time (synchronous) requests

Single document operations

The Amazon Comprehend API provides operations that take a single document as input. The following quotas apply to these operations.

General quotas for single document operations

The following quotas apply to real-time analysis for detecting entities, key phrases, or dominant language. For entity detection, these quotas apply to detection with the built-in models. For custom entity detection, see the quotas in Custom entity recognition.

Description	Quota/Guideline
Maximum document size	100 KB

Operation-specific quotas for single document operations

The following quotas apply to real-time analysis for detecting sentiment, targeted sentiment, and syntax.

Description	Quota/Guideline
Maximum document size	5 KB
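
As a rough illustration of the two tables above, the following sketch checks a document's UTF-8 encoded size before a real-time request is sent. The task names and the `fits_realtime_quota` helper are hypothetical and not part of the Amazon Comprehend API. The sketch also assumes that 1 KB means 1,000 bytes, which is the conservative reading of the tables.

```python
# Assumption: 1 KB is treated as 1,000 bytes (the conservative reading).
KB = 1000

# Hypothetical task names mapped to the documented maximum document size.
MAX_BYTES = {
    "entities": 100 * KB,
    "key_phrases": 100 * KB,
    "dominant_language": 100 * KB,
    "sentiment": 5 * KB,
    "targeted_sentiment": 5 * KB,
    "syntax": 5 * KB,
}


def fits_realtime_quota(text: str, task: str) -> bool:
    """Return True if the UTF-8 encoding of text is non-empty and within the quota for task."""
    size = len(text.encode("utf-8"))  # quotas are in bytes, not characters
    return 0 < size <= MAX_BYTES[task]
```

The size is measured on the encoded bytes because multi-byte characters make a document larger than its character count suggests.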

Multiple document operations

The Amazon Comprehend API provides batch operations that process multiple documents with a single API request. The following quotas apply to the batch operations.

Description	Quota/Guideline
Maximum document size	5 KB
Maximum documents per request	25

For more information about using the batch document operations, see Multiple document synchronous processing.
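
One way to stay within the batch quotas above is to split a document list into groups of at most 25 before calling a batch operation. The following is a minimal sketch. The `batches` helper is hypothetical, and the byte limit assumes that 1 KB means 1,000 bytes.

```python
from typing import Iterator, List

MAX_DOCS_PER_REQUEST = 25
MAX_DOC_BYTES = 5 * 1000  # assumption: 5 KB read as 5,000 bytes


def batches(documents: List[str], batch_size: int = MAX_DOCS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive groups of documents that fit the batch quotas."""
    # Reject oversized documents up front so that no partial work is done.
    for index, doc in enumerate(documents):
        if len(doc.encode("utf-8")) > MAX_DOC_BYTES:
            raise ValueError(f"document {index} exceeds {MAX_DOC_BYTES} bytes")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]
```

Each yielded group can then be passed as the document list of a single batch request.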

Request throttling for real-time (synchronous) requests

Amazon Comprehend applies dynamic throttling to synchronous requests. If system processing bandwidth is available, Amazon Comprehend gradually increases the number of your requests that it processes. To control your application's usage of the synchronous API operations, we recommend that you turn on billing alerts or implement rate-limiting in your application.
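
Client-side rate limiting can take many forms. The following sketch uses a token bucket, which is one common technique and not a mechanism prescribed by Amazon Comprehend. The `TokenBucket` class, its parameters, and the rate you choose are all assumptions to adapt to your application.

```python
import time


class TokenBucket:
    """Allow up to `rate` requests per second on average, with bursts of up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens that can accumulate
        self.clock = clock        # injectable so the limiter can be tested
        self.tokens = capacity
        self.last = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False if the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

An application would call `try_acquire()` before each synchronous request and delay or queue the request when it returns `False`.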

Asynchronous analysis

This section describes quotas related to asynchronous analysis using the built-in models.

Asynchronous API operations each support a maximum of 10 active jobs. To view the quotas for each API operation, see the Service Quotas table in Amazon Comprehend endpoints and quotas in the Amazon Web Services General Reference.

For adjustable quotas, you can request a quota increase using the Service Quotas console.

Topics

General quotas for asynchronous operations
Operation-specific quotas for asynchronous jobs
Request throttling for asynchronous requests

General quotas for asynchronous operations

You can run asynchronous analysis jobs using the console or any of the API Start* operations. For information about when to use asynchronous operations, see Asynchronous batch processing. The following quotas apply to most of the API Start* operations for built-in models. For the exceptions, see Operation-specific quotas for asynchronous jobs.

Description	Quota/Guideline
Maximum size of each document in jobs that detect entities, key phrases, PII, and languages	1 MB
Maximum total size of all files in a request	5 GB
Minimum total size of all files in a request	500 bytes
Maximum number of files, one document per file	1,000,000
Maximum total number of lines, one document per line	1,000,000
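
The general quotas above can be checked before a job is submitted. The following sketch assumes one document per file and takes a list of file sizes in bytes. The `check_async_job` helper is hypothetical, and the unit constants assume decimal units (1 MB = 1,000,000 bytes). The per-document limit shown applies to jobs that detect entities, key phrases, PII, and languages.

```python
MB = 1000 ** 2  # assumption: decimal units
GB = 1000 ** 3

MAX_DOC_BYTES = 1 * MB
MAX_TOTAL_BYTES = 5 * GB
MIN_TOTAL_BYTES = 500
MAX_FILES = 1_000_000


def check_async_job(file_sizes):
    """Return a list of quota problems for a job with one document per file."""
    problems = []
    total = sum(file_sizes)
    if len(file_sizes) > MAX_FILES:
        problems.append(f"{len(file_sizes)} files exceeds the maximum of {MAX_FILES}")
    if total < MIN_TOTAL_BYTES:
        problems.append(f"total size {total} bytes is below the minimum of {MIN_TOTAL_BYTES}")
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} bytes exceeds the maximum of {MAX_TOTAL_BYTES}")
    oversized = sum(1 for size in file_sizes if size > MAX_DOC_BYTES)
    if oversized:
        problems.append(f"{oversized} document(s) exceed {MAX_DOC_BYTES} bytes")
    return problems
```

An empty list means that none of the checked quotas is violated. Operation-specific quotas still need separate checks.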

Operation-specific quotas for asynchronous jobs

This section describes quotas for specific asynchronous operations. If a quota isn't specified in the following tables, the general quota value applies.

Topics

Sentiment
Targeted sentiment
Events
Topic modeling

Sentiment

Asynchronous sentiment jobs, which you create with the StartSentimentDetectionJob operation, have the following quotas.

Description	Quota/Guideline
Maximum size of each input document	5 KB

Targeted sentiment

Asynchronous targeted sentiment jobs, which you create with the StartTargetedSentimentDetectionJob operation, have the following quotas.

Description	Quota/Guideline
Supported document formats	UTF-8
Maximum size of each document in a job	10 KB
Maximum size of all documents in a job	300 MB
Maximum number of files, one document per file	30,000
Maximum total number of lines, one document per line (for all files in a request)	30,000

Events

Asynchronous events detection jobs, which you create with the StartEventsDetectionJob operation, have the following quotas.

Description	Quota/Guideline
Character encoding	UTF-8
Total size of all files in a job	50 MB
Maximum size of each document in a job	10 KB
Maximum number of files, one document per file	5,000
Maximum total number of lines, one document per line (for all files in request)	5,000

Topic modeling

Asynchronous topic modeling jobs, which you create with the StartTopicsDetectionJob operation, have the following quotas.

Description	Quota/Guideline
Character encoding	UTF-8
Maximum number of topics to return	100
Maximum file size for one file, one document per file	100 MB

For more information, see Topic modeling.

Request throttling for asynchronous requests

Each asynchronous API operation supports a maximum number of requests per second (per region, per account), and also a maximum of 10 active jobs. To view the quotas for each API operation, see the Service Quotas table in Amazon Comprehend endpoints and quotas in the Amazon Web Services General Reference.

For adjustable quotas, you can request a quota increase using the Service Quotas console.

Quotas for custom models

You can use Amazon Comprehend to build your own custom models for custom classification and custom entity recognition. This section provides the guidelines and quotas related to training and using custom models. For more information about custom models, see Amazon Comprehend Custom.

Topics

General quotas
Quotas for endpoints
Document classification
Custom entity recognition

General quotas

Amazon Comprehend sets general size quotas for each type of input document that you can analyze with custom models. For real-time analysis quotas, see Maximum document sizes for real-time analysis. For asynchronous analysis quotas, see Inputs for asynchronous custom analysis.

For adjustable quotas, you can request a quota increase using the Service Quotas console.

Quotas for endpoints

You create an endpoint to run real-time analysis with a custom model. For information about endpoints, see Managing Amazon Comprehend endpoints.

The following quotas apply to the endpoints. For information about how to request a quota increase, see AWS Service Quotas.

Description	Quota/Guideline
Maximum number of active endpoints per Region for each account	20
Maximum number of inference units per Region for each account	200
Maximum number of inference units per endpoint per region	50
Maximum throughput per inference unit (characters)	100/second
Maximum throughput per inference unit (documents)	2/second
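
The throughput figures above can be used to estimate how many inference units an endpoint needs. The following sketch assumes that the character limit and the document limit both apply at the same time, so the larger requirement wins. The `inference_units_needed` helper is hypothetical.

```python
import math

CHARS_PER_UNIT = 100     # characters per second per inference unit
DOCS_PER_UNIT = 2        # documents per second per inference unit
MAX_UNITS_PER_ENDPOINT = 50


def inference_units_needed(chars_per_second: float, docs_per_second: float) -> int:
    """Estimate the inference units for one endpoint from expected peak traffic."""
    by_chars = math.ceil(chars_per_second / CHARS_PER_UNIT)
    by_docs = math.ceil(docs_per_second / DOCS_PER_UNIT)
    units = max(1, by_chars, by_docs)  # an endpoint needs at least one unit
    if units > MAX_UNITS_PER_ENDPOINT:
        raise ValueError(
            f"{units} units exceeds the per-endpoint maximum of {MAX_UNITS_PER_ENDPOINT}"
        )
    return units
```

For example, traffic of 250 characters per second and 1 document per second yields an estimate of 3 inference units.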

Document classification

This section describes the guidelines and quotas for the following document classification operations:

Classifier training jobs that you start with the CreateDocumentClassifier operation.
Asynchronous document classification jobs that you start with the StartDocumentClassificationJob operation.
Synchronous document classification requests that use the ClassifyDocument operation.

General quotas for document classification

The following table describes general quotas related to training custom classifiers.

Description	Quota/Guideline
Maximum length of class name	5,000 characters
Number of classes (multi-class mode)	2–1,000
Number of classes (multi-label mode)	2–100
Annotations format
Minimum number of annotations per class (multi-class mode)	10
Minimum number of annotations per class (multi-label mode)	10
Minimum number of annotations (multi-label mode)	50
CSV file format
Minimum number of training documents per class (multi-class mode)	50
Minimum number of training documents per class (multi-label mode)	10
Minimum number of training documents (multi-label mode)	50
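
As an illustration of the multi-class rows for the CSV file format, the following sketch counts training documents per class and reports classes that fall short. It takes the class label of each training document as input. The `check_multiclass_labels` helper is hypothetical and covers only the class-count and per-class minimum quotas.

```python
from collections import Counter

MIN_CLASSES = 2
MAX_CLASSES = 1000
MIN_DOCS_PER_CLASS = 50  # multi-class mode, CSV file format


def check_multiclass_labels(labels):
    """Return a list of quota problems for a multi-class training set."""
    counts = Counter(labels)
    problems = []
    if not MIN_CLASSES <= len(counts) <= MAX_CLASSES:
        problems.append(
            f"{len(counts)} classes is outside the range {MIN_CLASSES}-{MAX_CLASSES}"
        )
    # Sort so that the report order is stable across runs.
    for name, count in sorted(counts.items()):
        if count < MIN_DOCS_PER_CLASS:
            problems.append(
                f"class {name!r} has {count} documents, minimum is {MIN_DOCS_PER_CLASS}"
            )
    return problems
```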

Classification for plain text documents

You create and train a plain-text model using plain-text input documents. Amazon Comprehend provides real-time and asynchronous operations to classify plain text documents using a plain-text model.

Training

The following table describes quotas related to training a custom classifier with plain text documents.

Description	Quota/Guideline
Total size of all files in training job	5 GB
Maximum number of augmented manifest files for training a custom classifier	5
Maximum number of attribute names for each augmented manifest file	5
Maximum length of attribute name	63 characters

Real-time (synchronous) analysis

The following table describes quotas related to real-time classification of plain text documents.

Description	Quota/Guideline
Maximum number of documents per synchronous request	1
Maximum text document size (UTF-8 encoded)	10 KB

Asynchronous analysis

The following table describes quotas related to asynchronous classification of plain text documents.

Description	Quota/Guideline
Total size of all files in asynchronous job	5 GB
Maximum file size for one file, one document per file	10 MB
Maximum number of files, one document per file	1,000,000
Maximum total number of lines, one document per line (for all files in request)	1,000,000

Classification for semi-structured documents

This section describes the guidelines and quotas for document classification of semi-structured documents. To classify semi-structured documents, use a native document model that you trained with native input documents.

Training a native document model with semi-structured docs

The following table describes quotas related to training a custom classifier with semi-structured documents, such as PDF documents, Word documents, and image files.

Description	Quota/Guideline
Maximum number of pages across all documents	10,000
Maximum annotations file size (all CSV file sizes combined)	5 MB
Document corpus size (training and test documents)	10 GB
File sizes for training and testing files
Image file size (JPG, PNG, or TIFF)	1 byte–10 MB. TIFF files: one page maximum.
Page size for PDF documents	1 byte–10 MB
Page size for Word documents	1 byte–10 MB
Amazon Textract API output JSON size	1 byte–1 MB

Real-time (synchronous) analysis

This section describes quotas related to real-time classification of semi-structured documents.

The following table shows the maximum file sizes for input documents. For all input document types, the input file maximum is one page, with no more than 10,000 characters.

File type	Maximum size (API)	Maximum size (console)
UTF-8 text documents	10 KB	10 KB
PDF documents	10 MB	5 MB
Word documents	10 MB	5 MB
Image files	10 MB	5 MB
Amazon Textract API output size	1 MB	n/a
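
The table above can be expressed as a lookup keyed by file type. The following sketch maps file extensions to the API and console limits. The extension list and the `max_realtime_size` helper are illustrative assumptions, and the constants assume decimal units (1 MB = 1,000,000 bytes).

```python
from pathlib import Path

KB = 1000       # assumption: decimal units
MB = 1000 ** 2

# (API limit, console limit) in bytes. The extensions are illustrative assumptions.
LIMITS = {
    ".txt": (10 * KB, 10 * KB),
    ".pdf": (10 * MB, 5 * MB),
    ".docx": (10 * MB, 5 * MB),
    ".jpg": (10 * MB, 5 * MB),
    ".png": (10 * MB, 5 * MB),
    ".tiff": (10 * MB, 5 * MB),
}


def max_realtime_size(filename: str, via_console: bool = False) -> int:
    """Return the maximum input size in bytes for a real-time request."""
    suffix = Path(filename).suffix.lower()
    if suffix not in LIMITS:
        raise ValueError(f"unsupported file type: {suffix or filename}")
    api_limit, console_limit = LIMITS[suffix]
    return console_limit if via_console else api_limit
```

The one-page and 10,000-character limits still apply on top of the size returned here.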

Asynchronous analysis

The following table describes quotas related to asynchronous classification of semi-structured documents.

Description	Quota/Guideline
Maximum number of pages across all input documents for a job	25,000
Document corpus size	25 GB
Image file size (JPG, PNG, or TIFF)	1 byte–10 MB. TIFF files: one page maximum.
Page size for PDF documents	1 byte–10 MB
Page size for Word documents	1 byte–10 MB
Amazon Textract API output JSON size	1 byte–1 MB

Custom entity recognition

This section describes the guidelines and quotas for the following operations for custom entity recognition:

Entity recognizer training jobs started with the CreateEntityRecognizer operation.
Asynchronous entity recognition jobs started with the StartEntitiesDetectionJob operation.
Synchronous entity recognition requests using the DetectEntities operation.

Custom entity recognition for plain text documents

Amazon Comprehend provides asynchronous and synchronous operations to analyze plain text documents with a custom entity recognizer.

Training

This section describes quotas related to training a custom entity recognizer to analyze plain text documents. To train the model, you can provide an entity list or a set of annotated text documents.

The following table describes quotas related to training the model with an entity list.

Description	Quota/Guideline
Number of entities per model	1–25
Document size (UTF-8)	1–5,000 bytes
Number of items in entity list	1–1 million
Length of individual entry (post-strip) in entity list	1–5,000
Entity list corpus size (all docs in plaintext combined)	5 KB–200 MB

The following table describes quotas related to training the model with annotated text documents.

Description	Quota/Guideline
Number of entities per model/custom entity recognizer	1–25
Document size (UTF-8)	1–5,000 bytes
Number of documents (see Plain-text annotations)	3–200,000
Document corpus size (all docs in plaintext combined)	5 KB–200 MB
Minimum number of annotations per entity	25

Real-time (synchronous) analysis

The following table describes quotas related to real-time analysis of plain text documents.

Description	Quota/Guideline
Maximum number of documents per synchronous request	1
Maximum text document size (UTF-8 encoded)	5 KB

Asynchronous analysis

The following table describes quotas related to asynchronous entity recognition of plain text documents.

Description	Quota/Guideline
Document size (UTF-8)	1 byte–1 MB
Maximum number of files, one document per file	1,000,000
Maximum total number of lines, one document per line (for all files in request)	1,000,000
Document corpus size (all docs in plaintext combined)	1 byte–5 GB

Custom entity recognition for semi-structured documents

Amazon Comprehend provides asynchronous and synchronous operations to analyze semi-structured documents with a custom entity recognizer. You must train the model using annotated PDF documents.

Training

The following table describes quotas related to training a custom entity recognizer (CreateEntityRecognizer) to analyze semi-structured documents.

Description	Quota/Guideline
Number of entities per model/custom entity recognizer	1–25
Maximum annotation file size (UTF-8 JSON)	5 MB
Number of documents	250–10,000
Document corpus size (all docs in plaintext combined)	5 KB–1 GB
Minimum number of annotations per entity	100
Maximum number of augmented manifest files for training a custom entity recognizer	5
Maximum number of attribute names for each augmented manifest file	5
Maximum length of attribute name	63 characters

Real-time (synchronous) analysis

This section describes quotas related to real-time analysis of semi-structured documents.

The following table shows the maximum file sizes for input documents. For all input document types, the input file maximum is one page, with no more than 10,000 characters.

File type	Maximum size (API)	Maximum size (console)
UTF-8 text documents	10 KB	10 KB
PDF documents	10 MB	5 MB
Word documents	10 MB	5 MB
Image files	10 MB	5 MB
Textract output files	1 MB	n/a

Asynchronous analysis

This section describes quotas for asynchronous analysis of semi-structured documents.

Description	Quota/Guideline
Image size (JPG or PNG)	1 byte–10 MB
Image size (TIFF)	1 byte–10 MB. Maximum one page.
Document size (PDF)	1 byte–50 MB
Document size (Docx)	1 byte–5 MB
Document size (UTF-8)	1 byte–1 MB
Maximum number of files, one document per file (one document per line not allowed for image files or PDF/Word documents)	500
Maximum number of pages for a PDF or Docx file	100
Document corpus size after text extraction (plaintext, all files combined)	1 byte–5 GB

For more information about limits for images, see Hard Limits in Amazon Textract.

Quotas for flywheels

Use flywheels to manage training and tracking of custom model versions for custom classification and custom entity recognition. For more information about flywheels, see Flywheels.

General quotas for flywheels

The following quotas apply to flywheels and flywheel iterations.

Description	Quota/Guideline
Maximum number of flywheels	50
Maximum number of flywheels in CREATING state	10
Maximum number of training datasets per flywheel	50
Maximum number of test datasets per flywheel	50
Maximum number of datasets with INGESTING status	10
Maximum number of in-progress flywheel iterations per account	10

Dataset quotas for custom classification models

When you ingest a dataset for a flywheel associated with a custom classification model, the following quotas apply.

Description	Quota/Guideline
Minimum number of training documents per class (multi-label mode)	50
Maximum number of training documents	1,000,000
Minimum dataset size	500 bytes
Maximum dataset size	5 GB
Maximum file size for one file, one document per file	10 MB

Dataset quotas for custom entity recognition models

When you ingest a dataset for a flywheel associated with a custom entity recognition model, the following quotas apply.

Description	Quota/Guideline
Maximum document size	5 KB
Minimum number of training documents	3
Maximum number of training documents	200,000
Minimum number of annotations per entity	25
Maximum dataset size	200 MB
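
A dataset can be checked against the quotas above before it is ingested. The following sketch takes the byte size of each training document and the number of annotations per entity type. The `check_entity_dataset` helper and its inputs are hypothetical, and the constants assume decimal units (1 KB = 1,000 bytes).

```python
KB = 1000       # assumption: decimal units
MB = 1000 ** 2

MAX_DOC_BYTES = 5 * KB
MIN_DOCS = 3
MAX_DOCS = 200_000
MIN_ANNOTATIONS_PER_ENTITY = 25
MAX_DATASET_BYTES = 200 * MB


def check_entity_dataset(doc_sizes, annotations_per_entity):
    """Return a list of quota problems for a custom entity recognition dataset."""
    problems = []
    if not MIN_DOCS <= len(doc_sizes) <= MAX_DOCS:
        problems.append(
            f"{len(doc_sizes)} documents is outside the range {MIN_DOCS}-{MAX_DOCS}"
        )
    if any(size > MAX_DOC_BYTES for size in doc_sizes):
        problems.append(f"at least one document exceeds {MAX_DOC_BYTES} bytes")
    if sum(doc_sizes) > MAX_DATASET_BYTES:
        problems.append(f"dataset exceeds {MAX_DATASET_BYTES} bytes")
    for entity, count in sorted(annotations_per_entity.items()):
        if count < MIN_ANNOTATIONS_PER_ENTITY:
            problems.append(
                f"entity {entity!r} has {count} annotations, "
                f"minimum is {MIN_ANNOTATIONS_PER_ENTITY}"
            )
    return problems
```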
