Quotas in Amazon Polly

The following quotas (limits) apply when you use Amazon Polly.

Supported Regions

For a list of AWS Regions where Amazon Polly is available, see Amazon Polly Endpoints and Quotas in the Amazon Web Services General Reference.

Neural voices aren't available in all AWS Regions. For the Regions that support neural voices, see Feature and Region Compatibility for neural TTS.

Throttling

  • Throttle rate per account: 100 transactions (requests or operations) per second (tps) with a burst limit of 120 tps.

    Concurrent connections per account: 90

  • Throttle rate per operation:

    Operation                      Limit
    -----------------------------  ----------------------------------------------------
    Lexicon
      DeleteLexicon, PutLexicon,   2 tps for these operations combined,
      GetLexicon, ListLexicons     with a maximum allowed burst of 4 tps
    Speech
      DescribeVoices               80 tps with a burst limit of 100 tps
      SynthesizeSpeech             Standard voice: 80 tps with a burst limit of 100 tps
                                   Neural voice: 8 tps with a burst limit of 10 tps
      StartSpeechSynthesisTask     Standard voice: 10 tps with a burst limit of 12 tps
                                   Neural voice: 1 tps
      GetSpeechSynthesisTask and   10 tps for these operations combined
      ListSpeechSynthesisTasks
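One way to stay under a rate and burst limit is to pace requests on the client side. The following is a minimal sketch of a token bucket, not a description of how Amazon Polly enforces throttling internally. The class name TokenBucket, its methods, and the injectable clock parameter are hypothetical names chosen for illustration. Only the numbers (8 tps, burst of 10) come from the limits listed above.

```python
import time


class TokenBucket:
    """Client-side limiter: `rate` tokens refill per second, capped at `burst`.

    Illustrative only. The service-side throttling algorithm is an assumption.
    """

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(burst)    # start full so an initial burst is allowed
        self.last = clock()

    def try_acquire(self):
        """Return True and consume one token if a request may be sent now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above the burst cap.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Values for SynthesizeSpeech with a neural voice: 8 tps, burst limit of 10 tps.
neural_synthesize_limiter = TokenBucket(rate=8, burst=10)
```

A caller would check neural_synthesize_limiter.try_acquire() before each request and wait briefly when it returns False. Requests can still be throttled by the service, for example when several clients share one account, so retry handling remains necessary.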

Pronunciation Lexicons

  • You can store up to 100 lexicons per account.

  • A lexicon name must be an alphanumeric string of up to 20 characters.

  • Each lexicon can be up to 4,000 characters in size. (Note that the size of the lexicon affects the latency of the SynthesizeSpeech operation.)

  • You can specify up to 100 characters for each <phoneme> or <alias> replacement in a lexicon.

For information about using lexicons, see Managing Lexicons.
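The lexicon limits above can be checked locally before a lexicon is uploaded. The following is a rough sketch under several assumptions: "alphanumeric" is taken to mean ASCII letters and digits, lexicon size is measured as the character count of the PLS document, and the lexicon is well-formed XML. The function name lexicon_problems is hypothetical and is not part of any AWS SDK.

```python
import re
import xml.etree.ElementTree as ET

MAX_NAME_CHARS = 20          # lexicon name length limit
MAX_LEXICON_CHARS = 4000     # lexicon size limit
MAX_REPLACEMENT_CHARS = 100  # limit for each <phoneme> or <alias> replacement


def lexicon_problems(name, pls_xml):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    # Assumption: "alphanumeric" means ASCII letters and digits only.
    if not re.fullmatch(r"[0-9A-Za-z]{1,%d}" % MAX_NAME_CHARS, name):
        problems.append("name must be 1-20 alphanumeric characters")
    # Assumption: size is counted in characters of the whole document.
    if len(pls_xml) > MAX_LEXICON_CHARS:
        problems.append("lexicon exceeds 4,000 characters")
    for element in ET.fromstring(pls_xml).iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        if local_name in ("phoneme", "alias"):
            if len(element.text or "") > MAX_REPLACEMENT_CHARS:
                problems.append("<%s> replacement exceeds 100 characters" % local_name)
    return problems
```

The per-account limit of 100 lexicons is not checked here, because it depends on what is already stored in the account.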

SynthesizeSpeech API Operation

Note the following limits related to using the SynthesizeSpeech API operation:

  • The size of the input text can be up to 3,000 billed characters (6,000 total characters). SSML tags are not counted as billed characters.

  • You can specify up to five lexicons to apply to the input text.

  • The output audio stream (synthesis) is limited to 10 minutes. After this limit is reached, any remaining speech is cut off.

For more information, see SynthesizeSpeech.
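When plain text is longer than the input limit, one option is to split it into pieces and synthesize each piece separately. The following is a minimal sketch, assuming plain text without SSML and assuming every character counts as a billed character. The function name split_for_synthesize_speech is hypothetical. The sketch splits on whitespace, so runs of spaces and line breaks are collapsed to single spaces.

```python
def split_for_synthesize_speech(text, limit=3000):
    """Split plain text into chunks of at most `limit` characters.

    Chunks break at whitespace where possible. A single word longer than
    `limit` is cut into fixed-size pieces.
    """
    chunks = []
    current = ""
    for word in text.split():
        # Hard-split any word that cannot fit in one chunk on its own.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

This addresses only the input size limit. It does not estimate audio duration, so the 10-minute output limit still has to be considered separately.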

Note

Some limitations of the SynthesizeSpeech API operation can be bypassed by using the StartSpeechSynthesisTask API operation. For more information, see Creating Long Audio Files.

SpeechSynthesisTask API Operations

Note the following limits related to using the StartSpeechSynthesisTask, GetSpeechSynthesisTask, and ListSpeechSynthesisTasks API operations:

  • The size of the input text can be up to 100,000 billed characters (200,000 total characters). SSML tags are not counted as billed characters.

  • You can specify up to five lexicons to apply to the input text.
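The two input size limits suggest a simple rule for picking an operation based on text length. The sketch below is an illustration only, not an official recommendation. The function name choose_operation and its parameters are hypothetical, and the caller is assumed to already know the billed and total character counts.

```python
# (operation, maximum billed characters, maximum total characters)
LIMITS = [
    ("SynthesizeSpeech", 3000, 6000),
    ("StartSpeechSynthesisTask", 100000, 200000),
]
MAX_LEXICONS_PER_REQUEST = 5


def choose_operation(billed_chars, total_chars=None, lexicon_names=()):
    """Return the first operation whose input limits fit, or None if none do."""
    if len(lexicon_names) > MAX_LEXICONS_PER_REQUEST:
        raise ValueError("at most five lexicons can be applied to the input text")
    # Without SSML tags, total characters equal billed characters.
    total = billed_chars if total_chars is None else total_chars
    for operation, max_billed, max_total in LIMITS:
        if billed_chars <= max_billed and total <= max_total:
            return operation
    return None  # too long for a single request; the text must be split
```

For example, choose_operation(2500) returns "SynthesizeSpeech", while choose_operation(50000) returns "StartSpeechSynthesisTask".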

Speech Synthesis Markup Language (SSML)

Note the following limits related to using SSML:

  • The <audio>, <lexicon>, <lookup>, and <voice> tags are not supported.

  • <break> elements can specify a maximum duration of 10 seconds each.

  • The <prosody> tag doesn't support values for the rate attribute lower than -80%.

For more information, see Generating Speech from SSML Documents.
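Two of the SSML limits above are easy to check before sending a request. The following is a minimal sketch, assuming the SSML is well-formed XML with no undeclared namespace prefixes and that break durations are written as a number followed by "s" or "ms". The function name ssml_problems is hypothetical. The <prosody> rate limit is deliberately not checked here.

```python
import re
import xml.etree.ElementTree as ET

UNSUPPORTED_TAGS = {"audio", "lexicon", "lookup", "voice"}
MAX_BREAK_SECONDS = 10.0
# Assumption: durations look like "3s", "1.5s", or "500ms".
_TIME_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|s)")


def ssml_problems(ssml):
    """Return a list of limit violations found in an SSML document."""
    problems = []
    for element in ET.fromstring(ssml).iter():
        tag = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        if tag in UNSUPPORTED_TAGS:
            problems.append("<%s> is not supported" % tag)
        elif tag == "break" and "time" in element.attrib:
            value = element.attrib["time"].strip()
            match = _TIME_RE.fullmatch(value)
            if match is None:
                problems.append("unrecognized break time: %r" % value)
                continue
            seconds = float(match.group(1))
            if match.group(2) == "ms":
                seconds /= 1000.0
            if seconds > MAX_BREAK_SECONDS:
                problems.append("<break> of %s exceeds 10 seconds" % value)
    return problems
```

An empty list means only that these two checks passed. It does not guarantee that the service will accept the document.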