
StartSpeechSynthesisStream

Synthesizes UTF-8 input, either plain text or SSML, over a bidirectional streaming connection. Specify synthesis parameters in HTTP/2 headers, send text incrementally as events on the input stream, and receive synthesized audio as it becomes available.

This operation serves as a bidirectional counterpart to SynthesizeSpeech.

Request Syntax


POST /v1/synthesisStream HTTP/1.1
x-amzn-Engine: Engine
x-amzn-LanguageCode: LanguageCode
x-amzn-LexiconNames: LexiconNames
x-amzn-OutputFormat: OutputFormat
x-amzn-SampleRate: SampleRate
x-amzn-VoiceId: VoiceId
Content-type: application/json

{
   "CloseStreamEvent": { 
   },
   "TextEvent": { 
      "FlushStreamConfiguration": { 
         "Force": boolean
      },
      "Text": "string",
      "TextType": "string"
   }
}

URI Request Parameters

The request uses the following URI parameters.

Engine

Specifies the engine for Amazon Polly to use when processing input text for speech synthesis. Currently, only the generative engine is supported. If you specify a voice that the selected engine doesn't support, Amazon Polly returns an error.

Valid Values: standard | neural | long-form | generative

Required: Yes

LanguageCode

An optional parameter that sets the language code for the speech synthesis request. Specify this parameter only when using a bilingual voice. If a bilingual voice is used and no language code is specified, Amazon Polly uses the default language of the bilingual voice.

LexiconNames

The names of one or more pronunciation lexicons for the service to apply during synthesis. Amazon Polly applies lexicons only when the lexicon language matches the voice language.

Array Members: Maximum number of 5 items.

Pattern: [0-9A-Za-z]{1,20}

OutputFormat

The audio format for the synthesized speech. Currently, Amazon Polly does not support JSON speech marks.

Valid Values: json | mp3 | ogg_opus | ogg_vorbis | pcm

Required: Yes

SampleRate

The audio frequency, specified in Hz.

VoiceId

The voice to use in synthesis. To get a list of available voice IDs, use the DescribeVoices operation.

Required: Yes
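
As an illustration of the constraints listed above, the following minimal sketch validates the parameters on the client side and assembles them into the `x-amzn-*` headers shown in the request syntax. The helper `build_synthesis_headers` is hypothetical and not part of any AWS SDK. Joining multiple lexicon names with commas is an assumption about how the list is serialized into a single header, and the voice ID and sample rate in the usage lines are illustrative values only.

```python
import re

# Constraints copied from the parameter descriptions above.
VALID_ENGINES = {"standard", "neural", "long-form", "generative"}
VALID_OUTPUT_FORMATS = {"json", "mp3", "ogg_opus", "ogg_vorbis", "pcm"}
LEXICON_NAME_PATTERN = re.compile(r"[0-9A-Za-z]{1,20}")
MAX_LEXICONS = 5


def build_synthesis_headers(engine, output_format, voice_id,
                            language_code=None, lexicon_names=None,
                            sample_rate=None):
    """Hypothetical helper: validate parameters and return a header dict."""
    # Engine, OutputFormat, and VoiceId are marked "Required: Yes".
    if engine not in VALID_ENGINES:
        raise ValueError(f"unsupported Engine: {engine!r}")
    if output_format not in VALID_OUTPUT_FORMATS:
        raise ValueError(f"unsupported OutputFormat: {output_format!r}")
    if not voice_id:
        raise ValueError("VoiceId is required")

    headers = {
        "x-amzn-Engine": engine,
        "x-amzn-OutputFormat": output_format,
        "x-amzn-VoiceId": voice_id,
    }
    if language_code is not None:
        headers["x-amzn-LanguageCode"] = language_code
    if lexicon_names:
        if len(lexicon_names) > MAX_LEXICONS:
            raise ValueError("at most 5 lexicon names are allowed")
        for name in lexicon_names:
            if not LEXICON_NAME_PATTERN.fullmatch(name):
                raise ValueError(f"invalid lexicon name: {name!r}")
        # Assumption: the list is sent as a comma-separated header value.
        headers["x-amzn-LexiconNames"] = ",".join(lexicon_names)
    if sample_rate is not None:
        headers["x-amzn-SampleRate"] = str(sample_rate)
    return headers


# Usage with illustrative values.
headers = build_synthesis_headers("generative", "mp3", "Joanna",
                                  lexicon_names=["lexA", "lexB"],
                                  sample_rate=24000)
```

Checking the pattern and item limit locally only avoids an avoidable round trip; the service remains the authority on which engine, voice, and format combinations are accepted.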

Request Body

The request accepts the following data in JSON format.

CloseStreamEvent

An event indicating the end of the input stream.

Type: CloseStreamEvent object

Required: No

TextEvent

A text event containing content to be synthesized.

Type: TextEvent object

Required: No
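
The two input events can be sketched as plain dictionaries that mirror the JSON shape in the request syntax. The helpers below (`make_text_event`, `make_close_stream_event`, `events_for_chunks`) are hypothetical names introduced for illustration. The sketch covers only the logical payloads; how an SDK or HTTP/2 client frames them on the wire is not shown here and is outside its scope.

```python
import json


def make_text_event(text, text_type=None, force_flush=None):
    """Build a TextEvent payload; optional fields are omitted when unset."""
    text_event = {"Text": text}
    if text_type is not None:
        text_event["TextType"] = text_type
    if force_flush is not None:
        text_event["FlushStreamConfiguration"] = {"Force": bool(force_flush)}
    return {"TextEvent": text_event}


def make_close_stream_event():
    """Build the CloseStreamEvent that marks the end of the input stream."""
    return {"CloseStreamEvent": {}}


def events_for_chunks(chunks, text_type=None):
    """Yield one TextEvent per text chunk, then a final CloseStreamEvent."""
    for chunk in chunks:
        yield make_text_event(chunk, text_type)
    yield make_close_stream_event()


# Usage: text sent incrementally, followed by the closing event.
for event in events_for_chunks(["Hello, ", "world."]):
    print(json.dumps(event))
```

A generator fits the incremental nature of the input stream: text can be produced and sent as it becomes available, with the closing event emitted once the source is exhausted.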

Response Syntax


HTTP/1.1 200
Content-type: application/json

{
   "AudioEvent": { 
      "AudioChunk": blob
   },
   "ServiceFailureException": { 
   },
   "ServiceQuotaExceededException": { 
   },
   "StreamClosedEvent": { 
      "RequestCharacters": number
   },
   "ThrottlingException": { 
   },
   "ValidationException": { 
   }
}
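
On the receiving side, a client typically dispatches on the event name. The following minimal sketch assumes the output stream has already been decoded into an iterable of dictionaries shaped like the response syntax above, with `AudioChunk` available as `bytes`; that decoding step, the `collect_audio` function, and the `SynthesisStreamError` class are hypothetical and not part of any AWS SDK.

```python
class SynthesisStreamError(Exception):
    """Hypothetical wrapper for exception events received on the stream."""


# Exception event names taken from the response syntax.
ERROR_EVENTS = (
    "ServiceFailureException",
    "ServiceQuotaExceededException",
    "ThrottlingException",
    "ValidationException",
)


def collect_audio(events):
    """Concatenate audio chunks until the stream closes.

    Returns (audio_bytes, request_characters). request_characters is None
    if no StreamClosedEvent was seen before the iterable ended.
    """
    audio = bytearray()
    request_characters = None
    for event in events:
        for name in ERROR_EVENTS:
            if name in event:
                raise SynthesisStreamError(name)
        if "AudioEvent" in event:
            audio.extend(event["AudioEvent"]["AudioChunk"])
        elif "StreamClosedEvent" in event:
            closed = event["StreamClosedEvent"]
            request_characters = closed.get("RequestCharacters")
            break
    return bytes(audio), request_characters


# Usage with hand-written events standing in for a decoded stream.
sample_events = [
    {"AudioEvent": {"AudioChunk": b"\x01\x02"}},
    {"AudioEvent": {"AudioChunk": b"\x03"}},
    {"StreamClosedEvent": {"RequestCharacters": 13}},
]
audio, characters = collect_audio(sample_events)
```

Buffering the whole response is the simplest choice for a sketch; a latency-sensitive client would more likely hand each chunk to a player or file as it arrives.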