StartMedicalStreamTranscription

Starts a bidirectional HTTP/2 or WebSocket stream where audio is streamed to Amazon Transcribe Medical and the transcription results are streamed to your application.

For more information on streaming with Amazon Transcribe Medical, see Transcribing streaming audio.

Request Syntax

POST /medical-stream-transcription HTTP/2
x-amzn-transcribe-language-code: LanguageCode
x-amzn-transcribe-sample-rate: MediaSampleRateHertz
x-amzn-transcribe-media-encoding: MediaEncoding
x-amzn-transcribe-vocabulary-name: VocabularyName
x-amzn-transcribe-specialty: Specialty
x-amzn-transcribe-type: Type
x-amzn-transcribe-show-speaker-label: ShowSpeakerLabel
x-amzn-transcribe-session-id: SessionId
x-amzn-transcribe-enable-channel-identification: EnableChannelIdentification
x-amzn-transcribe-number-of-channels: NumberOfChannels
x-amzn-transcribe-content-identification-type: ContentIdentificationType
Content-type: application/json

{
   "AudioStream": {
      "AudioEvent": {
         "AudioChunk": blob
      }
   }
}

URI Request Parameters

The request uses the following URI parameters.

ContentIdentificationType

Labels all personal health information (PHI) identified in your transcript.

Content identification is performed at the segment level; PHI is flagged upon complete transcription of an audio segment.

For more information, see Identifying personal health information (PHI) in a transcription.

Valid Values: PHI

EnableChannelIdentification

Enables channel identification in multi-channel audio.

Channel identification transcribes the audio on each channel independently, then appends the output for each channel into one transcript.

If you have multi-channel audio and do not enable channel identification, your audio is transcribed in a continuous manner and your transcript is not separated by channel.

You can't set ShowSpeakerLabel and EnableChannelIdentification in the same request. If you set both, your request returns a BadRequestException.

For more information, see Transcribing multi-channel audio.
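The restriction above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name check_speaker_and_channel_options is hypothetical, and raising ValueError locally is an assumption that stands in for the BadRequestException the service would return.

```python
def check_speaker_and_channel_options(
    show_speaker_label: bool,
    enable_channel_identification: bool,
) -> None:
    """Reject the one combination the reference says is not allowed.

    Hypothetical client-side helper: the service itself reports this
    condition as a BadRequestException.
    """
    if show_speaker_label and enable_channel_identification:
        raise ValueError(
            "ShowSpeakerLabel and EnableChannelIdentification "
            "cannot be set in the same request"
        )


# Either option alone passes the check without raising.
check_speaker_and_channel_options(True, False)
check_speaker_and_channel_options(False, True)
```

Checking locally avoids opening a stream that is going to be rejected.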

LanguageCode

Specify the language code that represents the language spoken in your audio.

Important

Amazon Transcribe Medical only supports US English (en-US).

Valid Values: en-US | en-GB | es-US | fr-CA | fr-FR | en-AU | it-IT | de-DE | pt-BR | ja-JP | ko-KR | zh-CN

Required: Yes

MediaEncoding

Specify the encoding used for the input audio. Supported formats are:

  • FLAC

  • OPUS-encoded audio in an Ogg container

  • PCM (only signed 16-bit little-endian audio formats, which does not include WAV)

For more information, see Media formats.

Valid Values: pcm | ogg-opus | flac

Required: Yes
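To illustrate the PCM requirement, the following sketch converts floating-point samples to raw signed 16-bit little-endian bytes using only the Python standard library. The function name floats_to_pcm16le and the assumption that samples are floats in the range -1.0 to 1.0 are illustrative and do not come from the API.

```python
import struct
from typing import Iterable


def floats_to_pcm16le(samples: Iterable[float]) -> bytes:
    """Encode samples in [-1.0, 1.0] as signed 16-bit little-endian PCM.

    Hypothetical helper: the output is raw sample data with no container
    header, which is what distinguishes it from a WAV file.
    """
    out = bytearray()
    for sample in samples:
        # Clip out-of-range values so they cannot overflow 16 bits.
        clipped = max(-1.0, min(1.0, sample))
        # "<h" means little-endian, signed 16-bit integer.
        out += struct.pack("<h", int(round(clipped * 32767)))
    return bytes(out)


pcm = floats_to_pcm16le([0.0, 1.0, -1.0])
print(pcm.hex())  # prints "0000ff7f0180"
```

Each sample occupies two bytes, so the length of the output is twice the number of input samples.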

MediaSampleRateHertz

The sample rate of the input audio (in hertz). Amazon Transcribe Medical supports a range from 16,000 Hz to 48,000 Hz. Note that the sample rate you specify must match that of your audio.

Valid Range: Minimum value of 8000. Maximum value of 48000.

Required: Yes

NumberOfChannels

Specify the number of channels in your audio stream. Up to two channels are supported.

Valid Range: Minimum value of 2.

SessionId

Specify a name for your transcription session. If you don't include this parameter in your request, Amazon Transcribe Medical generates an ID and returns it in the response.

You can use a session ID to retry a streaming session.

Length Constraints: Fixed length of 36.

Pattern: [a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}
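The length constraint and pattern above match the textual form of a UUID. As a sketch, a client could generate and validate session IDs as follows. Using uuid4 is an assumption (the reference only specifies the pattern), and the names new_session_id and is_valid_session_id are hypothetical.

```python
import re
import uuid

# Pattern copied from the SessionId description above.
SESSION_ID_PATTERN = re.compile(
    r"[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}"
)


def new_session_id() -> str:
    """Return a random UUID string (36 characters, hex digits and hyphens)."""
    return str(uuid.uuid4())


def is_valid_session_id(value: str) -> bool:
    """Check the fixed length of 36 and the documented pattern."""
    return len(value) == 36 and SESSION_ID_PATTERN.fullmatch(value) is not None


session_id = new_session_id()
print(is_valid_session_id(session_id))      # prints "True"
print(is_valid_session_id("my-session-1"))  # prints "False"
```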

ShowSpeakerLabel

Enables speaker identification (diarization) in your transcription output. Speaker identification labels the speech from individual speakers in your audio stream.

For more information, see Identifying speakers (diarization).

Specialty

Specify the medical specialty contained in your audio.

Valid Values: PRIMARYCARE | CARDIOLOGY | NEUROLOGY | ONCOLOGY | RADIOLOGY | UROLOGY

Required: Yes

Type

Specify the type of input audio. For example, choose DICTATION for a provider dictating patient notes and CONVERSATION for a dialogue between a patient and a medical professional.

Valid Values: CONVERSATION | DICTATION

Required: Yes

VocabularyName

Specify the name of the custom vocabulary that you want to use when processing your transcription. Note that vocabulary names are case sensitive.

Length Constraints: Minimum length of 1. Maximum length of 200.

Pattern: ^[0-9a-zA-Z._-]+
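The parameters above are sent as x-amzn-transcribe-* headers, as shown in the request syntax. The following sketch assembles the required headers and two optional ones, validating values against the documented lists. The function build_request_headers is hypothetical and is not part of any AWS SDK. The optional Boolean and channel-count headers are omitted because this reference does not state how their values are serialized. Request signing is also out of scope.

```python
from typing import Dict, Optional

# Value lists copied from the parameter descriptions above.
MEDIA_ENCODINGS = {"pcm", "ogg-opus", "flac"}
SPECIALTIES = {"PRIMARYCARE", "CARDIOLOGY", "NEUROLOGY",
               "ONCOLOGY", "RADIOLOGY", "UROLOGY"}
AUDIO_TYPES = {"CONVERSATION", "DICTATION"}


def build_request_headers(
    language_code: str,
    sample_rate_hz: int,
    media_encoding: str,
    specialty: str,
    audio_type: str,
    vocabulary_name: Optional[str] = None,
    session_id: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper that maps parameters to request headers."""
    if media_encoding not in MEDIA_ENCODINGS:
        raise ValueError(f"unsupported MediaEncoding: {media_encoding}")
    if specialty not in SPECIALTIES:
        raise ValueError(f"unsupported Specialty: {specialty}")
    if audio_type not in AUDIO_TYPES:
        raise ValueError(f"unsupported Type: {audio_type}")
    # Checks the documented Valid Range only; the prose above describes
    # the supported range as 16,000 Hz to 48,000 Hz.
    if not 8000 <= sample_rate_hz <= 48000:
        raise ValueError(f"sample rate out of range: {sample_rate_hz}")

    headers = {
        "x-amzn-transcribe-language-code": language_code,
        "x-amzn-transcribe-sample-rate": str(sample_rate_hz),
        "x-amzn-transcribe-media-encoding": media_encoding,
        "x-amzn-transcribe-specialty": specialty,
        "x-amzn-transcribe-type": audio_type,
    }
    # Optional headers are added only when a value was supplied.
    if vocabulary_name is not None:
        headers["x-amzn-transcribe-vocabulary-name"] = vocabulary_name
    if session_id is not None:
        headers["x-amzn-transcribe-session-id"] = session_id
    return headers


headers = build_request_headers("en-US", 16000, "pcm", "PRIMARYCARE", "DICTATION")
for name, value in headers.items():
    print(f"{name}: {value}")
```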

Request Body

The request accepts the following data in JSON format.

AudioStream

An encoded stream of audio blobs. Audio streams are encoded as either HTTP/2 or WebSocket data frames.

For more information, see Transcribing streaming audio.

Type: AudioStream object

Required: Yes

Response Syntax

HTTP/2 200
x-amzn-request-id: RequestId
x-amzn-transcribe-language-code: LanguageCode
x-amzn-transcribe-sample-rate: MediaSampleRateHertz
x-amzn-transcribe-media-encoding: MediaEncoding
x-amzn-transcribe-vocabulary-name: VocabularyName
x-amzn-transcribe-specialty: Specialty
x-amzn-transcribe-type: Type
x-amzn-transcribe-show-speaker-label: ShowSpeakerLabel
x-amzn-transcribe-session-id: SessionId
x-amzn-transcribe-enable-channel-identification: EnableChannelIdentification
x-amzn-transcribe-number-of-channels: NumberOfChannels
x-amzn-transcribe-content-identification-type: ContentIdentificationType
Content-type: application/json

{
   "TranscriptResultStream": {
      "BadRequestException": { },
      "ConflictException": { },
      "InternalFailureException": { },
      "LimitExceededException": { },
      "ServiceUnavailableException": { },
      "TranscriptEvent": {
         "Transcript": {
            "Results": [
               {
                  "Alternatives": [
                     {
                        "Entities": [
                           {
                              "Category": "string",
                              "Confidence": number,
                              "Content": "string",
                              "EndTime": number,
                              "StartTime": number
                           }
                        ],
                        "Items": [
                           {
                              "Confidence": number,
                              "Content": "string",
                              "EndTime": number,
                              "Speaker": "string",
                              "StartTime": number,
                              "Type": "string"
                           }
                        ],
                        "Transcript": "string"
                     }
                  ],
                  "ChannelId": "string",
                  "EndTime": number,
                  "IsPartial": boolean,
                  "ResultId": "string",
                  "StartTime": number
               }
            ]
         }
      }
   }
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The response returns the following HTTP headers.

ContentIdentificationType

Shows whether content identification was enabled for your transcription.

Valid Values: PHI

EnableChannelIdentification

Shows whether channel identification was enabled for your transcription.

LanguageCode

Provides the language code that you specified in your request. This must be en-US.

Valid Values: en-US | en-GB | es-US | fr-CA | fr-FR | en-AU | it-IT | de-DE | pt-BR | ja-JP | ko-KR | zh-CN

MediaEncoding

Provides the media encoding you specified in your request.

Valid Values: pcm | ogg-opus | flac

MediaSampleRateHertz

Provides the sample rate that you specified in your request.

Valid Range: Minimum value of 8000. Maximum value of 48000.

NumberOfChannels

Provides the number of channels that you specified in your request.

Valid Range: Minimum value of 2.

RequestId

Provides the identifier for your streaming request.

SessionId

Provides the identifier for your transcription session.

Length Constraints: Fixed length of 36.

Pattern: [a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}

ShowSpeakerLabel

Shows whether speaker identification was enabled for your transcription.

Specialty

Provides the medical specialty that you specified in your request.

Valid Values: PRIMARYCARE | CARDIOLOGY | NEUROLOGY | ONCOLOGY | RADIOLOGY | UROLOGY

Type

Provides the type of audio you specified in your request.

Valid Values: CONVERSATION | DICTATION

VocabularyName

Provides the name of the custom vocabulary that you specified in your request.

Length Constraints: Minimum length of 1. Maximum length of 200.

Pattern: ^[0-9a-zA-Z._-]+

The following data is returned in JSON format by the service.

TranscriptResultStream

Provides detailed information about your streaming session.

Type: MedicalTranscriptResultStream object
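As an illustration of the response structure, the following sketch extracts the text of completed results from a single TranscriptEvent. It assumes the event has already been decoded from the stream into a Python dictionary shaped like the response syntax. The function name final_transcripts, the choice of the first alternative, and the sample data are all made up for this example.

```python
from typing import Any, Dict, List


def final_transcripts(event: Dict[str, Any]) -> List[str]:
    """Return the transcript text of every result that is not partial.

    Hypothetical helper: takes the first entry of Alternatives for each
    result, which is an assumption rather than documented behavior.
    """
    results = (
        event.get("TranscriptEvent", {})
        .get("Transcript", {})
        .get("Results", [])
    )
    texts = []
    for result in results:
        if result.get("IsPartial"):
            continue  # skip results that may still change
        alternatives = result.get("Alternatives", [])
        if alternatives:
            texts.append(alternatives[0].get("Transcript", ""))
    return texts


# Invented sample event, for illustration only.
sample_event = {
    "TranscriptEvent": {
        "Transcript": {
            "Results": [
                {"IsPartial": True,
                 "Alternatives": [{"Transcript": "The patient"}]},
                {"IsPartial": False,
                 "Alternatives": [{"Transcript": "The patient is stable."}]},
            ]
        }
    }
}
print(final_transcripts(sample_event))  # prints "['The patient is stable.']"
```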

Errors

For information about the errors that are common to all actions, see Common Errors.

BadRequestException

One or more arguments to the StartStreamTranscription or StartMedicalStreamTranscription operation were not valid. For example, MediaEncoding or LanguageCode used values that are not valid. Check the specified parameters and try your request again.

HTTP Status Code: 400

ConflictException

A new stream started with the same session ID. The current stream has been terminated.

HTTP Status Code: 409

InternalFailureException

A problem occurred while processing the audio. Amazon Transcribe terminated processing.

HTTP Status Code: 500

LimitExceededException

Your client has exceeded one of the Amazon Transcribe limits. This is typically the audio length limit. Break your audio stream into smaller chunks and try your request again.

HTTP Status Code: 429

ServiceUnavailableException

The service is currently unavailable. Try your request later.

HTTP Status Code: 503
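The errors above can be summarized as a lookup table for client-side handling. In the following sketch, the status codes are taken from the descriptions above. The retry classification is an assumption based on a reading of those descriptions and is not stated by the API, and the names ERROR_STATUS_CODES and can_retry_unchanged are hypothetical.

```python
from typing import Optional

# Status codes as listed in the Errors section above.
ERROR_STATUS_CODES = {
    "BadRequestException": 400,
    "ConflictException": 409,
    "InternalFailureException": 500,
    "LimitExceededException": 429,
    "ServiceUnavailableException": 503,
}

# Assumption: only this error is described as "try your request later"
# without asking the caller to change anything first.
RETRY_UNCHANGED = {"ServiceUnavailableException"}


def status_code_for(error_name: str) -> Optional[int]:
    """Return the documented HTTP status code, or None if unknown."""
    return ERROR_STATUS_CODES.get(error_name)


def can_retry_unchanged(error_name: str) -> bool:
    """Hypothetical policy: retry the same request only when the
    description does not ask for a change to the request or audio."""
    return error_name in RETRY_UNCHANGED


print(status_code_for("LimitExceededException"))           # prints "429"
print(can_retry_unchanged("ServiceUnavailableException"))  # prints "True"
print(can_retry_unchanged("BadRequestException"))          # prints "False"
```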

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: