How Amazon Transcribe works

Amazon Transcribe uses machine learning models to convert speech to text.

In addition to the transcribed text, transcripts contain data about the transcribed content, including confidence scores and timestamps for each word or punctuation mark. To see an output example, refer to the Data input and output section. For a complete list of features that you can apply to your transcription, refer to the feature summary.
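As a rough illustration of the per-item data described above, the following Python sketch reads confidence scores and timestamps from a transcript. The sample JSON is hand-written for this example and only approximates the shape of a batch transcript (a `results.items` list with `alternatives`, `start_time`, and `end_time`); it is not real service output, and the helper name `summarize_items` is hypothetical. Refer to the Data input and output section for the actual format.

```python
import json

# Hand-written sample that approximates a batch transcript's structure.
# This is an assumption made for illustration, not real service output.
SAMPLE_TRANSCRIPT = """
{
  "jobName": "example-job",
  "results": {
    "transcripts": [{"transcript": "Hello world."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.04", "end_time": "0.45",
       "alternatives": [{"confidence": "0.98", "content": "Hello"}]},
      {"type": "pronunciation", "start_time": "0.46", "end_time": "0.90",
       "alternatives": [{"confidence": "0.87", "content": "world"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  }
}
"""


def summarize_items(transcript_json):
    """Return one row per item with its content, confidence, and timestamps."""
    doc = json.loads(transcript_json)
    rows = []
    for item in doc["results"]["items"]:
        best = item["alternatives"][0]  # first alternative for this item
        rows.append({
            "content": best["content"],
            "type": item["type"],
            "confidence": float(best["confidence"]),
            # Timestamps are treated as optional because an item may omit them.
            "start_time": float(item["start_time"]) if "start_time" in item else None,
            "end_time": float(item["end_time"]) if "end_time" in item else None,
        })
    return rows


if __name__ == "__main__":
    for row in summarize_items(SAMPLE_TRANSCRIPT):
        print(row)
```

Timestamps are read defensively so that an item without them yields `None` instead of raising an error.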

Transcription methods can be separated into two main categories:

Batch transcriptions: Transcribe media files that have been uploaded into an Amazon S3 bucket. You can use the AWS CLI, AWS Management Console, and various AWS SDKs for batch transcriptions.
Streaming transcriptions: Transcribe media streams in real time. You can use the AWS Management Console, HTTP/2, WebSockets, and various AWS SDKs for streaming transcriptions.

Note that feature and language support differs for batch and streaming transcriptions. For more information, refer to Amazon Transcribe features and Supported languages.

API operations to get you started

Batch: StartTranscriptionJob

Streaming: StartStreamTranscription, StartStreamTranscriptionWebSocket
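For the batch operation listed above, a minimal Python sketch of a StartTranscriptionJob request might look like the following. It assumes the AWS SDK for Python (boto3) is installed and that credentials are configured. The helper names `build_batch_job_request` and `start_batch_job`, the job name, the bucket names, the default language code, and the Region are all placeholders chosen for this example.

```python
def build_batch_job_request(job_name, media_uri, language_code="en-US",
                            output_bucket=None):
    """Build the keyword arguments for a StartTranscriptionJob call.

    All values passed in here are illustrative placeholders.
    """
    if not media_uri.startswith("s3://"):
        # Batch transcription reads media that was uploaded to Amazon S3.
        raise ValueError("media_uri must be an S3 URI, such as s3://bucket/key")
    params = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
    }
    if output_bucket:
        params["OutputBucketName"] = output_bucket
    return params


def start_batch_job(params, region_name="us-west-2"):
    """Send the request. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("transcribe", region_name=region_name)
    return client.start_transcription_job(**params)


if __name__ == "__main__":
    request = build_batch_job_request(
        job_name="example-job",                      # placeholder
        media_uri="s3://example-bucket/audio.wav",   # placeholder
        output_bucket="example-output-bucket",       # placeholder
    )
    print(request)
    # start_batch_job(request)  # uncomment to submit the job to the service
```

Keeping the request construction separate from the network call lets you inspect or validate the parameters before anything is sent.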
