What is Amazon Transcribe?

Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application.

With Amazon Transcribe, you can improve accuracy for your specific use case with language customization, filter content to ensure customer privacy or audience-appropriate language, analyze content in multi-channel audio, partition the speech of individual speakers, and more.

To view a complete list of features, see Amazon Transcribe features.

You can transcribe streaming media in real time or you can upload and transcribe media files. To see which languages are supported for each type of transcription, refer to the Supported languages and language-specific features table.
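As a rough illustration of the file-based path, the following Python sketch builds the parameters for a batch transcription job and submits them with the AWS SDK for Python (Boto3). The job name, S3 URI, and Region are placeholders, and the two helper function names are hypothetical conveniences, not part of the service API.

```python
def build_batch_job_request(job_name, media_uri, language_code="en-US"):
    """Return keyword arguments for a StartTranscriptionJob call.

    job_name and media_uri are placeholders supplied by the caller;
    media_uri is expected to point at a media file in Amazon S3.
    """
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
    }


def start_batch_job(request, region_name="us-east-1"):
    """Submit the request. Requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the request can be built without the SDK

    client = boto3.client("transcribe", region_name=region_name)
    return client.start_transcription_job(**request)


# Placeholder values for illustration only.
request = build_batch_job_request(
    "example-job", "s3://amzn-s3-demo-bucket/example-audio.wav"
)
```

Calling `start_batch_job(request)` would submit the job; the transcript is produced asynchronously, so the job status has to be checked afterwards.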

Important

Amazon Transcribe is covered under AWS's HIPAA eligibility and BAA, which requires BAA customers to encrypt all PHI at rest and in transit when in use. Automatic PHI identification is available at no additional charge in all Regions where Amazon Transcribe operates. For more information, refer to HIPAA eligibility and BAA.

To learn more, see How Amazon Transcribe works and Getting started with Amazon Transcribe.

Tip

Information on the Amazon Transcribe API is located in the API Reference.

Amazon Transcribe use cases

Amazon Transcribe is a speech-to-text service with a broad set of features, many of which can be combined with one another and with other AWS services.

  • Gain insight into agent-customer calls using Call Analytics. This feature automatically analyzes 11 different criteria without any customization on your part. For each speaker, you get sentiment data, talk time, non-talk time, loudness, interruptions, and talk speed. Call summarization, call categorization, and turn-by-turn output are provided for the whole call.

    Call Analytics offers two options for call center audio: post-call analytics (designed for audio files located in an Amazon S3 bucket) and real-time analytics (designed for live audio streams).

  • Get a summary of customer-agent interactions with call summarization, which provides an at-a-glance summary of issues, action items, and outcomes for every call.

  • Teach Amazon Transcribe industry-specific terms, unique spelling, acronyms, and any words that are not being rendered correctly in your transcription results using custom vocabularies. Providing Amazon Transcribe with custom vocabularies can improve the accuracy of your transcription output. See also: Custom language models.

  • Create subtitles for your video files. You can also use content redaction (only in US English) and vocabulary filtering when generating subtitles to ensure your content is audience-appropriate. Note that filtered or redacted content shows as white space, ***, or [PII] in your transcript and subtitle files, but the audio itself is not altered.

  • Redact personally identifiable information (PII), such as social security numbers, from your transcripts using standard content redaction or Call Analytics sensitive data redaction. Call Analytics can also redact your audio file by replacing spoken PII with silence.

  • Partition individual speakers in an audio clip using speaker diarization. When you activate speaker diarization, Amazon Transcribe attaches a unique attribute to the text from each speaker in your transcription output.

  • Remove proprietary terms from your transcript using vocabulary filtering. For example, you can mask the name of a new product in a pre-launch stakeholder meeting. Vocabulary filtering can also be used to mask profane, offensive, or audience-inappropriate terms.

  • Produce a separate transcript for each channel of multi-channel audio, or have all channels transcribed in one output file. See Transcribing multi-channel audio.

  • If your audio is not in a language you speak, let Amazon Transcribe identify the language for you using language identification. You can then use Amazon Translate to translate your transcript, and have Amazon Polly read your transcript back to you.

  • Improve streaming transcription accuracy with partial result stabilization, which can also be used to adjust the latency of your transcript.
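Several of the batch features listed above are enabled through optional request parameters. The following Python sketch shows one way to layer them onto a basic StartTranscriptionJob request. The helper name `with_features` and all resource names (vocabulary, filter, bucket) are hypothetical; the vocabulary and vocabulary filter are assumed to already exist in your account, and the choice of the `mask` filter method is only an example.

```python
import copy


def with_features(request, max_speakers=None, vocabulary_name=None,
                  vocabulary_filter_name=None, redact_pii=False,
                  subtitle_formats=None):
    """Return a copy of a StartTranscriptionJob request with features enabled."""
    req = copy.deepcopy(request)  # leave the caller's dict untouched
    settings = {}

    if max_speakers is not None:
        # Speaker diarization: label the speech of each speaker.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if vocabulary_name:
        # Custom vocabulary (assumed to be created beforehand).
        settings["VocabularyName"] = vocabulary_name
    if vocabulary_filter_name:
        # Vocabulary filtering; "mask" replaces filtered words with ***.
        settings["VocabularyFilterName"] = vocabulary_filter_name
        settings["VocabularyFilterMethod"] = "mask"
    if settings:
        req["Settings"] = settings

    if redact_pii:
        # Content redaction of PII in the transcript text.
        req["ContentRedaction"] = {
            "RedactionType": "PII",
            "RedactionOutput": "redacted",
        }
    if subtitle_formats:
        # Subtitle files, for example ["vtt", "srt"].
        req["Subtitles"] = {"Formats": list(subtitle_formats)}
    return req


# Placeholder request and resource names for illustration only.
base = {
    "TranscriptionJobName": "example-meeting",
    "LanguageCode": "en-US",
    "Media": {"MediaFileUri": "s3://amzn-s3-demo-bucket/meeting.mp4"},
}
job = with_features(base, max_speakers=2,
                    vocabulary_filter_name="example-prelaunch-terms",
                    redact_pii=True, subtitle_formats=["vtt"])
```

The resulting dictionary can be passed as keyword arguments to the Boto3 `start_transcription_job` call. Not every combination of features is valid for every language, so check the feature documentation before combining them.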

Tip

For use case code examples, refer to the AWS Samples repository on GitHub.

Pricing

Amazon Transcribe is a pay-as-you-go service; pricing is based on seconds of transcribed audio, billed on a monthly basis. For more information on cost, including cost-breakdown examples for various AWS Regions, see Amazon Transcribe Pricing.
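Because billing is based on seconds of transcribed audio, a monthly estimate is simple arithmetic. The sketch below is a hypothetical helper, not an official calculator: the per-minute rate is a placeholder you would replace with the current rate for your Region from the pricing page, and the optional per-request minimum is an assumption you should confirm there as well.

```python
def estimate_monthly_cost(audio_seconds, rate_per_minute, min_billable_seconds=0):
    """Estimate a monthly charge from per-request audio durations.

    audio_seconds: iterable of durations, one per transcription request.
    rate_per_minute: placeholder rate; look up the real value on the pricing page.
    min_billable_seconds: assumed per-request minimum, 0 to disable.
    """
    billable = sum(max(seconds, min_billable_seconds) for seconds in audio_seconds)
    return billable / 60 * rate_per_minute


# Two requests of 60 s and 120 s at a made-up rate of 0.5 per minute.
cost = estimate_monthly_cost([60, 120], rate_per_minute=0.5)
```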