Using Amazon Chime SDK live transcription
You use Amazon Chime SDK live transcription to generate live, user-attributed transcripts of your meetings. Amazon Chime SDK live transcription integrates with the Amazon Transcribe and Amazon Transcribe Medical services to generate transcripts of Amazon Chime SDK meetings while they're in progress.
Amazon Chime SDK live transcription processes each user’s audio separately for improved accuracy in multi-speaker scenarios. The Amazon Chime SDK uses its active talker algorithm to select the top two active talkers, and then sends their audio to Amazon Transcribe, in separate channels, via a single stream. Meeting participants receive user-attributed transcriptions via Amazon Chime SDK data messages. You can use transcriptions in a variety of ways, such as displaying subtitles, creating meeting transcripts, or using the transcriptions for content analysis.
Live transcription uses one stream to Amazon Transcribe for the duration of the meeting transcription.
Standard Amazon Transcribe and Amazon Transcribe Medical costs apply. For more information, refer to Amazon Transcribe Pricing.
Important
By default, Amazon Transcribe may use and store audio content processed by the service to develop and improve AWS AI/ML services, as further described in section 50 of the AWS Service Terms.
Topics
- System architecture
- Billing and usage
- Configuring your account for Amazon Chime SDK live transcription
- Choosing Amazon Chime SDK live transcription options
- Starting and stopping Amazon Chime SDK live transcription
- Amazon Chime SDK live transcription parameters
- Understanding Amazon Chime SDK live transcription events
- Understanding Amazon Chime SDK live transcription messages
- Processing a received Amazon Chime SDK live transcript event
- Parsing Amazon Chime SDK transcripts
System architecture
The Amazon Chime SDK creates real-time meeting transcriptions, without audio leaving the AWS network, via a service-side integration with your Amazon Transcribe or Amazon Transcribe Medical account. For improved accuracy, users’ audio is processed separately, then mixed into the meeting. The Amazon Chime SDK uses its active talker algorithm to select the top two active talkers, and then sends their audio to Amazon Transcribe or Amazon Transcribe Medical in separate channels via a single stream. For reduced latency, user-attributed transcriptions are sent directly to every meeting participant via data messages. When using a media pipeline to capture meeting audio, the meeting’s transcription information is also captured.
Billing and usage
Live transcription uses one stream to Amazon Transcribe or Amazon Transcribe Medical for the duration of the meeting transcription. Standard Amazon Transcribe and Amazon Transcribe Medical costs apply. For more information, see Amazon Transcribe Pricing.
Amazon Chime SDK live transcription parameters
The Amazon Transcribe and Amazon Transcribe Medical APIs offer a number of parameters when you initiate streaming transcription with operations such as StartStreamTranscription and StartMedicalStreamTranscription. You can use those parameters in the StartMeetingTranscription API unless the Amazon Chime SDK predetermines the parameter’s value. For example, the MediaEncoding and MediaSampleRateHertz parameters are not available because the Amazon Chime SDK sets them automatically.
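As an illustration of the rule above, the following minimal sketch assembles a StartMeetingTranscription request and rejects the two parameters that the Amazon Chime SDK sets itself. The helper name `build_transcription_request`, the constant `SDK_MANAGED_PARAMETERS`, the meeting ID, and the vocabulary name are hypothetical. The request shape (`MeetingId`, `TranscriptionConfiguration`, `EngineTranscribeSettings`) is an assumption about the API and should be checked against the StartMeetingTranscription reference before use.

```python
# Parameters that the Amazon Chime SDK sets automatically (named in the text above).
SDK_MANAGED_PARAMETERS = frozenset({"MediaEncoding", "MediaSampleRateHertz"})


def build_transcription_request(meeting_id, **transcribe_params):
    """Build keyword arguments for a StartMeetingTranscription call.

    Hypothetical helper: the nesting under TranscriptionConfiguration and
    EngineTranscribeSettings is an assumed request shape, not confirmed here.
    """
    managed = SDK_MANAGED_PARAMETERS.intersection(transcribe_params)
    if managed:
        raise ValueError(
            f"Set automatically by the Amazon Chime SDK: {sorted(managed)}"
        )
    return {
        "MeetingId": meeting_id,
        "TranscriptionConfiguration": {
            # All other streaming parameters are passed through unchanged.
            "EngineTranscribeSettings": dict(transcribe_params),
        },
    }


# Placeholder meeting ID and vocabulary name, for illustration only.
request = build_transcription_request(
    "example-meeting-id",
    LanguageCode="en-US",
    VocabularyName="example-vocabulary",
)
```

With an AWS SDK client for the Amazon Chime SDK meetings API, the resulting dictionary would typically be unpacked into the call, for example `client.start_meeting_transcription(**request)`; that call is not executed in this sketch.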
Amazon Transcribe and Amazon Transcribe Medical validate the parameters, which allows you to use new parameter values as soon as they become available. For example, if Amazon Transcribe Medical launches support for a new language, you only need to specify the new language value in the LanguageCode parameter.
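Because validation happens in the transcription service, a client does not need its own list of allowed values. The sketch below shows one way to reflect that: it forwards `LanguageCode` as given, with no local allow-list. The helper name `medical_settings` and the language code `xx-XX` are hypothetical, and the `EngineTranscribeMedicalSettings` shape with its `Specialty` and `Type` defaults is an assumption to verify against the API reference.

```python
def medical_settings(language_code, specialty="PRIMARYCARE", transcript_type="CONVERSATION"):
    """Build an Amazon Transcribe Medical configuration for live transcription.

    Hypothetical helper. No allow-list is applied to language_code: the
    service validates it, so a newly launched language needs no change here.
    """
    return {
        "EngineTranscribeMedicalSettings": {
            "LanguageCode": language_code,  # forwarded unchanged
            "Specialty": specialty,         # assumed default, adjust as needed
            "Type": transcript_type,        # assumed default, adjust as needed
        }
    }


# "xx-XX" stands in for a future language value; it is not a real language code.
config = medical_settings("xx-XX")
```

The trade-off of this design is that an unsupported value is only reported when the service rejects the request, so callers should be prepared to handle that error.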