Understanding workflows for machine-learning-based analytics for the Amazon Chime SDK
The following sections describe how to use the machine-learning analytics features provided by Amazon Chime SDK call analytics.
Note
If you plan to run multiple machine-learning analytics on the same Kinesis Video Stream, you may need to increase the connection-level limits for GetMedia and GetMediaForFragmentList for the video stream. For more information, refer to Kinesis Video Streams limits in the Kinesis Video Streams Developer Guide.
Use this workflow when:
- You want console-driven setup.
- You already use or plan to use a Voice Connector to bring SIP media into call analytics. Voice Connectors support SIP as well as SIPREC. For more information on configuring Voice Connectors, refer to Managing Amazon Chime SDK Voice Connector.
- You want to apply the same media insights configuration to every Voice Connector call.
- You need to use Amazon Chime SDK voice analytics, which requires a Voice Connector or a media insights pipeline.
To enable this workflow in the Amazon Chime SDK console, follow the steps for creating a recording configuration in Configuring Voice Connectors to use call analytics.
To enable this workflow programmatically, use the CreateMediaInsightsPipelineConfiguration API to create a call analytics configuration, and then associate the configuration with a Voice Connector using the PutVoiceConnectorStreamingConfiguration API. For more information, see Configuring Voice Connectors to use voice analytics in the Amazon Chime SDK Administrator Guide.
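As a minimal sketch of that programmatic path, the following example uses the AWS SDK for Python (Boto3) to create a configuration and attach it to a Voice Connector. The configuration name, role, element types, sink target, and Voice Connector ID are illustrative assumptions; confirm the exact request shapes in the CreateMediaInsightsPipelineConfiguration and PutVoiceConnectorStreamingConfiguration references.

```python
import boto3

pipelines = boto3.client("chime-sdk-media-pipelines")
voice = boto3.client("chime-sdk-voice")

# Create a call analytics (media insights pipeline) configuration.
# The processor and sink elements below are illustrative; adjust them to
# the processors and destinations you actually plan to use.
config = pipelines.create_media_insights_pipeline_configuration(
    MediaInsightsPipelineConfigurationName="my-call-analytics-config",  # assumed name
    ResourceAccessRoleArn="arn:aws:iam::111122223333:role/CallAnalyticsRole",  # assumed role
    Elements=[
        {
            "Type": "AmazonTranscribeCallAnalyticsProcessor",
            "AmazonTranscribeCallAnalyticsProcessorConfiguration": {"LanguageCode": "en-US"},
        },
        {
            "Type": "KinesisDataStreamSink",
            "KinesisDataStreamSinkConfiguration": {
                "InsightsTarget": "arn:aws:kinesis:us-east-1:111122223333:stream/my-insights"  # assumed stream
            },
        },
    ],
)

# Associate the configuration with a Voice Connector by updating its
# streaming configuration. Streaming must be enabled, and the retention
# period must be long enough for call analytics to finish processing.
voice.put_voice_connector_streaming_configuration(
    VoiceConnectorId="abcdef1ghij2klmno3pqr4",  # assumed Voice Connector ID
    StreamingConfiguration={
        "DataRetentionInHours": 24,
        "Disabled": False,
        "MediaInsightsConfiguration": {
            "Disabled": False,
            "ConfigurationArn": config["MediaInsightsPipelineConfiguration"][
                "MediaInsightsPipelineConfigurationArn"
            ],
        },
    },
)
```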
The following diagram shows the flow of data when a Voice Connector initiates a call analytics session. Numbers in the diagram correspond to the numbered text below.
In the diagram:
1. You use the Amazon Chime SDK console or the CreateMediaInsightsPipelineConfiguration API to create a media insights pipeline configuration.
2. You use the Amazon Chime SDK console or the PutVoiceConnectorStreamingConfiguration API to associate the configuration with a Voice Connector. To associate an existing configuration with a Voice Connector, refer to Configuring Voice Connectors to use call analytics in the Amazon Chime SDK Administrator Guide.
3. During an outgoing call, the Voice Connector receives each call participant's audio.
4. Because of its built-in integration with call analytics, when a call analytics configuration is attached to a Voice Connector, the Voice Connector service initiates a call analytics session using the media pipeline service.
5. The media pipeline service invokes one or more media processors, as specified in the configuration.
6. The media pipeline service sends the output data to one or more destinations based on the configuration. For example, you can send real-time analytics via an Amazon Kinesis Data Stream and, if configured, send the call metadata and analytics to an Amazon S3 data warehouse.
7. The media pipeline service sends pipeline status events to the default Amazon EventBridge event bus. If you configure rules, EventBridge also sends notifications for those rules. For more information, see Using EventBridge notifications and the example rule after this list.
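As a rough illustration of step 7, the following Boto3 sketch creates an EventBridge rule on the default event bus for Amazon Chime SDK call analytics events. The source and detail-type values are assumptions to verify against Using EventBridge notifications, and the Lambda target ARN is hypothetical.

```python
import boto3

events = boto3.client("events")

# Match Amazon Chime SDK call analytics / media pipeline events on the
# default event bus. The detail-type below is an assumption; confirm the
# exact value in "Using EventBridge notifications" before relying on it.
events.put_rule(
    Name="call-analytics-status-events",
    EventPattern=(
        '{"source": ["aws.chime"], '
        '"detail-type": ["Media Insights State Change"]}'
    ),
    State="ENABLED",
)

# Route matching events to a hypothetical Lambda function.
events.put_targets(
    Rule="call-analytics-status-events",
    Targets=[
        {
            "Id": "call-analytics-handler",
            "Arn": "arn:aws:lambda:us-east-1:111122223333:function:handle-call-analytics-events",
        }
    ],
)
```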
Note
- A voice analytics processor only starts automatically when you call the StartSpeakerSearchTask or StartVoiceToneAnalysisTask APIs.
- You must enable Voice Connector streaming to use call analytics with a Voice Connector. This feature enables streaming of call data to Voice Connector-managed Kinesis Video Streams in your account. For more information, refer to Streaming Amazon Chime SDK Voice Connector media to Kinesis Video Streams in the Amazon Chime SDK Administrator Guide.
You can store Voice Connector call data in Kinesis Video Streams for varying amounts of time, ranging from hours to years. Opting for no data retention limits use of the call data to immediate consumption. The cost of Kinesis Video Streams is based on the bandwidth and total storage used. You can adjust the data retention period at any time by editing your Voice Connector's streaming configuration. To enable call analytics recording, you must ensure that the Kinesis Video Stream retains data until call analytics finishes. You do that by specifying a suitable data retention period.
You can associate a media insights pipeline configuration with as many Voice Connectors as you want. You can also create a different configuration for each Voice Connector. Voice Connectors use the AWSServiceRoleForAmazonChimeVoiceConnector service-linked role to call the CreateMediaInsightsPipeline API on your behalf once per transaction ID. For information about the role, see Using the Amazon Chime SDK service-linked role for Amazon Chime SDK Voice Connectors in the Amazon Chime SDK Administrator Guide.
Use this workflow if you use a Voice Connector but need to control when a call analytics configuration is applied and which calls it applies to.
To use this method, you need to create an EventBridge target for events that the Voice Connector publishes, and then use the events to trigger the call analytics pipeline APIs. For more information, see Automating the Amazon Chime SDK with EventBridge in the Amazon Chime SDK Administrator Guide.
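As a minimal sketch of that pattern, the Lambda handler below assumes an EventBridge event whose detail carries the Kinesis Video Stream ARN, the starting fragment number, and the transaction ID for a Voice Connector call. The detail field names, configuration ARN, and channel layout are illustrative assumptions, not the exact event schema.

```python
import json
import boto3

pipelines = boto3.client("chime-sdk-media-pipelines")

# Assumed ARN of an existing media insights pipeline configuration.
CONFIGURATION_ARN = (
    "arn:aws:chime:us-east-1:111122223333:"
    "media-insights-pipeline-configuration/my-call-analytics-config"
)

def handler(event, context):
    """Start call analytics for a single call, driven by an EventBridge event.

    The detail field names below (streamArn, startFragmentNumber,
    transactionId) are assumptions; map them to the actual schema of the
    Voice Connector streaming events you subscribe to.
    """
    detail = event["detail"]

    pipelines.create_media_insights_pipeline(
        MediaInsightsPipelineConfigurationArn=CONFIGURATION_ARN,
        KinesisVideoStreamSourceRuntimeConfiguration={
            "Streams": [
                {
                    "StreamArn": detail["streamArn"],
                    "FragmentNumber": detail["startFragmentNumber"],
                    "StreamChannelDefinition": {
                        "NumberOfChannels": 1,
                        "ChannelDefinitions": [
                            {"ChannelId": 0, "ParticipantRole": "CUSTOMER"}
                        ],
                    },
                }
            ],
            "MediaEncoding": "pcm",
            "MediaSampleRate": 8000,
        },
        MediaInsightsRuntimeMetadata={"transactionId": detail["transactionId"]},
    )
    return {"statusCode": 200, "body": json.dumps("pipeline started")}
```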
The following diagram shows how to implement more granular control when using call analytics with Voice Connector. Numbers in the diagram correspond to numbers in the text below.
In the diagram:
1. You use the Amazon Chime SDK console or the CreateMediaInsightsPipelineConfiguration API to create a media insights pipeline configuration.
2. During an outgoing call, the Voice Connector receives each participant's audio.
3. The Voice Connector sends call audio to Kinesis Video Streams and corresponding events to EventBridge. Those events contain stream and call metadata.
4. Your application subscribes to the events through an EventBridge target.
5. Your application invokes the Amazon Chime SDK CreateMediaInsightsPipeline API.
6. The media pipeline service invokes one or more media processors, based on the processor elements in the media insights pipeline configuration.
7. The media pipeline service sends the output data to one or more destinations based on the configuration. Amazon Chime SDK call analytics provides real-time analytics via an Amazon Kinesis Data Stream and, if configured, sends call metadata and analytics to an Amazon S3 data warehouse.
8. The media pipeline service sends pipeline events to Amazon EventBridge. If you configure rules, EventBridge also sends notifications for those rules.
9. You can pause or resume the call analytics session by invoking the UpdateMediaInsightsPipelineStatus API, as shown in the sketch after this list.
Note
Call recording does not support pausing and resuming calls. In addition, any voice analytics tasks started for the call stop when you pause a session. To restart them, you must call the StartSpeakerSearchTask or StartVoiceToneAnalysisTask APIs.
10. If you select voice tone analytics during configuration, you start voice analytics by calling the StartSpeakerSearchTask or StartVoiceToneAnalysisTask APIs.
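The pause and resume call in step 9 might look like the following sketch. The pipeline identifier is a placeholder, and the UpdateStatus values correspond to the pause and resume states described for the UpdateMediaInsightsPipelineStatus API.

```python
import boto3

pipelines = boto3.client("chime-sdk-media-pipelines")

PIPELINE_ID = "87654321-33ca-4bd5-8a2c-example"  # placeholder pipeline ID or ARN

# Pause the call analytics session, for example while a caller reads a
# payment card number.
pipelines.update_media_insights_pipeline_status(
    Identifier=PIPELINE_ID,
    UpdateStatus="Pause",
)

# Resume processing once the sensitive portion of the call ends.
pipelines.update_media_insights_pipeline_status(
    Identifier=PIPELINE_ID,
    UpdateStatus="Resume",
)
```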
To use this option, you need to publish audio data to Kinesis Video Streams (KVS) and then call the CreateMediaInsightsPipeline API with KVS stream channel information.
Note
The call analytics APIs support a maximum of two audio channels.
When calling the CreateMediaInsightsPipeline API, you can specify fragment numbers for each KVS stream channel definition. If you supply a fragment number, call analytics begins processing the stream at that fragment. Otherwise, call analytics begins processing the stream from the latest available fragment.
Call analytics supports PCM audio (only signed 16-bit little-endian audio formats, which does not include WAV) with an audio sample rate between 8kHz and 48kHz. Low-quality audio, such as telephony audio, is typically around 8,000 Hz. High-quality audio typically ranges from 16,000 Hz to 48,000 Hz. The sample rate that you specify must match that of your audio. For more information, see KinesisVideoStreamSourceRuntimeConfiguration in the Amazon Chime SDK API Reference.
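To make those requirements concrete, the following sketch starts a pipeline over two single-channel KVS streams, one per participant, at an 8 kHz sample rate. The ARNs, fragment number, and channel layout are placeholders; omitting FragmentNumber starts processing from the latest available fragment.

```python
import boto3

pipelines = boto3.client("chime-sdk-media-pipelines")

# Placeholder values; substitute your own configuration and stream ARNs.
CONFIGURATION_ARN = (
    "arn:aws:chime:us-east-1:111122223333:"
    "media-insights-pipeline-configuration/my-call-analytics-config"
)
AGENT_STREAM_ARN = "arn:aws:kinesisvideo:us-east-1:111122223333:stream/agent-audio/1111"
CUSTOMER_STREAM_ARN = "arn:aws:kinesisvideo:us-east-1:111122223333:stream/customer-audio/2222"

pipelines.create_media_insights_pipeline(
    MediaInsightsPipelineConfigurationArn=CONFIGURATION_ARN,
    KinesisVideoStreamSourceRuntimeConfiguration={
        "Streams": [
            {
                "StreamArn": AGENT_STREAM_ARN,
                # Optional: start at a specific fragment instead of the latest one.
                "FragmentNumber": "91343852333181432392682062622220664863299856491",
                "StreamChannelDefinition": {
                    "NumberOfChannels": 1,
                    "ChannelDefinitions": [{"ChannelId": 0, "ParticipantRole": "AGENT"}],
                },
            },
            {
                "StreamArn": CUSTOMER_STREAM_ARN,
                "StreamChannelDefinition": {
                    "NumberOfChannels": 1,
                    "ChannelDefinitions": [{"ChannelId": 1, "ParticipantRole": "CUSTOMER"}],
                },
            },
        ],
        # Call analytics expects signed 16-bit little-endian PCM; the sample
        # rate must match the audio written to the streams.
        "MediaEncoding": "pcm",
        "MediaSampleRate": 8000,
    },
)
```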
The Kinesis Video Streams Producer SDK provides a set of libraries that you can use to stream audio data to a Kinesis Video Stream. For more information, refer to Kinesis Video Streams Producer Libraries, in the Amazon Kinesis Video Streams Developer Guide.
The following diagram shows the flow of data when using call analytics with a custom Kinesis Video Stream producer. Numbers in the diagram correspond to the numbered text below.
1. You use the AWS console or the CreateMediaInsightsPipelineConfiguration API to create a media insights pipeline configuration.
2. You use a Kinesis Video Streams producer to write audio to Kinesis Video Streams.
3. Your application invokes the CreateMediaInsightsPipeline API.
4. The media pipeline service reads audio from the customer's Kinesis Video Streams.
5. The media pipeline service sends events to Amazon EventBridge. If you configure rules, EventBridge also sends notifications for those rules.
6. The media pipeline service invokes one or more processor elements.
7. The media pipeline service sends output data to one or more sink elements.
8. You can pause or resume the call analytics session by invoking the UpdateMediaInsightsPipelineStatus API.
Note
Call recording does not support pause and resume.
9. Your application can process the Amazon EventBridge events to trigger custom business workflows.
10. If you select voice analytics when you create a configuration, your application can start voice analytics by calling the StartSpeakerSearchTask or StartVoiceToneAnalysisTask APIs, as shown in the sketch after this list.
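A hedged sketch of that last step, assuming a running media insights pipeline and an existing voice profile domain: the identifiers are placeholders, and the exact request shapes should be confirmed against the StartSpeakerSearchTask and StartVoiceToneAnalysisTask references in the Amazon Chime SDK API Reference.

```python
import boto3

pipelines = boto3.client("chime-sdk-media-pipelines")

PIPELINE_ID = "87654321-33ca-4bd5-8a2c-example"  # placeholder pipeline ID or ARN
VOICE_PROFILE_DOMAIN_ARN = (
    "arn:aws:chime:us-east-1:111122223333:voice-profile-domain/example-domain"  # placeholder
)

# Start speaker search against an existing voice profile domain.
pipelines.start_speaker_search_task(
    Identifier=PIPELINE_ID,
    VoiceProfileDomainArn=VOICE_PROFILE_DOMAIN_ARN,
)

# Start voice tone analysis for the same pipeline.
pipelines.start_voice_tone_analysis_task(
    Identifier=PIPELINE_ID,
    LanguageCode="en-US",
)
```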