Using voice APIs to run voice analytics - Amazon Chime SDK

For backwards compatibility, you can use Amazon Chime SDK Voice APIs to start and manage voice analytics. However, only the media insights pipeline APIs for voice analytics provide new features, so we strongly recommend using them instead.

The following sections explain the differences between the voice APIs and the media insights pipeline APIs.

Stopping tasks

If you use a Voice Connector to start voice analytics tasks, and you then use the UpdateMediaInsightsPipelineStatus API to pause the pipeline, the tasks continue running. To stop the tasks, you must call the StopSpeakerSearchTask and StopVoiceToneAnalysisTask APIs.
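The stop calls described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes `voice_client` is a boto3 `chime-sdk-voice` client and that your application has stored the task IDs returned when the tasks were started. The helper name `stop_voice_analytics_tasks` is hypothetical.

```python
def stop_voice_analytics_tasks(voice_client, voice_connector_id,
                               speaker_search_task_id=None,
                               voice_tone_task_id=None):
    """Stop whichever voice analytics tasks are still running for a call.

    voice_client is assumed to be boto3.client("chime-sdk-voice").
    Returns the names of the stop operations that were invoked.
    """
    stopped = []
    if speaker_search_task_id:
        # Pausing the pipeline does not stop this task, so stop it explicitly.
        voice_client.stop_speaker_search_task(
            VoiceConnectorId=voice_connector_id,
            SpeakerSearchTaskId=speaker_search_task_id,
        )
        stopped.append("StopSpeakerSearchTask")
    if voice_tone_task_id:
        voice_client.stop_voice_tone_analysis_task(
            VoiceConnectorId=voice_connector_id,
            VoiceToneAnalysisTaskId=voice_tone_task_id,
        )
        stopped.append("StopVoiceToneAnalysisTask")
    return stopped
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real Voice Connector.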

Understanding the notification differences

When you use voice APIs to run voice analytics, the notifications differ from those generated by media insights pipelines.

  • Voice analytics ready events are only available for tasks started using voice APIs.

  • You need to use the voiceConnectorId, transactionId, or callId fields in your notifications to associate a voice analytics task with a call. If you use media insights pipelines to run voice analytics, you use the mediaInsightsPipelineId and streamArn or channelId fields to associate a task with a call.
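As a rough sketch of that association for voice API notifications, the helper below derives a per-call key from a notification. It relies on the field placement shown in the sample events in this topic, where voice analytics ready events nest the identifiers under `callDetails` and task status events carry them directly in `detail`. The function name `call_key` is hypothetical, and the `callId` field is not handled because no sample here shows where it appears.

```python
def call_key(notification):
    """Return (voiceConnectorId, transactionId) for a notification, or None."""
    detail = notification.get("detail", {})
    # Ready events keep the identifiers under "callDetails"; the task status
    # samples in this topic carry them directly in "detail".
    source = detail.get("callDetails", detail)
    connector = source.get("voiceConnectorId")
    transaction = source.get("transactionId")
    if connector is None or transaction is None:
        return None
    return (connector, transaction)
```

The resulting tuple can serve as a dictionary key for grouping every task notification that belongs to the same call.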

The following topics explain how to use notifications with voice APIs.

Voice analytics ready events

Voice analytics ready events have the VoiceAnalyticsStatus detail type.

You use Amazon Chime SDK Voice Connectors to start analytics tasks. When you receive a voice analytics ready event, you can trigger a speaker search or voice tone analysis task for the call, identified by the following properties:

  • voiceConnectorId

  • transactionId

Note

This notification is provided only when you have a media insights pipeline configuration with voice analytics enabled and associated with a Voice Connector. This notification is NOT provided when customers call the CreateMediaInsightsPipeline API and launch a speaker search task or voice tone analysis task via the Media Pipelines SDK.

The SIP headers returned by a Voice Connector contain the transactionId. If you don't have access to the SIP headers, the AnalyticsReady notification event also contains the voiceConnectorId and transactionId. That allows you to receive the information programmatically and call the StartSpeakerSearchTask or StartVoiceToneAnalysisTask APIs.

When voice analytics is ready for processing, the Voice Connector sends an event with "detailStatus": "AnalyticsReady" to the notification target as a JSON body. If you use Amazon SNS or Amazon SQS, that body appears in the “Records” field in the Amazon SNS or Amazon SQS payload.

The following example shows a typical JSON body.

{
    "detail-type": "VoiceAnalyticsStatus",
    "version": "0",
    "id": "Id-f928dfe3-f44b-4965-8a17-612f9fb92d59",
    "source": "aws.chime",
    "account": "123456789012",
    "time": "2022-08-26T17:55:15.563441Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "detailStatus": "AnalyticsReady",
        "callDetails": {
            "isCaller": false,
            "transactionId": "daaeb6bf-2fe2-4e51-984e-d0fbf2f09436",
            "voiceConnectorId": "fuiopl1fsv9caobmqf2vy7"
        }
    }
}
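One way to pull these events out of a delivery is sketched below. It assumes the notification target feeds an AWS Lambda function, where Amazon SNS records carry the body as a JSON string in `Sns.Message` and Amazon SQS records carry it in `body`. Both function names are hypothetical.

```python
import json


def iter_notification_bodies(event):
    """Yield notification bodies from a Lambda-style event.

    Assumes SNS records hold the JSON body in record["Sns"]["Message"] and
    SQS records hold it in record["body"]. An event with no "Records" key
    is treated as the notification body itself.
    """
    records = event.get("Records")
    if records is None:
        yield event
        return
    for record in records:
        if "Sns" in record:
            yield json.loads(record["Sns"]["Message"])
        elif "body" in record:
            yield json.loads(record["body"])


def analytics_ready_calls(event):
    """Return the callDetails of every AnalyticsReady notification in event."""
    calls = []
    for body in iter_notification_bodies(event):
        detail = body.get("detail", {})
        if (body.get("detail-type") == "VoiceAnalyticsStatus"
                and detail.get("detailStatus") == "AnalyticsReady"):
            calls.append(detail["callDetails"])
    return calls
```

Each returned dictionary holds the `voiceConnectorId` and `transactionId` needed to start a task for that call.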

This notification allows you to trigger additional callbacks to your application, and to handle any legal requirements, such as notice and consent, prior to calling the voice analytics task APIs.
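A minimal sketch of gating the task APIs on consent follows. It assumes `voice_client` is a boto3 `chime-sdk-voice` client and that `has_consent` is a callback your application supplies. The callback, the function name, and the `en-US` default language code are illustrative assumptions, not part of the service.

```python
def start_tasks_if_consented(voice_client, call_details, has_consent,
                             voice_profile_domain_id, language_code="en-US"):
    """Start both voice analytics tasks only after consent is confirmed.

    call_details is the "callDetails" object from an AnalyticsReady event.
    has_consent is a hypothetical application callback that takes a
    transactionId and returns True when notice and consent are on record.
    Returns the names of the start operations that were invoked.
    """
    if not has_consent(call_details["transactionId"]):
        return []
    common = {
        "VoiceConnectorId": call_details["voiceConnectorId"],
        "TransactionId": call_details["transactionId"],
    }
    voice_client.start_speaker_search_task(
        VoiceProfileDomainId=voice_profile_domain_id, **common
    )
    voice_client.start_voice_tone_analysis_task(
        LanguageCode=language_code, **common
    )
    return ["StartSpeakerSearchTask", "StartVoiceToneAnalysisTask"]
```

Keeping the consent check in front of both calls means no task starts for a call unless the callback approves it.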

Speaker search events

Speaker search events have the SpeakerSearchStatus detail type.

Amazon Chime SDK Voice Connectors send the following speaker search events:

  • Identification matches

  • Voice embedding generation

The events can have the following statuses:

  • IdentificationSuccessful – Successfully identified at least one matching voice profile ID with a high confidence score in the given voice profile domain.

  • IdentificationFailure – Failed to perform identification. Possible causes: the caller doesn't talk for at least 10 seconds, or the audio quality is poor.

  • IdentificationNoMatchesFound – Unable to find a high confidence match in the given voice profile domain. The caller may be new, or their voice may have changed.

  • VoiceprintGenerationSuccessful – The system generated a voice embedding using 20 seconds of non-silent audio.

  • VoiceprintGenerationFailure – The system failed to generate a voice embedding. Possible causes: the caller doesn't talk for at least 20 seconds, or the audio quality is poor.

Identification matches

After you call the StartSpeakerSearchTask API for a given transactionId, the Voice Connector service returns an identification match notification after 10 seconds of non-silent speech. The service returns the top 10 matches, each with a voice profile ID and a confidence score in the range [0, 1]. The higher the confidence score, the more likely it is that the speaker on the call matches the voice profile ID. If the machine learning model finds no matches, the notification's detailStatus field contains IdentificationNoMatchesFound.

The following example shows the notification for a successful match.

{
    "version": "0",
    "id": "12345678-1234-1234-1234-111122223333",
    "detail-type": "SpeakerSearchStatus",
    "service-type": "VoiceAnalytics",
    "source": "aws.chime",
    "account": "111122223333",
    "time": "yyyy-mm-ddThh:mm:ssZ",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "taskId": "uuid",
        "detailStatus": "IdentificationSuccessful",
        "speakerSearchDetails": {
            "results": [
                {
                    "voiceProfileId": "vp-505e0992-82da-49eb-9d4a-4b34772b96b6",
                    "confidenceScore": "0.94567856"
                },
                {
                    "voiceProfileId": "vp-fba9cbfa-4b8d-4f10-9e41-9dfdd66545ab",
                    "confidenceScore": "0.82783350"
                },
                {
                    "voiceProfileId": "vp-746995fd-16dc-45b9-8965-89569d1cf787",
                    "confidenceScore": "0.77136436"
                }
            ]
        },
        "isCaller": false,
        "voiceConnectorId": "abcdef1ghij2klmno3pqr4",
        "transactionId": "daaeb6bf-2fe2-4e51-984e-d0fbf2f09436"
    }
}
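Because the confidence scores arrive as strings, an application typically converts them before comparing. The sketch below picks the highest-scoring profile from an identification notification. The `min_confidence` threshold of 0.9 is an arbitrary illustrative value, not a service recommendation, and the function name is hypothetical.

```python
def best_match(notification, min_confidence=0.9):
    """Return (voiceProfileId, score) for the top match, or None.

    min_confidence is an application-chosen cutoff (assumed value).
    """
    detail = notification.get("detail", {})
    if detail.get("detailStatus") != "IdentificationSuccessful":
        return None
    results = detail.get("speakerSearchDetails", {}).get("results", [])
    # Scores are JSON strings in the notification, so convert before comparing.
    scored = [(float(r["confidenceScore"]), r["voiceProfileId"]) for r in results]
    if not scored:
        return None
    score, profile_id = max(scored)
    if score < min_confidence:
        return None
    return (profile_id, score)
```

Applied to a notification like the sample in this section, the helper returns the first profile, whose score is the highest of the three.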

Voice embedding generation

After an additional 10 seconds of non-silent speech, the Voice Connector sends a voice embedding generation notification to the notification targets. You can enroll new voice embeddings in a voice profile, or update a print already in a voice profile.

The following example shows the notification for a successfully generated voice embedding, meaning you can update the associated voice profile.

{
    "version": "0",
    "id": "12345678-1234-1234-1234-111122223333",
    "detail-type": "SpeakerSearchStatus",
    "service-type": "VoiceAnalytics",
    "source": "aws.chime",
    "account": "111122223333",
    "time": "yyyy-mm-ddThh:mm:ssZ",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "taskId": "guid",
        "detailStatus": "VoiceprintGenerationSuccess",
        "isCaller": false,
        "transactionId": "12345678-1234-1234",
        "voiceConnectorId": "abcdef1ghij2klmno3pqr"
    }
}
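The enroll-or-update decision might be sketched as below. Several assumptions apply: `voice_client` is a boto3 `chime-sdk-voice` client, the `taskId` in the notification is the speaker search task ID, and the `CreateVoiceProfile` and `UpdateVoiceProfile` operations are the way to enroll and update. Neither operation is named in this topic, so check them against the API reference. The status is matched by prefix because the status list above spells it VoiceprintGenerationSuccessful while the sample event shows VoiceprintGenerationSuccess.

```python
def apply_voice_embedding(voice_client, notification, existing_profile_id=None):
    """Enroll or update a voice profile from a generated voice embedding.

    Returns "updated", "created", or None when the notification does not
    report a successful voice embedding generation.
    """
    detail = notification.get("detail", {})
    status = detail.get("detailStatus", "")
    # Prefix match accepts both spellings of the success status.
    if not status.startswith("VoiceprintGenerationSuccess"):
        return None
    task_id = detail["taskId"]  # assumed to be the speaker search task ID
    if existing_profile_id:
        voice_client.update_voice_profile(
            VoiceProfileId=existing_profile_id,
            SpeakerSearchTaskId=task_id,
        )
        return "updated"
    voice_client.create_voice_profile(SpeakerSearchTaskId=task_id)
    return "created"
```

A caller would pass the profile ID from an earlier identification match as `existing_profile_id`, or leave it unset to enroll a new speaker.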

Voice tone analysis events

Voice tone analysis events have the VoiceToneAnalysisStatus detail type. The analyses can return these statuses:

  • VoiceToneAnalysisSuccessful – Successfully analyzed the caller and agent voices into probabilities of sentiment—positive, negative, or neutral.

  • VoiceToneAnalysisFailure – Failed to perform tone analysis. This can happen if the caller hangs up without talking for 10 seconds, or if the audio quality becomes too poor.

  • VoiceToneAnalysisCompleted – Successfully analyzed the user and agent voices into probabilities of sentiment for the entire call. This is the final event, sent when the voice tone analysis finishes.

The following example shows a typical voice tone analysis event.

{
    "detail-type": "VoiceToneAnalysisStatus",
    "service-type": "VoiceAnalytics",
    "source": "aws.chime",
    "account": "216539279014",
    "time": "2022-08-26T17:55:15.563441Z",
    "region": "us-east-1",
    "detail": {
        "taskId": "uuid",
        "detailStatus": "VoiceToneAnalysisSuccessful",
        "voiceToneAnalysisDetails": {
            "currentAverageVoiceTone": {
                "startTime": "2022-08-26T17:55:15.563Z",
                "endTime": "2022-08-26T17:55:45.720Z",
                "voiceToneLabel": "neutral",
                "voiceToneScore": {
                    "neutral": "0.83",
                    "positive": "0.13",
                    "negative": "0.04"
                }
            },
            "overallAverageVoiceTone": {
                "startTime": "2022-08-26T16:23:13.344Z",
                "endTime": "2022-08-26T17:55:45.720Z",
                "voiceToneLabel": "positive",
                "voiceToneScore": {
                    "neutral": "0.25",
                    "positive": "0.65",
                    "negative": "0.1"
                }
            }
        },
        "isCaller": true,
        "transactionId": "daaeb6bf-2fe2-4e51-984e-d0fbf2f09436",
        "voiceConnectorId": "fuiopl1fsv9caobmqf2vy7"
    },
    "version": "0",
    "id": "Id-f928dfe3-f44b-4965-8a17-612f9fb92d59"
}
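To work with such an event in code, the string scores need converting to numbers. The following sketch reduces an event to the label and numeric scores for each averaging window, using only the field names that appear in the sample event. The function name `summarize_voice_tone` is hypothetical.

```python
def summarize_voice_tone(notification):
    """Return {window: {"label": str, "scores": {label: float}}} or None."""
    detail = notification.get("detail", {})
    tone = detail.get("voiceToneAnalysisDetails")
    if tone is None:
        return None
    summary = {}
    for window in ("currentAverageVoiceTone", "overallAverageVoiceTone"):
        data = tone.get(window)
        if data is None:
            continue
        # Scores are JSON strings in the event, so convert them to floats.
        scores = {label: float(value)
                  for label, value in data["voiceToneScore"].items()}
        summary[window] = {"label": data["voiceToneLabel"], "scores": scores}
    return summary
```

For an event like the sample, the summary reports a neutral label for the current window and a positive label for the overall window.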