Understanding the AWS Glue data catalog tables - Amazon Chime SDK

The following tables list and describe the columns, data types, and elements in an Amazon Chime SDK call analytics Glue data catalog.

call_analytics_metadata

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

detail-subtype

string

Used for Recording and CallAnalyticsMetadata detail-types.

callevent-type

string

Event type associated with SIP, such as Update, Pause, or Resume.

mediaInsightsPipelineId

string

Amazon Chime SDK media insights pipeline ID.

metadata

string

voiceConnectorId

The Amazon Chime SDK Voice Connector ID.

callId

The call ID of the participant for the associated usage.

transactionId

The transaction ID of the call.

fromNumber

E.164 origination phone number.

toNumber

E.164 destination phone number.

direction

Direction of the call, Outbound or Inbound.

oneTimeMetadata.s3RecordingUrl

Amazon S3 bucket URL of the media object emitted by Transcribe Call Analytics.

oneTimeMetadata.s3RecordingUrlRedacted

Amazon S3 bucket URL of the redacted media object emitted by Transcribe Call Analytics.

oneTimeMetadata.siprecMetadata

SIPREC Metadata in XML format associated with the call.

oneTimeMetadata.siprecMetadataJson

SIPREC Metadata in JSON format associated with the call.

oneTimeMetadata.InviteHeaders

Invite headers.
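
As a rough illustration of how the metadata column might be consumed, the following Python sketch assumes the column arrives as a JSON-encoded string whose keys match the element names listed above. The sample values and the `parse_call_metadata` helper are hypothetical, and the actual serialization in your catalog may differ.

```python
import json

# Hypothetical sample of the `metadata` string column; every value is made up.
SAMPLE_METADATA = json.dumps({
    "voiceConnectorId": "example-voice-connector",
    "callId": "example-call-id",
    "transactionId": "example-transaction-id",
    "fromNumber": "+12065550100",
    "toNumber": "+12065550101",
    "direction": "Outbound",
})

# Element names taken from the call_analytics_metadata table.
CALL_ELEMENTS = (
    "voiceConnectorId", "callId", "transactionId",
    "fromNumber", "toNumber", "direction",
)


def parse_call_metadata(raw: str) -> dict:
    """Decode the metadata string and keep only the call-level elements."""
    decoded = json.loads(raw)
    # Elements missing from the record come back as None instead of raising.
    return {name: decoded.get(name) for name in CALL_ELEMENTS}


call = parse_call_metadata(SAMPLE_METADATA)
print(call["direction"], call["fromNumber"], "->", call["toNumber"])
```

Returning None for absent elements keeps the helper usable on rows where only some elements are populated.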

call_analytics_recording_metadata

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

detail-subtype

string

Used for Recording and CallAnalyticsMetadata detail-types.

callevent-type

string

Event type associated with SIP.

mediaInsightsPipelineId

string

Amazon Chime SDK media insights pipeline ID.

s3MediaObjectConsoleUrl

string

Amazon S3 bucket URL of the media object.

metadata

string

voiceConnectorId

The Amazon Chime SDK Voice Connector ID.

callId

The call ID of the participant for the associated usage.

transactionId

The transaction ID of the call.

fromNumber

E.164 origination phone number.

toNumber

E.164 destination phone number.

direction

Direction of the call, Outbound or Inbound.

voice enhancement

Feature subtype related to service-type.

oneTimeMetadata.siprecMetadata

SIPREC Metadata in XML format associated with the call.

oneTimeMetadata.siprecMetadataJson

SIPREC Metadata in JSON format associated with the call.

oneTimeMetadata.InviteHeaders

Invite headers.

transcribe_call_analytics

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

mediaInsightsPipelineId

string

Amazon Chime SDK media insights pipeline ID.

metadata

string

voiceConnectorId

The Amazon Chime SDK Voice Connector ID.

callId

The call ID of the participant for the associated usage.

transactionId

The transaction ID of the call.

fromNumber

E.164 origination phone number.

toNumber

E.164 destination phone number.

direction

Direction of the call, Outbound or Inbound.

UtteranceEvent

struct

UtteranceId

The unique identifier associated with the specified UtteranceEvent.

IsPartial

Indicates whether the segment in the UtteranceEvent is complete (FALSE) or partial (TRUE).

ParticipantRole

Provides the role of the speaker for each audio channel, either CUSTOMER or AGENT.

BeginOffsetMillis

The time, in milliseconds, from the beginning of the audio stream to the start of the UtteranceEvent.

EndOffsetMillis

The time, in milliseconds, from the beginning of the audio stream to the end of the UtteranceEvent.

Transcript

Contains transcribed text.

Sentiment

Provides the sentiment detected in the specified segment.

Items.beginoffsetmillis

The start time, in milliseconds, of the transcribed item.

Items.endoffsetmillis

The end time, in milliseconds, of the transcribed item.

Items.itemtype

The type of item identified. Options: PRONUNCIATION (spoken words) and PUNCTUATION.

Items.content

The word or punctuation that was transcribed.

Items.confidence

The confidence score associated with a word or phrase in your transcript. Scores are values between 0 and 1. A larger value indicates a higher probability that the identified item correctly matches the item spoken in your media.

Items.vocabularyfiltermatch

Indicates whether the specified item matches a word in the vocabulary filter included in your request. If true, there is a vocabulary filter match.

Items.stable

If partial result stabilization is enabled, Stable indicates whether the specified item is stable (true) or if it may change when the segment is complete (false).

IssuesDetected.characteroffsets_begin

Provides the character count of the first character where a match is identified. For example, the first character associated with an issue or a category match in a segment transcript.

IssuesDetected.characteroffsets_end

Provides the character count of the last character where a match is identified. For example, the last character associated with an issue or a category match in a segment transcript.

Entities.beginoffsetmillis

The start time, in milliseconds, of the utterance that was identified as PII.

Entities.endoffsetmillis

The end time, in milliseconds, of the utterance that was identified as PII.

Entities.category

The category of information identified. The only category is PII.

Entities.type

The type of PII identified. For example, NAME or CREDIT_DEBIT_NUMBER.

Entities.content

The word or words identified as PII.

Entities.confidence

The confidence score associated with the identified PII entity in your audio. Confidence scores range between 0 and 1. A larger value indicates a higher probability that the identified entity correctly matches the entity spoken in your media.
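
To make the relationship between IsPartial and the offset elements concrete, here is a minimal Python sketch. It assumes UtteranceEvent records have already been decoded into dictionaries keyed by the element names above. The sample records and the `completed_utterances` helper are hypothetical.

```python
# Hypothetical UtteranceEvent records; all values are made up for illustration.
SAMPLE_UTTERANCES = [
    {"UtteranceId": "u1", "IsPartial": True, "ParticipantRole": "AGENT",
     "BeginOffsetMillis": 0, "EndOffsetMillis": 900,
     "Transcript": "Thank you for"},
    {"UtteranceId": "u2", "IsPartial": False, "ParticipantRole": "AGENT",
     "BeginOffsetMillis": 0, "EndOffsetMillis": 2100,
     "Transcript": "Thank you for calling."},
    {"UtteranceId": "u3", "IsPartial": False, "ParticipantRole": "CUSTOMER",
     "BeginOffsetMillis": 2500, "EndOffsetMillis": 4000,
     "Transcript": "Hi, I need help."},
]


def completed_utterances(events):
    """Keep complete segments (IsPartial is False) and attach their duration."""
    result = []
    for event in events:
        if event["IsPartial"]:
            continue  # partial segments may still change, so skip them
        duration = event["EndOffsetMillis"] - event["BeginOffsetMillis"]
        result.append({**event, "DurationMillis": duration})
    return result


for utterance in completed_utterances(SAMPLE_UTTERANCES):
    print(utterance["ParticipantRole"], utterance["DurationMillis"],
          utterance["Transcript"])
```

Both offsets are measured from the beginning of the audio stream, so their difference gives the length of the utterance in milliseconds.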

transcribe_call_analytics_category_events

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

mediaInsightsPipelineId

string

Amazon Chime SDK media insights pipeline ID.

metadata

string

voiceConnectorId

The Amazon Chime SDK Voice Connector ID.

callId

The call ID of the participant for the associated usage.

transactionId

The transaction ID of the call.

fromNumber

E.164 origination phone number.

toNumber

E.164 destination phone number.

direction

Direction of the call, Outbound or Inbound.

CategoryEvent

array

MatchedCategories

Lists the matches in the categories defined by the user.

transcribe_call_analytics_post_call

Column name

Data type

Elements

Definition

JobStatus

string

Status of the post-call analytics job.

LanguageCode

string

Language code of the call audio, such as en-US.

Transcript

struct

LoudnessScores

Measures the volume at which each participant is speaking. Use this metric to see if the caller or the agent is speaking loudly or yelling, which often indicates anger.

This metric is represented as a normalized value (speech level per second of speech in a given segment) on a scale from 0 to 100, where a higher value indicates a louder voice.

Content

Contains transcribed text.

Id

The unique identifier associated with the specified UtteranceEvent.

BeginOffsetMillis

The time, in milliseconds, from the beginning of the audio stream to the start of the UtteranceEvent.

EndOffsetMillis

The time, in milliseconds, from the beginning of the audio stream to the end of the UtteranceEvent.

Sentiment

Provides the sentiment detected in the specified transcript segment.

ParticipantRole

Provides the role of the speaker for each audio channel, either CUSTOMER or AGENT.

IssuesDetected.CharacterOffsets.Begin

Provides the character offset to the first character where a match is identified. For example, the first character associated with an issue in a transcript segment.

IssuesDetected.CharacterOffsets.End

Provides the character offset to the last character where a match is identified. For example, the last character associated with an issue in a transcript segment.

OutcomesDetected.CharacterOffsets.Begin

Provides the outcome, or resolution, identified in the call.

OutcomesDetected.CharacterOffsets.End

ActionItemsDetected.CharacterOffsets.Begin

Lists any action items identified in the call.

ActionItemsDetected.CharacterOffsets.End

AccountId

string

The AWS account ID.

Categories

struct

MatchedCategories

Lists the matched categories.

MatchedDetails

Lists the time, in milliseconds, from the beginning of the audio stream to when the match in the category was detected.

Channel

string

Channel

Indicates a Voice channel.

Participants

array

ParticipantRole

Provides the role of the speaker for each audio channel, CUSTOMER or AGENT.

ConversationCharacteristics

struct

NonTalkTime

Measures periods of time that do not contain speech. Use this metric to find long periods of silence, such as a customer on hold for an excessive amount of time.

Interruptions

Measures if and when one participant cuts off the other participant mid-sentence. Frequent interruptions may be associated with rudeness or anger, and could correlate to negative sentiment for one or both participants.

TotalConversationDurationMillis

Total length of the conversation.

Sentiment.OverallSentiment.AGENT

OverallSentiment label for the Agent.

Sentiment.OverallSentiment.CUSTOMER

OverallSentiment label for the Customer.

Sentiment.SentimentByPeriod.QUARTER.AGENT

Sentiment labels for each quarter for the Agent.

Sentiment.SentimentByPeriod.QUARTER.CUSTOMER

Sentiment labels for each quarter for the Customer.

TalkSpeed

Measures the speed at which both participants are speaking. Comprehension can be affected if one participant speaks too quickly. This metric is measured in words per minute.

TalkTime

Measures the amount of time (in milliseconds) each participant spoke during the call. Use this metric to help identify if one participant is dominating the call or if the dialogue is balanced.

SessionId

string

Session ID for the call.

ContentMetadata

string

Field that labels raw vs. redacted content per the customer specified configuration.
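
The Transcript elements above carry enough information to derive simple per-participant figures. The Python sketch below sums the span of each segment per ParticipantRole. It assumes the segments have been decoded into dictionaries keyed by the element names in the table. The sample data and the `speaking_time_by_role` helper are hypothetical, and this is not how the TalkTime element itself is computed.

```python
from collections import defaultdict

# Hypothetical transcript segments; all values are made up for illustration.
SAMPLE_SEGMENTS = [
    {"Id": "s1", "ParticipantRole": "AGENT",
     "BeginOffsetMillis": 0, "EndOffsetMillis": 2000, "Sentiment": "NEUTRAL"},
    {"Id": "s2", "ParticipantRole": "CUSTOMER",
     "BeginOffsetMillis": 2500, "EndOffsetMillis": 6000, "Sentiment": "NEGATIVE"},
    {"Id": "s3", "ParticipantRole": "AGENT",
     "BeginOffsetMillis": 6500, "EndOffsetMillis": 8000, "Sentiment": "POSITIVE"},
]


def speaking_time_by_role(segments):
    """Sum segment lengths, in milliseconds, for each ParticipantRole."""
    totals = defaultdict(int)
    for segment in segments:
        length = segment["EndOffsetMillis"] - segment["BeginOffsetMillis"]
        totals[segment["ParticipantRole"]] += length
    return dict(totals)


print(speaking_time_by_role(SAMPLE_SEGMENTS))
```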

transcribe

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

mediaInsightsPipelineId

string

Amazon Chime SDK media insights pipeline ID.

metadata

string

voiceConnectorId

The Amazon Chime SDK Voice Connector ID.

callId

The call ID of the participant for the associated usage.

transactionId

The transaction ID of the call.

fromNumber

E.164 origination phone number.

toNumber

E.164 destination phone number.

direction

Direction of the call, Outbound or Inbound.

TranscriptEvent

struct

ResultId

The unique identifier for the Result.

StartTime

The start time, in milliseconds, of the Result.

EndTime

The end time, in milliseconds, of the Result.

IsPartial

Indicates whether the segment is complete. If IsPartial is true, the segment is not complete. Otherwise, the segment is complete.

ChannelId

The ID of the channel associated with the audio stream.

Alternatives.Entities

Contains entities identified as personally identifiable information (PII) in your transcription output.

Alternatives.Items.Confidence

The confidence score associated with a word or phrase in your transcript. Confidence scores are values between 0 and 1. A larger value indicates a higher probability that the identified item correctly matches the item spoken in your media.

Alternatives.Items.Content

The transcribed word or punctuation mark.

Alternatives.Items.EndTime

The end time, in milliseconds, of the transcribed item.

Alternatives.Items.Speaker

If speaker partitioning is enabled, Speaker labels the speaker of the specified item.

Alternatives.Items.Stable

If partial result stabilization is enabled, Stable indicates whether the specified item is stable (true) or if it may change when the segment is complete (false).

Alternatives.Items.StartTime

The start time, in milliseconds, of the transcribed item.

Alternatives.Items.Type

The type of item identified. Options: PRONUNCIATION (spoken words) and PUNCTUATION.

Alternatives.Items.VocabularyFilterMatch

Indicates whether the specified item matches a word in the vocabulary filter included in your request. If true, there is a vocabulary filter match.

Alternatives.Transcript

Contains transcribed text.
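
One way the Alternatives.Items elements might be used is to flag words that were transcribed with low confidence. The Python sketch below assumes the items have been decoded into dictionaries keyed by the last part of each element name (Content, Type, Confidence), and that punctuation items may lack a confidence score. The sample items, the threshold, and the `low_confidence_words` helper are hypothetical.

```python
# Hypothetical items from one alternative; all values are made up.
SAMPLE_ITEMS = [
    {"Content": "Hello", "Type": "PRONUNCIATION", "Confidence": 0.98},
    {"Content": ",", "Type": "PUNCTUATION", "Confidence": None},
    {"Content": "wurld", "Type": "PRONUNCIATION", "Confidence": 0.41},
    {"Content": ".", "Type": "PUNCTUATION", "Confidence": None},
]


def low_confidence_words(items, threshold=0.6):
    """Return spoken words whose confidence score falls below the threshold."""
    flagged = []
    for item in items:
        if item["Type"] != "PRONUNCIATION":
            continue  # only spoken words are checked, not punctuation
        confidence = item.get("Confidence")
        if confidence is not None and confidence < threshold:
            flagged.append(item["Content"])
    return flagged


print(low_confidence_words(SAMPLE_ITEMS))
```

The threshold of 0.6 is arbitrary. Scores range between 0 and 1, so a suitable cutoff depends on how much manual review you can afford.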

voice_analytics_status

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

source

string

AWS service that produces the event.

account

string

AWS Account ID.

region

string

AWS Account Region.

version

string

Version of the event schema.

id

string

Unique ID of the event.

detail

struct

taskId

Unique ID of the task.

isCaller

Indicates whether the participant is the caller.

streamStartTime

Start time of the stream.

transactionId

The transaction ID of the call.

voiceConnectorId

The Amazon Chime SDK Voice Connector ID.

callId

The call ID of the participant for the associated usage.

detailStatus

Detailed feature type related to service-type.

statusMessage

Status of the task: success or failure.

mediaInsightsPipelineId

Amazon Chime SDK media insights pipeline ID. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

sourceArn

The ARN of the resource that the task runs on.

streamArn

The Kinesis Video Stream ARN that the task is run for. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

channelId

The channel of the streamArn that the task is run for. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

speakerSearchDetails.voiceProfileId

ID of an enrolled voice profile whose voice embedding closely matches the speaker in the call.

speakerSearchDetails.confidenceScore

Number between [0, 1] where a larger number means the machine learning model is more confident about the voice profile match.

speaker_search_status

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

source

string

AWS service that produces the event.

account

string

AWS Account ID.

region

string

AWS Account Region.

version

string

Version of the event schema.

id

string

Unique ID of the event.

detail

struct

taskId

Unique ID of the task.

isCaller

Indicates whether the participant is the caller.

transactionId

The transaction ID of the call. This field is populated if the task originates from a call made through a Voice Connector.

voiceConnectorId

The Amazon Chime SDK Voice Connector ID. This field is populated if the task originates from a call made through a Voice Connector.

mediaInsightsPipelineId

The media insights pipeline ID. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

sourceArn

The ARN of the resource that the task runs on.

streamArn

The Kinesis Video Stream ARN that the task is run for. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

channelId

The channel of the streamArn that the task is run for. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

participantRole

The participant role associated with the channelId in the streamArn. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

detailStatus

Detailed feature type related to service-type.

statusMessage

Status of the task: success or failure.

speakerSearchDetails.voiceProfileId

ID of an enrolled voice profile whose voice embedding closely matches the speaker in the call.

speakerSearchDetails.confidenceScore

Number between [0, 1] where a larger number means the machine learning model is more confident about the voice profile match.
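
Because confidenceScore is a number in [0, 1], a consumer will usually compare it against a threshold before trusting a match. The Python sketch below assumes the detail struct has been decoded into nested dictionaries that mirror the dotted element names in the table. The sample record, the default threshold, and the `confident_voice_profile` helper are hypothetical.

```python
from typing import Optional

# Hypothetical `detail` struct; all values are made up for illustration.
SAMPLE_DETAIL = {
    "taskId": "example-task-id",
    "isCaller": True,
    "speakerSearchDetails": {
        "voiceProfileId": "example-voice-profile-id",
        "confidenceScore": 0.93,
    },
}


def confident_voice_profile(detail: dict, min_score: float = 0.8) -> Optional[str]:
    """Return voiceProfileId when confidenceScore meets min_score, else None."""
    match = detail.get("speakerSearchDetails") or {}
    score = match.get("confidenceScore")
    # Treat a missing or out-of-range score as "no usable match".
    if score is None or not 0.0 <= score <= 1.0:
        return None
    return match.get("voiceProfileId") if score >= min_score else None


print(confident_voice_profile(SAMPLE_DETAIL))
```

The default of 0.8 is only a placeholder. A stricter threshold reduces false matches at the cost of rejecting more genuine ones.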

voice_tone_analysis_status

Column name

Data type

Elements

Definition

time

string

Event generation timestamp in ISO 8601 format.

detail-type

string

Feature type related to service-type.

service-type

string

Name of the AWS service, VoiceAnalytics or CallAnalytics.

source

string

AWS service that produces the event.

account

string

AWS Account ID.

region

string

AWS Account Region.

version

string

Version of the event schema.

id

string

Unique ID of the event.

detail

struct

taskId

Unique ID of the task.

isCaller

Indicates whether the participant is the caller.

transactionId

The transaction ID of the call. This field is populated if the task originates from a call made through a Voice Connector.

voiceConnectorId

The Amazon Chime SDK Voice Connector ID. This field is populated if the task originates from a call made through a Voice Connector.

mediaInsightsPipelineId

The media insights pipeline ID. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

sourceArn

The ARN of the resource that the task runs on.

streamArn

The Kinesis Video Stream ARN that the task is run for. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

channelId

The channel of the streamArn that the task is run for. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

participantRole

The participant role associated with the channelId in the streamArn. This field is populated only for speaker search tasks started through the Media Pipelines SDK, not the Voice SDK.

statusMessage

Status of the task: success or failure.

voiceToneAnalysisDetails.startFragmentNumber

Starting fragment number associated with the streamArn.

voiceToneAnalysisDetails.currentAverageVoiceTone.startTime

Starting timestamp in ISO 8601 format for the speaker's call audio that the current average sentiment is based on.

voiceToneAnalysisDetails.currentAverageVoiceTone.endTime

Ending timestamp in ISO 8601 format for the speaker's call audio that the current average sentiment is based on.

voiceToneAnalysisDetails.currentAverageVoiceTone.beginOffsetMillis

Beginning offset in milliseconds from the starting fragment for the speaker's call audio that the current average sentiment is based on.

voiceToneAnalysisDetails.currentAverageVoiceTone.endOffsetMillis

Ending offset in milliseconds from the starting fragment for the speaker's call audio that the current average sentiment is based on.

voiceToneAnalysisDetails.currentAverageVoiceTone.voiceToneScore.positive

Probabilistic likelihood between [0, 1] that the speaker's sentiment is positive.

voiceToneAnalysisDetails.currentAverageVoiceTone.voiceToneScore.negative

Probabilistic likelihood between [0, 1] that the speaker's sentiment is negative.

voiceToneAnalysisDetails.currentAverageVoiceTone.voiceToneScore.neutral

Probabilistic likelihood between [0, 1] that the speaker's sentiment is neutral.

voiceToneAnalysisDetails.currentAverageVoiceTone.voiceToneLabel

Label with highest probability for the average voice tone score.

voiceToneAnalysisDetails.overallAverageVoiceTone.startTime

Starting timestamp in ISO 8601 format for the speaker's call audio that the overall average sentiment is based on.

voiceToneAnalysisDetails.overallAverageVoiceTone.endTime

Ending timestamp in ISO 8601 format for the speaker's call audio that the overall average sentiment is based on.

voiceToneAnalysisDetails.overallAverageVoiceTone.beginOffsetMillis

Beginning offset in milliseconds from the starting fragment for the speaker's call audio that the overall average sentiment is based on.

voiceToneAnalysisDetails.overallAverageVoiceTone.endOffsetMillis

Ending offset in milliseconds from the starting fragment for the speaker's call audio that the overall average sentiment is based on.

voiceToneAnalysisDetails.overallAverageVoiceTone.voiceToneScore.positive

Probabilistic likelihood between [0, 1] that the speaker's sentiment is positive.

voiceToneAnalysisDetails.overallAverageVoiceTone.voiceToneScore.negative

Probabilistic likelihood between [0, 1] that the speaker's sentiment is negative.

voiceToneAnalysisDetails.overallAverageVoiceTone.voiceToneScore.neutral

Probabilistic likelihood between [0, 1] that the speaker's sentiment is neutral.

voiceToneAnalysisDetails.overallAverageVoiceTone.voiceToneLabel

Sentiment label (positive, negative, or neutral) with the highest sentiment score.
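
The relationship between voiceToneScore and voiceToneLabel can be sketched in a few lines of Python: the label is the sentiment whose score is highest. The sample scores and the `voice_tone_label` helper are hypothetical, and the tie-breaking order used here is an assumption rather than documented service behavior.

```python
# Hypothetical voiceToneScore values; made up for illustration.
SAMPLE_SCORES = {"positive": 0.12, "negative": 0.71, "neutral": 0.17}

# Order used to break ties; this ordering is an assumption.
LABELS = ("positive", "negative", "neutral")


def voice_tone_label(scores: dict) -> str:
    """Return the sentiment label with the highest score."""
    # max() returns the first label in LABELS when several scores are equal.
    return max(LABELS, key=lambda label: scores[label])


print(voice_tone_label(SAMPLE_SCORES))
```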