This API Documentation is now deprecated

A word, phrase, or punctuation mark in your transcription output, along with various associated attributes, such as confidence score, type, and start and end times.


  • Item


Confidence?: number

The confidence score associated with a word or phrase in your transcript.

Confidence scores are values between 0 and 1. A larger value indicates a higher probability that the identified item correctly matches the item spoken in your media.
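As a rough illustration of how the score might be used, the sketch below drops items whose confidence falls under a chosen cutoff. The local `Item` interface only mirrors fields documented on this page, and `filterByConfidence` and the 0.5 threshold are hypothetical choices for this example, not part of the API.

```typescript
// Local mirror of the documented fields (an assumption for illustration;
// real code would normally take the type from the SDK).
interface Item {
  Confidence?: number;
  Content?: string;
}

// Hypothetical helper: keep items scoring at or above `threshold`.
// Confidence is optional, so items without a score are kept rather than dropped.
function filterByConfidence(items: Item[], threshold: number): Item[] {
  return items.filter(
    (item) => item.Confidence === undefined || item.Confidence >= threshold
  );
}

// Illustrative data, not real service output.
const sampleItems: Item[] = [
  { Content: "hello", Confidence: 0.98 },
  { Content: "whirled", Confidence: 0.41 },
  { Content: "." },
];

const confident = filterByConfidence(sampleItems, 0.5);
console.log(confident.map((item) => item.Content));
```

Keeping unscored items is a design choice made here so that an item is never discarded only because its score is absent.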

Content?: string

The word or punctuation that was transcribed.

EndTime?: number

The end time, in milliseconds, of the transcribed item.

Speaker?: string

If speaker partitioning is enabled, Speaker labels the speaker of the specified item.
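One possible way to use the label is to group consecutive items into speaker turns. This is a minimal sketch: `SpeakerTurn` and `groupIntoTurns` are hypothetical names, the label values `"0"` and `"1"` are illustrative, and attaching unlabeled items to the current turn is an assumption of this example.

```typescript
// Local mirror of the documented fields (an assumption for illustration).
interface Item {
  Content?: string;
  Speaker?: string;
}

// Hypothetical shape for one uninterrupted stretch of speech by one speaker.
interface SpeakerTurn {
  speaker: string;
  words: string[];
}

// Hypothetical helper: start a new turn whenever the Speaker label changes.
function groupIntoTurns(items: Item[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const item of items) {
    if (item.Content === undefined) continue;
    const last = turns[turns.length - 1];
    // Assumption: an item without a label belongs to the current turn.
    const speaker = item.Speaker ?? last?.speaker ?? "unknown";
    if (last !== undefined && last.speaker === speaker) {
      last.words.push(item.Content);
    } else {
      turns.push({ speaker, words: [item.Content] });
    }
  }
  return turns;
}

// Illustrative data, not real service output.
const labeledItems: Item[] = [
  { Content: "good", Speaker: "0" },
  { Content: "morning", Speaker: "0" },
  { Content: "hi", Speaker: "1" },
  { Content: "there" },
];

const turns = groupIntoTurns(labeledItems);
console.log(turns);
```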

Stable?: boolean

If partial result stabilization is enabled, Stable indicates whether the specified item is stable (true) or may still change when the segment is complete (false).
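For example, a client that displays partial results might render only the leading run of stable items, since later items can still be revised. The sketch below assumes that approach; `stablePrefix` is a hypothetical helper, and the local `Item` interface only mirrors fields documented on this page.

```typescript
// Local mirror of the documented fields (an assumption for illustration).
interface Item {
  Content?: string;
  Stable?: boolean;
}

// Hypothetical helper: return the items up to, but not including, the first
// one that is not marked stable. A missing flag is treated as not stable.
function stablePrefix(items: Item[]): Item[] {
  const firstUnstable = items.findIndex((item) => item.Stable !== true);
  return firstUnstable === -1 ? items.slice() : items.slice(0, firstUnstable);
}

// Illustrative data, not real service output.
const partialItems: Item[] = [
  { Content: "turn", Stable: true },
  { Content: "left", Stable: true },
  { Content: "at", Stable: false },
  { Content: "the", Stable: false },
];

const stableWords = stablePrefix(partialItems).map((item) => item.Content);
console.log(stableWords);
```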

StartTime?: number

The start time, in milliseconds, of the transcribed item.
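Because both timestamps are optional, code that derives a duration has to handle missing values. The following sketch shows one way to do that; `itemDuration` is a hypothetical helper, and the result is expressed in whatever unit the two fields use, with no conversion applied.

```typescript
// Local mirror of the documented fields (an assumption for illustration).
interface Item {
  Content?: string;
  StartTime?: number;
  EndTime?: number;
}

// Hypothetical helper: EndTime minus StartTime, or undefined when either
// timestamp is missing. The result is in the same unit as the inputs.
function itemDuration(item: Item): number | undefined {
  if (item.StartTime === undefined || item.EndTime === undefined) {
    return undefined;
  }
  return item.EndTime - item.StartTime;
}

// Illustrative values, not real service output.
const timedItem: Item = { Content: "hello", StartTime: 120, EndTime: 480 };
console.log(itemDuration(timedItem));
```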

Type?: string

The type of item identified. Valid values are PRONUNCIATION (spoken words) and PUNCTUATION.
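The item type matters when rebuilding readable text, because punctuation normally attaches to the preceding word while spoken words are separated by spaces. A minimal sketch under that assumption follows; `itemsToText` is a hypothetical helper and the local `Item` interface only mirrors fields documented on this page.

```typescript
// Local mirror of the documented fields (an assumption for illustration).
interface Item {
  Content?: string;
  Type?: string;
}

// Hypothetical helper: join item contents into a sentence, adding a space
// before spoken words but not before punctuation.
function itemsToText(items: Item[]): string {
  let text = "";
  for (const item of items) {
    const content = item.Content ?? "";
    if (content === "") continue;
    if (item.Type === "PUNCTUATION" || text === "") {
      text += content;
    } else {
      text += " " + content;
    }
  }
  return text;
}

// Illustrative data, not real service output.
const typedItems: Item[] = [
  { Content: "Hello", Type: "PRONUNCIATION" },
  { Content: ",", Type: "PUNCTUATION" },
  { Content: "world", Type: "PRONUNCIATION" },
  { Content: ".", Type: "PUNCTUATION" },
];

console.log(itemsToText(typedItems)); // prints "Hello, world."
```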

VocabularyFilterMatch?: boolean

Indicates whether the specified item matches a word in the vocabulary filter included in your request. If true, there is a vocabulary filter match.
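A client could use this flag to redact matched words before displaying a transcript. The sketch below is one way to do so on the client side; `maskFilteredItems` and the `"***"` placeholder are hypothetical choices for this example, not behavior defined by the API.

```typescript
// Local mirror of the documented fields (an assumption for illustration).
interface Item {
  Content?: string;
  VocabularyFilterMatch?: boolean;
}

// Hypothetical helper: return copies of the items with the content of every
// vocabulary filter match replaced by `mask`. The input array is not modified.
function maskFilteredItems(items: Item[], mask: string = "***"): Item[] {
  return items.map((item) =>
    item.VocabularyFilterMatch === true ? { ...item, Content: mask } : item
  );
}

// Illustrative data, not real service output.
const filteredItems: Item[] = [
  { Content: "that", VocabularyFilterMatch: false },
  { Content: "darn", VocabularyFilterMatch: true },
  { Content: "printer" },
];

const masked = maskFilteredItems(filteredItems);
console.log(masked.map((item) => item.Content).join(" "));
```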