Amazon Transcribe
Developer Guide

StartTranscriptionJob

Starts an asynchronous job to transcribe speech to text.

Request Syntax

{
   "LanguageCode": "string",
   "Media": {
      "MediaFileUri": "string"
   },
   "MediaFormat": "string",
   "MediaSampleRateHertz": number,
   "OutputBucketName": "string",
   "OutputEncryptionKMSKeyId": "string",
   "Settings": {
      "ChannelIdentification": boolean,
      "MaxSpeakerLabels": number,
      "ShowSpeakerLabels": boolean,
      "VocabularyName": "string"
   },
   "TranscriptionJobName": "string"
}

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.
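As an illustration of the JSON shape above, the following minimal sketch assembles a request body as a plain Python dictionary. The helper name build_request, the job name, and the media URI are hypothetical placeholders, not values defined by the service. Only the three required fields are always emitted; optional fields are left out when they are not supplied.

```python
import json


def build_request(job_name, language_code, media_uri,
                  media_format=None, output_bucket=None, settings=None):
    """Assemble a StartTranscriptionJob request body as a plain dict.

    build_request is a hypothetical helper, not part of any AWS SDK.
    """
    # The three fields marked "Required: Yes" in this reference.
    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
    }
    # Optional fields are omitted entirely when not supplied.
    if media_format is not None:
        request["MediaFormat"] = media_format
    if output_bucket is not None:
        request["OutputBucketName"] = output_bucket
    if settings:
        request["Settings"] = settings
    return request


# Placeholder values for illustration only.
body = build_request(
    "example-job",
    "en-US",
    "s3://example-bucket/audio.wav",
    media_format="wav",
    settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": 2},
)
print(json.dumps(body, indent=2))
```

A dictionary like this could be serialized as the request body, or its keys passed as keyword arguments to an SDK call such as the boto3 Transcribe client's start_transcription_job method.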

LanguageCode

The language code for the language used in the input media file.

Type: String

Valid Values: en-US | es-US | en-AU | fr-CA | en-GB | de-DE | pt-BR | fr-FR | it-IT | ko-KR | es-ES | en-IN | hi-IN | ar-SA | ru-RU | zh-CN

Required: Yes

Media

An object that describes the input media for a transcription job.

Type: Media object

Required: Yes

MediaFormat

The format of the input media file.

Type: String

Valid Values: mp3 | mp4 | wav | flac

Required: No

MediaSampleRateHertz

The sample rate, in Hertz, of the audio track in the input media file.

If you do not specify the media sample rate, Amazon Transcribe determines the sample rate. If you specify the sample rate, it must match the sample rate detected by Amazon Transcribe. In most cases, you should leave the MediaSampleRateHertz field blank and let Amazon Transcribe determine the sample rate.

Type: Integer

Valid Range: Minimum value of 8000. Maximum value of 48000.

Required: No
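The guidance above can be sketched as a small client-side helper: omit the field by default so that the service detects the rate, and check the documented range only when a value is given. The function name sample_rate_field is a hypothetical illustration. This check cannot confirm that the value matches the rate the service detects.

```python
def sample_rate_field(sample_rate_hertz=None):
    """Return the MediaSampleRateHertz entry for a request, or {} to omit it.

    sample_rate_field is a hypothetical helper, not part of any AWS SDK.
    """
    if sample_rate_hertz is None:
        # Leaving the field out lets Amazon Transcribe determine the rate.
        return {}
    if isinstance(sample_rate_hertz, bool) or not isinstance(sample_rate_hertz, int):
        raise TypeError("MediaSampleRateHertz must be an integer")
    if not 8000 <= sample_rate_hertz <= 48000:
        raise ValueError("MediaSampleRateHertz must be between 8000 and 48000")
    return {"MediaSampleRateHertz": sample_rate_hertz}
```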

OutputBucketName

The location where the transcription is stored.

If you set the OutputBucketName, Amazon Transcribe puts the transcription in the specified S3 bucket. When you call the GetTranscriptionJob operation, the operation returns this location in the TranscriptFileUri field. The S3 bucket must have permissions that allow Amazon Transcribe to put files in the bucket. For more information, see Permissions Required for IAM User Roles.

You can specify an AWS Key Management Service (KMS) key to encrypt the output of your transcription using the OutputEncryptionKMSKeyId parameter. If you don't specify a KMS key, Amazon Transcribe uses the default Amazon S3 key for server-side encryption of transcripts that are placed in your S3 bucket.

If you don't set the OutputBucketName, Amazon Transcribe generates a pre-signed URL (a shareable URL that provides secure access to your transcription) and returns it in the TranscriptFileUri field. Use this URL to download the transcription.

Type: String

Length Constraints: Maximum length of 64.

Pattern: [a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9]

Required: No
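The length constraint and pattern above can be checked locally before sending a request. The sketch below assumes the pattern must match the whole bucket name, which this reference does not state explicitly. The function name is hypothetical. Passing this check does not show that the bucket exists or that Amazon Transcribe has permission to write to it.

```python
import re

# Pattern copied from this reference. Applying it to the whole string
# (fullmatch) is an assumption.
BUCKET_PATTERN = re.compile(r"[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9]")


def is_valid_output_bucket(name):
    """Check a bucket name against the documented length and pattern."""
    return len(name) <= 64 and BUCKET_PATTERN.fullmatch(name) is not None
```

Under this reading, a name such as "my-transcripts" passes, while "My_Bucket" (uppercase letters and an underscore) and "ab" (shorter than the three characters the pattern requires) do not.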

OutputEncryptionKMSKeyId

The Amazon Resource Name (ARN) of the AWS Key Management Service (KMS) key used to encrypt the output of the transcription job. The user calling the StartTranscriptionJob operation must have permission to use the specified KMS key.

You can use either of the following to identify a KMS key in the current account:

  • KMS Key ID: "1234abcd-12ab-34cd-56ef-1234567890ab"

  • KMS Key Alias: "alias/ExampleAlias"

You can use either of the following to identify a KMS key in the current account or another account:

  • Amazon Resource Name (ARN) of a KMS Key: "arn:aws:kms:region:account ID:key/1234abcd-12ab-34cd-56ef-1234567890ab"

  • ARN of a KMS Key Alias: "arn:aws:kms:region:account ID:alias/ExampleAlias"

If you don't specify an encryption key, the output of the transcription job is encrypted with the default Amazon S3 key (SSE-S3).

If you specify a KMS key to encrypt your output, you must also specify an output location in the OutputBucketName parameter.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 2048.

Pattern: ^[A-Za-z0-9][A-Za-z0-9:_/+=,@.-]{0,2048}$

Required: No
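Two of the rules above lend themselves to a client-side check: the key identifier must satisfy the documented length and pattern, and a KMS key may only be specified together with OutputBucketName. The following sketch uses a hypothetical helper name, and the region and account number in the usage note are placeholders. It does not verify that the key exists or that the caller may use it.

```python
import re

# Pattern copied from this reference.
KMS_KEY_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9:_/+=,@.-]{0,2048}$")


def encryption_fields(kms_key_id=None, output_bucket=None):
    """Return the output-related request fields after checking them.

    encryption_fields is a hypothetical helper, not part of any AWS SDK.
    """
    fields = {}
    if output_bucket is not None:
        fields["OutputBucketName"] = output_bucket
    if kms_key_id is None:
        # No key given: the output is encrypted with the default S3 key.
        return fields
    if output_bucket is None:
        raise ValueError("OutputEncryptionKMSKeyId requires OutputBucketName")
    if not 1 <= len(kms_key_id) <= 2048:
        raise ValueError("key identifier must be 1 to 2048 characters long")
    if KMS_KEY_PATTERN.fullmatch(kms_key_id) is None:
        raise ValueError("key identifier contains unsupported characters")
    fields["OutputEncryptionKMSKeyId"] = kms_key_id
    return fields
```

For example, encryption_fields("alias/ExampleAlias", "my-transcripts") returns both fields, as does a full ARN such as "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab". Calling it with a key but no bucket raises ValueError.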

Settings

A Settings object that provides optional settings for a transcription job.

Type: Settings object

Required: No

TranscriptionJobName

The name of the job. Note that you can't use the strings "." or ".." by themselves as the job name. The name must also be unique within an AWS account.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 200.

Pattern: ^[0-9a-zA-Z._-]+

Required: Yes
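The naming rules above can be sketched as a local check. The documented pattern has no end anchor, so this sketch assumes that every character of the name must match. The function name is hypothetical. Uniqueness within the account can only be determined by the service and is not checked here.

```python
import re

# Character class from the documented pattern. Requiring the whole name
# to match (fullmatch) is an assumption.
JOB_NAME_PATTERN = re.compile(r"[0-9a-zA-Z._-]+")


def is_valid_job_name(name):
    """Check a job name against the documented length and character rules."""
    if name in (".", ".."):
        # These two strings are explicitly disallowed as job names.
        return False
    if not 1 <= len(name) <= 200:
        return False
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```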

Response Syntax

{
   "TranscriptionJob": {
      "CompletionTime": number,
      "CreationTime": number,
      "FailureReason": "string",
      "LanguageCode": "string",
      "Media": {
         "MediaFileUri": "string"
      },
      "MediaFormat": "string",
      "MediaSampleRateHertz": number,
      "Settings": {
         "ChannelIdentification": boolean,
         "MaxSpeakerLabels": number,
         "ShowSpeakerLabels": boolean,
         "VocabularyName": "string"
      },
      "Transcript": {
         "TranscriptFileUri": "string"
      },
      "TranscriptionJobName": "string",
      "TranscriptionJobStatus": "string"
   }
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

TranscriptionJob

An object containing details of the asynchronous transcription job.

Type: TranscriptionJob object
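As a sketch of how a client might read the response, the example below parses a sample body and pulls out the fields most callers need. The sample values are invented for illustration. It also assumes that CreationTime is expressed in seconds since the Unix epoch and that a newly started job reports the status IN_PROGRESS; neither is stated in this section. Fields that are absent from a response, such as Transcript on a job that has not finished, are returned as None.

```python
import json
from datetime import datetime, timezone

# Invented sample response for illustration only.
sample_response = """
{
  "TranscriptionJob": {
    "CreationTime": 1546300800,
    "LanguageCode": "en-US",
    "Media": {"MediaFileUri": "s3://example-bucket/audio.wav"},
    "MediaFormat": "wav",
    "TranscriptionJobName": "example-job",
    "TranscriptionJobStatus": "IN_PROGRESS"
  }
}
"""


def summarize_job(response_text):
    """Extract commonly used fields from a StartTranscriptionJob response."""
    job = json.loads(response_text)["TranscriptionJob"]
    created = job.get("CreationTime")
    return {
        "name": job["TranscriptionJobName"],
        "status": job["TranscriptionJobStatus"],
        # Assumption: the timestamp is in epoch seconds.
        "created": (datetime.fromtimestamp(created, tz=timezone.utc)
                    if created is not None else None),
        "transcript_uri": job.get("Transcript", {}).get("TranscriptFileUri"),
        "failure_reason": job.get("FailureReason"),
    }


summary = summarize_job(sample_response)
print(summary["name"], summary["status"])
```

Because the job is asynchronous, the transcript location normally becomes available later, through the GetTranscriptionJob operation.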

Errors

For information about the errors that are common to all actions, see Common Errors.

BadRequestException

Your request didn't pass one or more validation tests. For example, if the transcription you're trying to delete doesn't exist or if it is in a non-terminal state (for example, it's "in progress"). See the exception Message field for more information.

HTTP Status Code: 400

ConflictException

When you are using the CreateVocabulary operation, the JobName field is a duplicate of a previously entered job name. Resend your request with a different name.

When you are using the UpdateVocabulary operation, there are two jobs running at the same time. Resend the second request later.

HTTP Status Code: 400

InternalFailureException

There was an internal error. Check the error message and try your request again.

HTTP Status Code: 500

LimitExceededException

Either you have sent too many requests or your input file is too long. Wait before you resend your request, or use a smaller file and resend the request.

HTTP Status Code: 400
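The advice to wait and resend can be sketched as a retry loop with exponential backoff. The ServiceError class and the call_with_retry helper below are hypothetical stand-ins and do not belong to any AWS SDK; a real client would map its own exception type onto the error names listed above. Retrying only helps when the cause is transient. If LimitExceededException was raised because the input file is too long, a smaller file is needed instead.

```python
import time


class ServiceError(Exception):
    """Hypothetical stand-in for an error returned by the service."""

    def __init__(self, code, message=""):
        super().__init__(message or code)
        self.code = code


# Errors from this section for which resending later may succeed.
RETRYABLE = {"LimitExceededException", "InternalFailureException"}


def call_with_retry(operation, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying retryable errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ServiceError as err:
            # Give up on non-retryable errors or when attempts run out.
            if err.code not in RETRYABLE or attempt == max_attempts:
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Errors such as BadRequestException and ConflictException are raised immediately here, because resending the same request unchanged would not be expected to succeed.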

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: