Amazon Transcribe
Developer Guide

CreateVocabulary

Creates a new custom vocabulary that you can use to change the way Amazon Transcribe handles transcription of an audio file.

Request Syntax

{
   "LanguageCode": "string",
   "Phrases": [ "string" ],
   "VocabularyFileUri": "string",
   "VocabularyName": "string"
}

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

LanguageCode

The language code of the vocabulary entries.

Type: String

Valid Values: en-US | es-US | en-AU | fr-CA | en-GB | de-DE | pt-BR | fr-FR | it-IT | ko-KR

Required: Yes

Phrases

An array of strings that contains the vocabulary entries.

Type: Array of strings

Length Constraints: Minimum length of 0. Maximum length of 256.

Required: No

VocabularyFileUri

The S3 location of the text file that contains the definition of the custom vocabulary. The URI must point to a location in the same region as the API endpoint that you are calling. The general form is:

https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

For example:

https://s3-us-east-1.amazonaws.com/examplebucket/vocab.txt

For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.

For more information about custom vocabularies, see Custom Vocabularies.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 2000.

Required: No
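The general form above can be assembled from its parts. The following Python sketch is illustrative only: the function name vocabulary_file_uri is hypothetical, and percent-encoding the object key is an assumption that this page does not state.

```python
from urllib.parse import quote


def vocabulary_file_uri(region, bucket, key):
    """Build a URI of the form https://s3-<aws-region>.amazonaws.com/<bucket-name>/<key>."""
    # Assumption: special characters in the key are percent-encoded; "/" is kept
    # so that a key prefix such as "vocab/en/terms.txt" stays readable.
    uri = f"https://s3-{region}.amazonaws.com/{bucket}/{quote(key)}"
    # Documented length constraint for VocabularyFileUri: 1 to 2000 characters.
    if not 1 <= len(uri) <= 2000:
        raise ValueError("VocabularyFileUri must be 1-2000 characters long")
    return uri


print(vocabulary_file_uri("us-east-1", "examplebucket", "vocab.txt"))
```

With the values from the example above, the function returns the same URI that the example shows.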

VocabularyName

The name of the vocabulary. The name must be unique within an AWS account. The name is case-sensitive.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 200.

Pattern: ^[0-9a-zA-Z._-]+

Required: Yes
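The constraints listed for the request parameters can be checked on the client before a request is sent. The following Python sketch builds the body shown in Request Syntax and validates it against the documented values. The function name build_create_vocabulary_request is hypothetical. Two points are assumptions rather than documented behavior: the name pattern is applied to the whole name, and the 0 to 256 length constraint is applied to each phrase. The service remains the authority on what it accepts.

```python
import json
import re

# Valid values documented for LanguageCode.
VALID_LANGUAGE_CODES = {
    "en-US", "es-US", "en-AU", "fr-CA", "en-GB",
    "de-DE", "pt-BR", "fr-FR", "it-IT", "ko-KR",
}

# Assumption: the documented pattern ^[0-9a-zA-Z._-]+ must match the whole name.
NAME_PATTERN = re.compile(r"[0-9a-zA-Z._-]+")


def build_create_vocabulary_request(name, language_code, phrases=None, file_uri=None):
    """Return the request body as a dict, or raise ValueError on a violated constraint."""
    if not 1 <= len(name) <= 200 or not NAME_PATTERN.fullmatch(name):
        raise ValueError("VocabularyName must be 1-200 characters from [0-9a-zA-Z._-]")
    if language_code not in VALID_LANGUAGE_CODES:
        raise ValueError(f"Unsupported LanguageCode: {language_code}")

    body = {"LanguageCode": language_code, "VocabularyName": name}
    if phrases is not None:
        # Assumption: the length constraint applies to each phrase individually.
        for phrase in phrases:
            if len(phrase) > 256:
                raise ValueError("Each phrase must be at most 256 characters long")
        body["Phrases"] = list(phrases)
    if file_uri is not None:
        if not 1 <= len(file_uri) <= 2000:
            raise ValueError("VocabularyFileUri must be 1-2000 characters long")
        body["VocabularyFileUri"] = file_uri
    return body


request = build_create_vocabulary_request("medical-terms", "en-US", phrases=["stent"])
print(json.dumps(request, indent=2))
```

Validating locally only catches malformed input early. It does not detect a name that already exists in the account, which only the service can report.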

Response Syntax

{
   "FailureReason": "string",
   "LanguageCode": "string",
   "LastModifiedTime": number,
   "VocabularyName": "string",
   "VocabularyState": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

FailureReason

If the VocabularyState field is FAILED, this field contains information about why the vocabulary request failed.

Type: String

LanguageCode

The language code of the vocabulary entries.

Type: String

Valid Values: en-US | es-US | en-AU | fr-CA | en-GB | de-DE | pt-BR | fr-FR | it-IT | ko-KR

LastModifiedTime

The date and time that the vocabulary was created.

Type: Timestamp

VocabularyName

The name of the vocabulary.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 200.

Pattern: ^[0-9a-zA-Z._-]+

VocabularyState

The processing state of the vocabulary. When the VocabularyState field contains READY, the vocabulary is ready to be used in a StartTranscriptionJob request.

Type: String

Valid Values: PENDING | READY | FAILED
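A caller typically inspects VocabularyState before using the vocabulary. The following Python sketch shows one way to interpret a response that has already been parsed into a dict. The names vocabulary_is_ready and VocabularyFailed are hypothetical, and the fallback text used when FailureReason is absent is an assumption.

```python
class VocabularyFailed(Exception):
    """Raised when the response reports VocabularyState FAILED."""


def vocabulary_is_ready(response):
    """Return True for READY, False for PENDING, and raise for FAILED."""
    state = response["VocabularyState"]
    if state == "READY":
        return True
    if state == "PENDING":
        return False
    if state == "FAILED":
        # FailureReason explains the failure when the state is FAILED.
        raise VocabularyFailed(response.get("FailureReason", "no reason given"))
    raise ValueError(f"Unexpected VocabularyState: {state}")


sample = {"VocabularyName": "medical-terms", "VocabularyState": "PENDING"}
print(vocabulary_is_ready(sample))
```

Treating FAILED as an exception rather than a return value keeps a failed vocabulary from being mistaken for one that is still pending.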

Errors

For information about the errors that are common to all actions, see Common Errors.

BadRequestException

Your request didn't pass one or more validation tests. For example, validation fails if the transcription you're trying to delete doesn't exist or if it is in a non-terminal state (such as "in progress"). See the exception Message field for more information.

HTTP Status Code: 400

ConflictException

When you are using the StartTranscriptionJob operation, the JobName field is a duplicate of a previously entered job name. Resend your request with a different name.

When you are using the UpdateVocabulary operation, there are two jobs running at the same time. Resend the second request later.

HTTP Status Code: 400

InternalFailureException

There was an internal error. Check the error message and try your request again.

HTTP Status Code: 500

LimitExceededException

Either you have sent too many requests or your input file is too long. Wait before you resend your request, or use a smaller file and resend the request.

HTTP Status Code: 400
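Three of the four errors above share HTTP status code 400, so a client has to use the error name rather than the status code to decide what to do next. The following Python sketch records the documented status codes and one possible reading of the descriptions: an error is retried unchanged only when its description says to wait or try again. The names ERROR_STATUS, RETRY_UNCHANGED, and should_retry are hypothetical, and treating ConflictException as not retryable without changes is an interpretation, not documented behavior.

```python
# HTTP status codes as documented for each error.
ERROR_STATUS = {
    "BadRequestException": 400,
    "ConflictException": 400,
    "InternalFailureException": 500,
    "LimitExceededException": 400,
}

# Errors whose descriptions suggest resending the same request after a delay.
RETRY_UNCHANGED = {"InternalFailureException", "LimitExceededException"}


def should_retry(error_name):
    """Return True if resending the same request later is a reasonable response."""
    if error_name not in ERROR_STATUS:
        raise ValueError(f"Unknown error: {error_name}")
    return error_name in RETRY_UNCHANGED


for name in sorted(ERROR_STATUS):
    print(name, ERROR_STATUS[name], should_retry(name))
```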
