Use CreateVocabulary with an AWS SDK or command line tool - AWS SDK Code Examples

There are more AWS SDK examples available in the AWS Doc SDK Examples GitHub repo.

Use CreateVocabulary with an AWS SDK or command line tool

The following code examples show how to use CreateVocabulary.

Action examples are code excerpts from larger programs and must be run in context. You can see this action in context in the following code example:

.NET
AWS SDK for .NET
Note

There's more on GitHub. Find the complete example and learn how to set up and run in the AWS Code Examples Repository.

/// <summary> /// Create a custom vocabulary using a list of phrases. Custom vocabularies /// improve transcription accuracy for one or more specific words. /// </summary> /// <param name="languageCode">The language code of the vocabulary.</param> /// <param name="phrases">Phrases to use in the vocabulary.</param> /// <param name="vocabularyName">Name for the vocabulary.</param> /// <returns>The state of the custom vocabulary.</returns> public async Task<VocabularyState> CreateCustomVocabulary(LanguageCode languageCode, List<string> phrases, string vocabularyName) { var response = await _amazonTranscribeService.CreateVocabularyAsync( new CreateVocabularyRequest { LanguageCode = languageCode, Phrases = phrases, VocabularyName = vocabularyName }); return response.VocabularyState; }
CLI
AWS CLI

To create a custom vocabulary

The following create-vocabulary example creates a custom vocabulary. To create a custom vocabulary, you must have created a text file with all the terms that you want to transcribe more accurately. For vocabulary-file-uri, specify the Amazon Simple Storage Service (Amazon S3) URI of that text file. For language-code, specify a language code corresponding to the language of your custom vocabulary. For vocabulary-name, specify what you want to call your custom vocabulary.

aws transcribe create-vocabulary \ --language-code language-code \ --vocabulary-name cli-vocab-example \ --vocabulary-file-uri s3://DOC-EXAMPLE-BUCKET/Amazon-S3-prefix/the-text-file-for-the-custom-vocabulary.txt

Output:

{ "VocabularyName": "cli-vocab-example", "LanguageCode": "language-code", "VocabularyState": "PENDING" }

For more information, see Custom Vocabularies in the Amazon Transcribe Developer Guide.

Python
SDK for Python (Boto3)
Note

There's more on GitHub. Find the complete example and learn how to set up and run in the AWS Code Examples Repository.

def create_vocabulary( vocabulary_name, language_code, transcribe_client, phrases=None, table_uri=None ): """ Creates a custom vocabulary that can be used to improve the accuracy of transcription jobs. This function returns as soon as the vocabulary processing is started. Call get_vocabulary to get the current status of the vocabulary. The vocabulary is ready to use when its status is 'READY'. :param vocabulary_name: The name of the custom vocabulary. :param language_code: The language code of the vocabulary. For example, en-US or nl-NL. :param transcribe_client: The Boto3 Transcribe client. :param phrases: A list of comma-separated phrases to include in the vocabulary. :param table_uri: A table of phrases and pronunciation hints to include in the vocabulary. :return: Information about the newly created vocabulary. """ try: vocab_args = {"VocabularyName": vocabulary_name, "LanguageCode": language_code} if phrases is not None: vocab_args["Phrases"] = phrases elif table_uri is not None: vocab_args["VocabularyFileUri"] = table_uri response = transcribe_client.create_vocabulary(**vocab_args) logger.info("Created custom vocabulary %s.", response["VocabularyName"]) except ClientError: logger.exception("Couldn't create custom vocabulary %s.", vocabulary_name) raise else: return response
  • For API details, see CreateVocabulary in AWS SDK for Python (Boto3) API Reference.