Transcribing with the AWS CLI
When you use the AWS CLI to start a transcription, you can specify all parameters directly on the command line. Alternatively, you can run the command followed by the AWS Region and the location of a JSON file that contains the request body. Examples throughout this guide show both methods; however, this section focuses on the former method.
The AWS CLI does not support streaming transcriptions.
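For comparison, the JSON file method mentioned above might look like the following sketch. It assumes the AWS CLI's --cli-input-json option and a hypothetical file name, my-first-transcription-job.json; the wrapper function name is also only illustrative.

```shell
# Write the request body to a JSON file (the file name is a hypothetical example).
cat > my-first-transcription-job.json <<'EOF'
{
  "TranscriptionJobName": "my-first-transcription-job",
  "LanguageCode": "en-US",
  "Media": {
    "MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac"
  }
}
EOF

# Run the command with the Region and the location of the request file.
start_job_from_file() {
  aws transcribe start-transcription-job \
    --region us-west-2 \
    --cli-input-json file://my-first-transcription-job.json
}
```

Calling start_job_from_file submits the request described in the file.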
Before you continue, make sure you've:
- Uploaded your media file into an Amazon S3 bucket. If you're unsure how to create an Amazon S3 bucket or upload your file, refer to Create your first Amazon S3 bucket and Upload an object to your bucket.
- Installed the AWS CLI.
You can find all AWS CLI commands for Amazon Transcribe in the AWS CLI Command Reference.
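One way to check both prerequisites before starting a job is sketched below. The function name check_prerequisites is hypothetical, and the sketch assumes that a non-zero exit code from aws s3 ls means the object is missing or not readable with your current credentials.

```shell
# Hypothetical helper: verify that the AWS CLI is installed and the media file is visible.
check_prerequisites() {
  media_uri="$1"
  # Is the aws command available on PATH?
  if ! command -v aws >/dev/null 2>&1; then
    echo "AWS CLI not found on PATH" >&2
    return 1
  fi
  # Can the uploaded media file be listed?
  if ! aws s3 ls "$media_uri" >/dev/null 2>&1; then
    echo "Could not list $media_uri (missing file or insufficient permissions)" >&2
    return 1
  fi
  echo "Prerequisites look OK"
}
```

For example: check_prerequisites s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac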
Starting a new transcription job
To start a new transcription, use the start-transcription-job command.
- In a terminal window, type the following:
aws transcribe start-transcription-job \
A '>' appears on the next line, and you can now continue adding required parameters, as described in the next step.
You can also omit the '\' and append all parameters, separating each with a space.
- With the start-transcription-job command, you must include region, transcription-job-name, media, and either language-code or identify-language.
If you want to specify an output location, include output-bucket-name in your request; if you want to specify a sub-folder of the specified output bucket, also include output-key.
aws transcribe start-transcription-job \
--region us-west-2 \
--transcription-job-name my-first-transcription-job \
--media MediaFileUri=s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac \
--language-code en-US
If appending all parameters, this request looks like:
aws transcribe start-transcription-job --region us-west-2 --transcription-job-name my-first-transcription-job --media MediaFileUri=s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac --language-code en-US
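A request that also sets an output location might look like the following sketch. The bucket name and the my-output-files/ sub-folder are placeholders, and the wrapper function name is hypothetical; the bucket must be one that you own and that Amazon Transcribe can write to.

```shell
# Hypothetical wrapper: same request as above, plus an output bucket and sub-folder.
start_job_with_output() {
  aws transcribe start-transcription-job \
    --region us-west-2 \
    --transcription-job-name my-first-transcription-job \
    --media MediaFileUri=s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac \
    --language-code en-US \
    --output-bucket-name DOC-EXAMPLE-BUCKET \
    --output-key my-output-files/
}
```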
If you choose not to specify an output bucket using output-bucket-name, Amazon Transcribe places your transcription output in a service-managed bucket. Transcripts stored in a service-managed bucket expire after 90 days.
Amazon Transcribe responds with:
{
    "TranscriptionJob": {
        "TranscriptionJobName": "my-first-transcription-job",
        "TranscriptionJobStatus": "IN_PROGRESS",
        "LanguageCode": "en-US",
        "Media": {
            "MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac"
        },
        "StartTime": "2022-03-07T15:03:44.246000-08:00",
        "CreationTime": "2022-03-07T15:03:44.229000-08:00"
    }
}
Your transcription job is successful if TranscriptionJobStatus changes from IN_PROGRESS to COMPLETED. To see the updated TranscriptionJobStatus, use the get-transcription-job or list-transcription-jobs command, as shown in the following section.
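If you prefer to wait for the status change in a script, a polling loop along these lines is one option. This is a sketch only: the function name wait_for_job and the POLL_SECONDS variable are hypothetical, and it assumes the AWS CLI's --query and --output text options and a FAILED status for jobs that do not complete.

```shell
# Hypothetical helper: poll get-transcription-job until the job finishes.
wait_for_job() {
  job_name="$1"
  region="$2"
  while true; do
    # Extract only the status field from the response.
    status=$(aws transcribe get-transcription-job \
      --region "$region" \
      --transcription-job-name "$job_name" \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text) || return 1
    echo "Status: $status"
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED)    return 1 ;;
    esac
    # Wait before asking again (10 seconds unless POLL_SECONDS is set).
    sleep "${POLL_SECONDS:-10}"
  done
}
```

For example: wait_for_job my-first-transcription-job us-west-2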
Getting the status of a transcription job
To get information about your transcription job, use the get-transcription-job command.
The only required parameters for this command are the AWS Region where the job is located and the name of the job.
aws transcribe get-transcription-job \
--region us-west-2 \
--transcription-job-name my-first-transcription-job
Amazon Transcribe responds with:
{
    "TranscriptionJob": {
        "TranscriptionJobName": "my-first-transcription-job",
        "TranscriptionJobStatus": "COMPLETED",
        "LanguageCode": "en-US",
        "MediaSampleRateHertz": 48000,
        "MediaFormat": "flac",
        "Media": {
            "MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac"
        },
        "Transcript": {
            "TranscriptFileUri": "https://s3.the-URI-where-your-job-is-located.json"
        },
        "StartTime": "2022-03-07T15:03:44.246000-08:00",
        "CreationTime": "2022-03-07T15:03:44.229000-08:00",
        "CompletionTime": "2022-03-07T15:04:01.158000-08:00",
        "Settings": {
            "ChannelIdentification": false,
            "ShowAlternatives": false
        }
    }
}
If you've selected your own Amazon S3 bucket for your transcription output, this bucket is listed with TranscriptFileUri. If you've selected a service-managed bucket, a temporary URI is provided; use this URI to download your transcript.
Note
Temporary URIs for service-managed Amazon S3 buckets are only valid for 15 minutes. If you get an AccessDenied error when using the URI, run the get-transcription-job request again to get a new temporary URI.
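For a service-managed bucket, fetching the temporary URI and downloading the transcript can be combined in one step, as in the sketch below. The function name download_transcript is hypothetical, and the sketch assumes that curl is installed and that the URI can be downloaded without further authentication, which applies to temporary URIs but not necessarily to transcripts stored in your own bucket.

```shell
# Hypothetical helper: request a fresh TranscriptFileUri and download the transcript.
download_transcript() {
  job_name="$1"
  region="$2"
  out_file="$3"
  # Extract only the transcript URI from the response.
  uri=$(aws transcribe get-transcription-job \
    --region "$region" \
    --transcription-job-name "$job_name" \
    --query 'TranscriptionJob.Transcript.TranscriptFileUri' \
    --output text) || return 1
  # --fail makes curl exit with a non-zero status on HTTP errors such as 403.
  curl --fail --silent --show-error --output "$out_file" "$uri"
}
```

Because each call requests the URI again, rerunning download_transcript after the 15 minutes have passed uses a new temporary URI.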
Listing your transcription jobs
To list all your transcription jobs in a given AWS Region, use the list-transcription-jobs command.
The only required parameter for this command is the AWS Region in which your transcription jobs are located.
aws transcribe list-transcription-jobs \
--region us-west-2
Amazon Transcribe responds with:
{
    "NextToken": "A-very-long-string",
    "TranscriptionJobSummaries": [
        {
            "TranscriptionJobName": "my-first-transcription-job",
            "CreationTime": "2022-03-07T15:03:44.229000-08:00",
            "StartTime": "2022-03-07T15:03:44.246000-08:00",
            "CompletionTime": "2022-03-07T15:04:01.158000-08:00",
            "LanguageCode": "en-US",
            "TranscriptionJobStatus": "COMPLETED",
            "OutputLocationType": "SERVICE_BUCKET"
        }
    ]
}
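When the full response is more than you need, the list can be narrowed. The sketch below assumes the optional --status filter of list-transcription-jobs and the AWS CLI's --query and --output text options; the function name is hypothetical.

```shell
# Hypothetical helper: print only the names of completed jobs in a Region.
list_completed_job_names() {
  region="$1"
  aws transcribe list-transcription-jobs \
    --region "$region" \
    --status COMPLETED \
    --query 'TranscriptionJobSummaries[].TranscriptionJobName' \
    --output text
}
```

For example: list_completed_job_names us-west-2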
Deleting your transcription job
To delete your transcription job, use the delete-transcription-job command.
The only required parameters for this command are the AWS Region where the job is located and the name of the job.
aws transcribe delete-transcription-job \
--region us-west-2 \
--transcription-job-name my-first-transcription-job
To confirm your delete request is successful, you can run the list-transcription-jobs command. Your job should no longer appear in the list.
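The delete-and-confirm sequence can be scripted as in the following sketch. The function name delete_and_confirm is hypothetical, and the sketch assumes the optional --job-name-contains filter of list-transcription-jobs together with the AWS CLI's --query and --output text options.

```shell
# Hypothetical helper: delete a job, then check that it is no longer listed.
delete_and_confirm() {
  job_name="$1"
  region="$2"
  aws transcribe delete-transcription-job \
    --region "$region" \
    --transcription-job-name "$job_name" || return 1
  # List the names of any remaining jobs whose name contains the deleted name.
  remaining=$(aws transcribe list-transcription-jobs \
    --region "$region" \
    --job-name-contains "$job_name" \
    --query 'TranscriptionJobSummaries[].TranscriptionJobName' \
    --output text) || return 1
  # Text output separates names with tabs; compare one name per line, exactly.
  if printf '%s\n' "$remaining" | tr '\t' '\n' | grep -qxF -- "$job_name"; then
    echo "Job still listed: $job_name" >&2
    return 1
  fi
  echo "Deleted: $job_name"
}
```

For example: delete_and_confirm my-first-transcription-job us-west-2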