Using the Content Localization on AWS application

Access the web application

After the solution successfully launches, you can access the web application. The solution sends an email containing information to access the web application, including a temporary password. The first time you log in to the application, you will be prompted to change your password.

Identify the URL

Use the following procedure to identify the URL for the web application. This will allow you to sign in.

  1. Sign in to the AWS CloudFormation console and select the solution’s stack.

  2. On the Stacks page, select the WebStack nested stack.

  3. Choose the Outputs tab.

  4. Under the Key column, locate ContentLocalizationSolution, and select the corresponding value.

  5. Open the web application in a new tab or browser window.

  6. Sign in with your username (the Admin email) and the temporary password provided in the invitation email.

  7. After signing in, follow the prompts to create a new password.
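The console lookup above can also be approximated in a script. The following is a minimal sketch, assuming you already have a response dictionary shaped like the output of the CloudFormation DescribeStacks API (for example, from boto3's `describe_stacks`). The output key `ContentLocalizationSolution` comes from the procedure above; the function name, stack name, and URL are hypothetical placeholders.

```python
from typing import Optional


def find_stack_output(describe_stacks_response: dict, output_key: str) -> Optional[str]:
    """Return the value of a stack output, or None if the key is absent.

    Expects a dict shaped like the CloudFormation DescribeStacks response:
    {"Stacks": [{"Outputs": [{"OutputKey": ..., "OutputValue": ...}]}]}
    """
    for stack in describe_stacks_response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == output_key:
                return output.get("OutputValue")
    return None


# Hypothetical sample response. With boto3 installed and credentials configured,
# a real one could come from:
#   boto3.client("cloudformation").describe_stacks(StackName="<WebStack nested stack name>")
sample_response = {
    "Stacks": [
        {
            "StackName": "example-WebStack",  # placeholder name
            "Outputs": [
                {
                    "OutputKey": "ContentLocalizationSolution",
                    "OutputValue": "https://example.cloudfront.net",  # placeholder URL
                }
            ],
        }
    ]
}

print(find_stack_output(sample_response, "ContentLocalizationSolution"))
```

Keeping the lookup separate from the API call means the same function works on a saved response, which is convenient when you do not have direct access to the account.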

Upload a video and run a workflow

Supported input types

This solution uses AWS Elemental MediaConvert to transcode uploaded videos into the MP4 format required by the analysis operators, so it supports the same video input formats as MediaConvert. For information about the supported file formats, refer to Supported input codecs and containers in the MediaConvert User Guide.

Run a workflow

Use the following procedure to upload videos and to start an analysis.

  1. Sign in to the Content Localization on AWS web application and choose Upload.

  2. From the Upload Content page, drag and drop one or more media files into the upload box shown in Figure 6.

    
    Figure 6: Content Localization on AWS Upload Content page

  3. Choose Configure Workflow and select at least one language from the Target Languages list.

  4. Select a source language for your input from the Source Language drop down.

  5. (Optional) Select an Amazon Transcribe Custom Vocabulary from the Custom Vocabulary drop down.

  6. (Optional) Select one or more Amazon Translate Custom Terminologies from the Custom Terminologies box.

  7. (Optional) Select one or more Amazon Translate Parallel Data sets from the Parallel Data box.

  8. (Optional) Turn on additional analysis on the input. By default, only the operators required to automatically generate and translate subtitles are turned on. You can optionally select other analysis operators to run in the workflow by selecting the check boxes next to each operator. Operator descriptions are included in Guide to analysis operators.

    
    Figure: Content Localization on AWS Operator categories

  9. To start the analyses, choose Upload and Run Workflow. After the workflow has completed, a table appears below the Operator categories so that you can verify that the video was successfully uploaded and analyzed.

Work with the media collection 

When you run a workflow on a video through the web application, the solution creates an Asset ID for that video. The Asset ID identifies all of the metadata and other outputs that workflows create for the video. You can browse the collection of media assets in the Collection view.

Figure: Content Localization on AWS Media Collection page

Browse the collection of processed media assets

Sign in to the Content Localization on AWS web application and choose Collection.

The Media Collection table lists the following attributes for each asset in the media collection:

  • Thumbnail - a frame sampled from the input video to help visually identify it.

  • File Name - the name of the input video file. 

  • Status - status of the most recent workflow run on this asset.

  • Asset ID - the Media Insights on AWS data plane asset ID assigned to this input. This asset ID is used to retrieve all of the metadata and other outputs that were created for this video by running Media Insights on AWS workflows.

  • Created - the date and time the asset was created.

  • Actions - buttons to perform further actions on the asset.

Search the media collection

The Media Collection page contains a search bar that filters the collection to the media assets that contain specific search terms. The workflow operators analyze the media file, and the solution then indexes and catalogs the resulting metadata in an Amazon OpenSearch Service instance. You can search all aspects of a media file using the Apache Lucene query syntax, which Amazon OpenSearch Service supports. The following examples show common search patterns.

Use full-text queries

Full-text queries allow you to search for any type of data that exists in the video catalog. For example, the Amazon Rekognition celebrity detection service returns the full names of celebrities detected in a video. Figure 9 shows a search for a celebrity by first and last name. Figure 10 shows the search results.

Figure 9: Using the search function in the web application

Figure 10: Search results in the web application

Search high confidence data

After the analysis workflow completes, labels returned by Amazon Rekognition are assigned a confidence value, which you can use to filter search results. For example, Violence AND Confidence:>80 searches for videos containing violence with a confidence value greater than 80%. For details about Amazon Rekognition label detection metadata, refer to Detecting labels in a video.

Search data from individual operators

Searches query the metadata catalog in Amazon OpenSearch Service. For example, a search for the term violence would match videos containing the violence label from content moderation and would also match video transcripts that contain the word violence. You can restrict your search to content moderation results by using operator names, for example: Operator:content_moderation AND (Name:violence AND Confidence:>80).

You can use the following operator names to filter search queries:

  • label_detection

  • celebrity_detection

  • content_moderation

  • face_detection

  • transcribe

  • key_phrases

  • entities

  • webcaptions_<language-code>

You can also conduct compound searches using multiple operator names. For example, the following search query returns violence identified by content moderation and guns or weapons identified by label detection: (Operator:content_moderation AND Name:Violence AND Confidence:>80) OR (Operator:label_detection AND (Name:Gun OR Name:Weapon)).
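If you build search strings programmatically (for example, to paste into the search bar), a small helper can keep the parentheses and AND/OR operators consistent. This is a minimal sketch: the `Operator`, `Name`, and `Confidence` field names come from the examples above, while the function names are hypothetical and not part of the solution. It does not quote or escape values, so names containing spaces or Lucene special characters would need extra handling.

```python
from typing import Iterable, Optional


def operator_clause(
    operator: str,
    names: Iterable[str] = (),
    confidence_above: Optional[int] = None,
) -> str:
    """Build one parenthesized Lucene clause scoped to a single operator."""
    parts = [f"Operator:{operator}"]
    name_terms = [f"Name:{name}" for name in names]
    if len(name_terms) == 1:
        parts.append(name_terms[0])
    elif name_terms:
        # Several names for the same operator are alternatives, so OR them.
        parts.append("(" + " OR ".join(name_terms) + ")")
    if confidence_above is not None:
        # Confidence:>N is a strict "greater than" range filter.
        parts.append(f"Confidence:>{confidence_above}")
    return "(" + " AND ".join(parts) + ")"


def any_of(*clauses: str) -> str:
    """Combine clauses so that a match on any one of them is enough."""
    return " OR ".join(clauses)


query = any_of(
    operator_clause("content_moderation", names=["Violence"], confidence_above=80),
    operator_clause("label_detection", names=["Gun", "Weapon"]),
)
print(query)
# (Operator:content_moderation AND Name:Violence AND Confidence:>80) OR (Operator:label_detection AND (Name:Gun OR Name:Weapon))
```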

View source-language subtitles

The default workflow for this application includes automatically generated WebVTT and SRT subtitles from the Amazon Transcribe output. This part of the workflow can’t be deactivated. Subtitles can be edited interactively in the application and saved to invoke reprocessing of downstream operators in the workflow using the updated subtitles as input.

To view subtitles in the application:

  1. Run a workflow on a video to create an asset.

  2. Choose Collection.

  3. Locate the media file you want to analyze and under the Actions column, choose Analyze.

  4. From the Speech Recognition tab, review the subtitles in the Subtitles tab for your content.

    When you select a subtitle, the video advances to the location of the subtitle.

Figure: View subtitles in the Content Localization on AWS web application

To download WebVTT and SRT subtitles, choose Download Subtitles and select the format you want to download.
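The two download formats carry the same cues and differ mainly in layout: WebVTT starts with a WEBVTT header and uses a period before the milliseconds, while SRT numbers each cue and uses a comma. The sketch below illustrates that difference. It assumes each caption is a dictionary with `start` and `end` times in seconds and a `caption` string; those key names and the function names are assumptions for illustration, not a documented schema of the solution.

```python
def format_timestamp(seconds: float, millis_separator: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm ('.' for WebVTT, ',' for SRT)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{millis_separator}{millis:03d}"


def to_webvtt(captions: list) -> str:
    """Render captions as WebVTT: a header line, then one block per cue."""
    lines = ["WEBVTT", ""]
    for cue in captions:
        start = format_timestamp(cue["start"], ".")
        end = format_timestamp(cue["end"], ".")
        lines += [f"{start} --> {end}", cue["caption"], ""]
    return "\n".join(lines)


def to_srt(captions: list) -> str:
    """Render captions as SRT: numbered cues with comma-separated milliseconds."""
    lines = []
    for index, cue in enumerate(captions, start=1):
        start = format_timestamp(cue["start"], ",")
        end = format_timestamp(cue["end"], ",")
        lines += [str(index), f"{start} --> {end}", cue["caption"], ""]
    return "\n".join(lines)


# Hypothetical sample captions.
captions = [
    {"start": 0.5, "end": 2.0, "caption": "Hello and welcome."},
    {"start": 2.0, "end": 4.25, "caption": "Let's get started."},
]

print(to_webvtt(captions))
print(to_srt(captions))
```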

Edit source language subtitles

You can edit a subtitle's content, start time, and end time.

  1. After selecting your asset to analyze, from the Speech Recognition tab, choose the Subtitles tab.

  2. Select a subtitle. When you select a subtitle, the video advances to the location of the subtitle.

  3. Select the form box you want to edit for a specific subtitle (start time, end time, or subtitle content) and enter the new value.

  4. To save the changes you made to the subtitles, choose Save Edits at the bottom of the page. The solution saves the new subtitles to the Media Insights on AWS data plane (in WebCaptions format) and reprocesses any downstream operators for this asset that take the subtitles (WebCaptions) as an input.

Use corrections to source language subtitles to create an Amazon Transcribe custom vocabulary

You can use the corrections you make while editing subtitles to generate an Amazon Transcribe custom vocabulary that you can use in future workflows to improve the quality of Amazon Transcribe results for your content.

  1. After making edits in the Subtitles tab for your content, choose Save Custom Vocabulary at the bottom of the page.

  2. (Optional) To add to an existing vocabulary, use the radio button to select it before editing. This populates the vocabulary table in the form with the content of the existing vocabulary plus any edits you made to the subtitles.

  3. The form presents you with a table that is pre-populated with corrections that you made to the subtitles for the asset. You can further modify the rows in the table by selecting a cell and entering new values. You can also add and delete rows using the (+) and (-) buttons at the end of each row. The table contains the following values for each row:

    • Original phrase - the phrase that was in the automatically generated subtitles. This is provided for reference and can’t be edited in the form.

    • New phrase - the phrase that replaced the original phrase through the editor.

    • Sounds Like (optional) - The pronunciation of your word or phrase using the standard orthography of the language to mimic the way that the word sounds. For details, refer to Custom vocabularies in the Amazon Transcribe Developer Guide.

    • IPA (optional) - The pronunciation of your word or phrase using IPA characters. For details, refer to Custom vocabularies in the Amazon Transcribe Developer Guide.

    • Display As - Defines how the word or phrase looks when it's output. For details, refer to Custom vocabularies in the Amazon Transcribe Developer Guide.

    
    Figure: Save Vocabulary? dialog box in the Content Localization on AWS web application

  4. If you are creating a new vocabulary, fill in a name for the vocabulary in the Vocabulary Name field.

  5. Choose Save.
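For reference, the rows in this form map naturally onto the tab-separated table format that Amazon Transcribe accepts for custom vocabularies. The sketch below shows one way such a table could be assembled. It assumes the column header is Phrase, SoundsLike, IPA, DisplayAs and that multi-word phrases are joined with hyphens; confirm both against Custom vocabularies in the Amazon Transcribe Developer Guide. The dictionary keys, the function name, and the fallback of Display As to the new phrase are hypothetical choices for illustration and may differ from what the application does.

```python
import csv
import io

# Assumed column order for a Transcribe vocabulary table; verify in the Developer Guide.
COLUMNS = ["Phrase", "SoundsLike", "IPA", "DisplayAs"]


def build_vocabulary_table(rows: list) -> str:
    """Render correction rows as a tab-separated vocabulary table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        new_phrase = row["new_phrase"]
        writer.writerow([
            "-".join(new_phrase.split()),            # no spaces in the Phrase column
            row.get("sounds_like", ""),              # optional
            row.get("ipa", ""),                      # optional
            row.get("display_as", "") or new_phrase,  # assumption: default to the new phrase
        ])
    return buffer.getvalue()


# Hypothetical corrections collected while editing subtitles.
corrections = [
    {"original_phrase": "amazon poly", "new_phrase": "Amazon Polly"},
    {"original_phrase": "web vtt", "new_phrase": "WebVTT"},
]

print(build_vocabulary_table(corrections))
```

The original phrase is kept in each row only for reference, mirroring the read-only column in the form; it is not written to the table.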

View target-language subtitles 

The default workflow for this application automatically generates WebVTT and SRT subtitles for all target languages specified by the application. This part of the workflow can’t be deactivated. Subtitles can be edited interactively in the application and saved to invoke reprocessing of downstream operators in the workflow using the updated subtitles as input.

  1. Run a workflow on a video to create an asset.

  2. Choose Collection.

  3. Locate the media file you want to analyze and under the Actions column, choose Analyze.

  4. From the Speech Recognition tab, review the translated subtitles in the Translation tab for your content.

  5. Select the language you want to view. When you select a subtitle, the video advances to the location of the subtitle.

Figure: Target-language translation in the Content Localization on AWS web application

Download target-language WebVTT and SRT subtitles

  1. After analyzing your asset, choose the Translation tab for your content.

  2. Select the language you want to work with.

  3. Choose Download and select the format you want to download.

Download target-language audio file

The content localization workflow creates a synthesized audio output using Amazon Polly for each target language. This audio output is intended for audio-only use and, therefore, does not preserve the timing with the video content. If Amazon Polly does not support a target language, the processing is skipped and the download button is inactive for that language.

  1. After analyzing your asset, choose the Translation tab for your content.

  2. Select the language you want to work with.

  3. Choose Download and select Download Audio from the dropdown.

Edit target language subtitles

  1. After analyzing your asset, choose the Translation tab for your content.

  2. Select the target language you want to work with.

  3. Select a subtitle. When you select a subtitle, the video advances to the location of the subtitle.

  4. Select the form box you want to edit for a specific subtitle (start time, end time, or subtitle content) and enter the new value.

  5. To save the changes you made to the subtitles, choose Save Edits at the bottom of the page. The solution saves the new subtitles to the Media Insights on AWS data plane (in WebCaptions format) and reprocesses any downstream operators for this asset that take the translated subtitles (WebCaptions_<target-language>) as an input.

Use corrections to target language subtitles to create an Amazon Translate terminology

You can use the corrections you make while editing target-language subtitles to generate an Amazon Translate terminology that you can use in future workflows to customize the Amazon Translate results for your content.

  1. After making edits in the Translation tab for your content, choose Save Terminology at the bottom of the page.

  2. (Optional) To add to an existing terminology, use the radio button to select it before editing. You can combine existing terminologies by selecting multiple radio buttons. This populates the terminology table in the form with the content of the existing terminologies plus any edits you made to the subtitles.

  3. The form presents you with a table that is pre-populated with corrections you have made to the subtitles for the asset. You can further modify the rows in the table by selecting a cell and entering new values. Add and delete rows using the (+) and (-) buttons at the end of each row. Add and delete languages in the form using the Add Language and Remove Language buttons.

  4. The terminology table contains columns for the source language and for each target language specified in the workflow. Any target-language edits for a phrase are filled in automatically. For every other language, you must either fill in all of the cells you want to include in the terminology or remove the language from the table.

  5. Enter the name you want to use for this terminology. If you use the name of an existing terminology, it will be replaced with the contents from this form.

  6. Choose Save.

Figure: Content Localization on AWS web application Custom Terminology Editor dialog box
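The terminology table in the editor resembles the CSV layout that Amazon Translate accepts for custom terminologies: a header row of language codes followed by one row per term. The sketch below assembles such a file and rejects rows with empty cells, mirroring the requirement in step 4. The function name and sample terms are hypothetical, and the exact file the application produces may differ.

```python
import csv
import io


def build_terminology_csv(source_language: str, target_languages: list, entries: list) -> str:
    """Render terminology entries as CSV with one column per language code."""
    languages = [source_language, *target_languages]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(languages)
    for entry in entries:
        # Every language kept in the table needs a term for every row.
        missing = [lang for lang in languages if not entry.get(lang, "").strip()]
        if missing:
            raise ValueError(f"Entry {entry!r} has no term for: {', '.join(missing)}")
        writer.writerow([entry[lang] for lang in languages])
    return buffer.getvalue()


# Hypothetical sample terms keyed by language code.
entries = [
    {"en": "subtitle", "es": "subtítulo", "fr": "sous-titre"},
]

print(build_terminology_csv("en", ["es", "fr"], entries))
```

Raising an error on an incomplete row, rather than writing an empty cell, follows the same rule as the form: either complete the language column or drop the language.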

Create users for the application

If more than one person needs access to the web application, the solution administrator can set up additional users in Amazon Cognito.

  1. Sign in to the Amazon Cognito console.

  2. Choose Manage User Pools.

  3. In the Your User Pools page, select the name of the user pool containing the prefix MI.

  4. On the MieUserPool page, from the left navigation pane, choose Users and Groups.

  5. On the Users tab, choose Create user.

  6. In the Create user dialog box:

    1. Enter a username.

    2. Enter a temporary password. Verify that the options to send an invitation to the user and to verify the phone number and email are not selected.

    3. Choose Create user.

  7. On the MieUserPool page, under the Username column, select the user you just created.

    
    Figure: Amazon Cognito user pools page

  8. On the Users page, choose Add to group.

  9. In the Add user dialog box, access the drop-down list and select MieDevelopersGroup.

Figure: Amazon Cognito Add user to group dialog box

The user can now access the web application, upload media files, and run the analysis workflows.
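If you need to add several users, the same steps can be scripted against the Amazon Cognito API. The sketch below only builds the request parameters, so it runs without AWS access. It assumes the parameter names of the `AdminCreateUser` and `AdminAddUserToGroup` operations, with `MessageAction` set to `SUPPRESS` to match the console option of not sending an invitation. The function names, user pool ID, username, and password are hypothetical placeholders, and the group name in your deployment may differ from `MieDevelopersGroup`.

```python
def create_user_request(user_pool_id: str, username: str, temporary_password: str) -> dict:
    """Parameters for AdminCreateUser with the invitation message suppressed."""
    return {
        "UserPoolId": user_pool_id,
        "Username": username,
        "TemporaryPassword": temporary_password,
        "MessageAction": "SUPPRESS",  # do not send an invitation message
    }


def add_to_group_request(
    user_pool_id: str, username: str, group_name: str = "MieDevelopersGroup"
) -> dict:
    """Parameters for AdminAddUserToGroup."""
    return {
        "UserPoolId": user_pool_id,
        "Username": username,
        "GroupName": group_name,
    }


# Placeholder values; replace them with the values for your deployment.
pool_id = "us-east-1_EXAMPLE"
create_params = create_user_request(pool_id, "new.user@example.com", "TempPassw0rd!")
group_params = add_to_group_request(pool_id, "new.user@example.com")

# With boto3 installed and credentials configured, the calls could look like:
#   cognito = boto3.client("cognito-idp")
#   cognito.admin_create_user(**create_params)
#   cognito.admin_add_user_to_group(**group_params)
print(create_params)
print(group_params)
```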