Build a transcription app with authenticated users - AWS SDK for JavaScript


Build a transcription app with authenticated users

In this tutorial, you learn how to:

  • Implement authentication using an Amazon Cognito identity pool to accept users federated with an Amazon Cognito user pool.

  • Use Amazon Transcribe to transcribe and display voice recordings in the browser.

The scenario

The app enables users to sign up with a unique email and username. On confirmation of their email, they can record voice messages that are automatically transcribed and displayed in the app.

How it works

The app uses two Amazon S3 buckets, one to host the application code, and another to store transcriptions. The app uses an Amazon Cognito user pool to authenticate your users. Authenticated users have IAM permissions to access the required AWS services.

The first time a user signs in, the app creates a folder named after the user in the Amazon S3 bucket that stores transcriptions. When the user records a voice message, Amazon Transcribe transcribes it to text and saves the result as JSON in the user’s folder. When the user refreshes the app, their transcriptions are displayed and available for downloading or deletion.

The tutorial should take about 30 minutes to complete.

Steps

Prerequisites

  • Set up the project environment to run this Node.js JavaScript example, and install the required AWS SDK for JavaScript and third-party modules. Follow the instructions on GitHub.

  • Create a shared configurations file with your user credentials. For more information about providing a shared credentials file, see Shared config and credentials files in the AWS SDKs and Tools Reference Guide.

Important

This example uses ECMAScript 6 (ES6), which requires Node.js version 13.x or higher. To download and install the latest version of Node.js, see Node.js downloads.

However, if you prefer to use CommonJS syntax, please refer to JavaScript ES6/CommonJS syntax.

Create the AWS resources

This section describes how to provision AWS resources for this app using the AWS Cloud Development Kit (AWS CDK).

Note

The AWS CDK is a software development framework that enables you to define cloud application resources. For more information, see the AWS Cloud Development Kit (AWS CDK) Developer Guide.

To create resources for the app, use the template here on GitHub to create an AWS CDK stack using either the AWS Management Console or the AWS CLI. For instructions on how to modify the stack, or to delete the stack and its associated resources when you have finished the tutorial, see here on GitHub.

Note

The stack name must be unique within an AWS Region and AWS account. You can specify up to 128 characters, and numbers and hyphens are allowed.

The resulting stack automatically provisions the following resources.

  • An Amazon Cognito identity pool with an authenticated user role.

  • An IAM policy with permissions for Amazon S3 and Amazon Transcribe, attached to the authenticated user role.

  • An Amazon Cognito user pool that enables users to sign up and sign in to the app.

  • An Amazon S3 bucket to host the application files.

  • An Amazon S3 bucket to store the transcriptions.

    Important

    This Amazon S3 bucket allows READ (LIST) public access, which enables anyone to list the objects within the bucket and potentially misuse the information. If you do not delete this Amazon S3 bucket immediately after completing the tutorial, we highly recommend you comply with the Security Best Practices in Amazon S3 in the Amazon Simple Storage Service User Guide.

Create the HTML

Create an index.html file, and copy and paste the content below into it. The page features a panel of buttons for recording voice messages, and a table displaying the current user’s previously transcribed messages. The script tag at the end of the body element loads main.js, which contains all the browser script for the app. You create main.js using Webpack, as described in the following section of this tutorial.

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>title</title>
    <link rel="stylesheet" type="text/css" href="recorder.css">
    <style>
      table, td {
        border: 1px solid black;
      }
    </style>
  </head>
  <body>
    <h2>Record</h2>
    <p>
      <button id="record" onclick="startRecord()"></button>
      <button id="stopRecord" disabled onclick="stopRecord()">Stop</button>
      <p id="demo" style="visibility: hidden;"></p>
    </p>
    <p>
      <audio id="recordedAudio"></audio>
    </p>
    <h2>My transcriptions</h2>
    <table id="myTable1" style ="width:678px;">
    </table>
    <table id="myTable" style ="width:678px;">
      <tr>
        <td style = "font-weight:bold">Time created</td>
        <td style = "font-weight:bold">Transcription</td>
        <td style = "font-weight:bold">Download</td>
        <td style = "font-weight:bold">Delete</td>
      </tr>
    </table>
    <script type="text/javascript" src="./main.js"></script>
  </body>
</html>

This code example is available here on GitHub.

Prepare the browser script

There are three files, index.js, recorder.js, and helper.js, which you are required to bundle into a single main.js using Webpack. This section describes in detail only the functions in index.js that use the SDK for JavaScript, which is available here on GitHub.

Note

recorder.js and helper.js are required but, because they do not contain Node.js code, they are explained only in their inline comments, here and here respectively, on GitHub.

First, define the parameters. COGNITO_ID is the endpoint for the Amazon Cognito user pool you created in the Create the AWS resources topic of this tutorial. It has the format cognito-idp.AWS_REGION.amazonaws.com/USER_POOL_ID. The user's ID token is extracted from the app URL by the getToken function in the helper.js file. This token is passed to the loginData variable, which provides the Amazon Transcribe and Amazon S3 client objects with logins. Replace "REGION" with the AWS Region, and "BUCKET" with the name of the Amazon S3 bucket that stores the transcriptions. Replace "IDENTITY_POOL_ID" with the IdentityPoolId from the Sample page of the Amazon Cognito identity pool you created for this example. This is also passed to each client object.

// Import the required AWS SDK clients and commands for Node.js
import "./helper.js";
import "./recorder.js";
import { CognitoIdentityClient } from "@aws-sdk/client-cognito-identity";
import { fromCognitoIdentityPool } from "@aws-sdk/credential-provider-cognito-identity";
import {
  CognitoIdentityProviderClient,
  GetUserCommand,
} from "@aws-sdk/client-cognito-identity-provider";
import { S3RequestPresigner } from "@aws-sdk/s3-request-presigner";
import { createRequest } from "@aws-sdk/util-create-request";
import { formatUrl } from "@aws-sdk/util-format-url";
import {
  TranscribeClient,
  StartTranscriptionJobCommand,
} from "@aws-sdk/client-transcribe";
import {
  S3Client,
  PutObjectCommand,
  GetObjectCommand,
  ListObjectsCommand,
  DeleteObjectCommand,
} from "@aws-sdk/client-s3";
import fetch from "node-fetch";

// Set the parameters.
// 'COGNITO_ID' has the format 'cognito-idp.AWS_REGION.amazonaws.com/USER_POOL_ID'.
let COGNITO_ID = "COGNITO_ID";
// Get the Amazon Cognito ID token for the user. 'getToken()' is in 'helper.js'.
let idToken = getToken();
let loginData = {
  [COGNITO_ID]: idToken,
};

const params = {
  Bucket: "BUCKET", // The Amazon Simple Storage Service (Amazon S3) bucket to store the transcriptions.
  Region: "REGION", // The AWS Region.
  identityPoolID: "IDENTITY_POOL_ID", // Amazon Cognito identity pool ID.
};

// Create an Amazon Transcribe service client object.
const client = new TranscribeClient({
  region: params.Region,
  credentials: fromCognitoIdentityPool({
    client: new CognitoIdentityClient({ region: params.Region }),
    identityPoolId: params.identityPoolID,
    logins: loginData,
  }),
});

// Create an Amazon S3 client object.
const s3Client = new S3Client({
  region: params.Region,
  credentials: fromCognitoIdentityPool({
    client: new CognitoIdentityClient({ region: params.Region }),
    identityPoolId: params.identityPoolID,
    logins: loginData,
  }),
});
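The key of the logins map is the user pool provider name described above, and the value is the user's ID token. As a minimal sketch of how that map is assembled, the following uses a hypothetical helper, buildLogins, which is not part of the tutorial code; the Region, user pool ID, and token shown are placeholder values.

```javascript
// Hypothetical helper (not in the tutorial's index.js): build the `logins`
// map that is passed to fromCognitoIdentityPool.
// Key: the user pool provider name. Value: the user's ID token.
function buildLogins(region, userPoolId, idToken) {
  if (!region || !userPoolId || !idToken) {
    throw new Error("region, userPoolId, and idToken are all required");
  }
  const providerName = `cognito-idp.${region}.amazonaws.com/${userPoolId}`;
  return { [providerName]: idToken };
}

// Placeholder values only; substitute your own Region, user pool ID, and token.
const logins = buildLogins("eu-west-1", "eu-west-1_EXAMPLE", "EXAMPLE_ID_TOKEN");
console.log(Object.keys(logins)[0]);
```

Failing early on a missing value can make a misconfigured COGNITO_ID easier to spot than a later authorization error from the identity pool.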

When the HTML page loads, the window.onload handler creates a folder with the user's name in the Amazon S3 bucket if it's the first time they've signed in to the app. If not, it updates the user interface with any transcripts from the user's previous sessions.

window.onload = async () => {
  // Set the parameters.
  const userParams = {
    // Get the access token. 'getAccessToken()' is in 'helper.js'.
    AccessToken: getAccessToken(),
  };
  // Create a CognitoIdentityProviderClient client object.
  const client = new CognitoIdentityProviderClient({ region: params.Region });
  try {
    const data = await client.send(new GetUserCommand(userParams));
    const username = data.Username;
    // Export username for use in 'recorder.js'.
    exports.username = username;
    // If this is the user's first sign-in, create a folder with the user's name
    // in the Amazon S3 bucket. Otherwise, this has no effect.
    const Key = `${username}/`;
    try {
      await s3Client.send(
        new PutObjectCommand({ Key: Key, Bucket: params.Bucket })
      );
      console.log("Folder created for user ", username);
    } catch (err) {
      console.log("Error", err);
    }
    try {
      // Get a list of the objects in the Amazon S3 bucket.
      const data = await s3Client.send(
        new ListObjectsCommand({ Bucket: params.Bucket, Prefix: username })
      );
      // 'Contents' is undefined when no objects match the prefix.
      const output = data.Contents || [];
      // Loop through the objects, populating a row on the user interface for each object.
      for (let i = 0; i < output.length; i++) {
        const objectParams = {
          Bucket: params.Bucket,
          Key: output[i].Key,
        };
        // Get the object from the Amazon S3 bucket.
        const data = await s3Client.send(new GetObjectCommand(objectParams));
        // Extract the body contents, a readable stream, from the returned data.
        const result = data.Body;
        // Create a variable for the string version of the readable stream.
        let stringResult = "";
        // Use 'yieldUint8Chunks' to read the stream into a string.
        // The stream has been fully read when this loop ends.
        for await (const chunk of yieldUint8Chunks(result)) {
          stringResult += String.fromCharCode.apply(null, chunk);
        }
        try {
          const parsed = JSON.parse(stringResult);
          // Parse JSON into human readable transcript, which will be displayed on user interface (UI).
          const outputJSON = parsed.results.transcripts[0].transcript;
          // Create name for transcript, which will be displayed.
          const outputJSONTime = parsed.jobName.split("/")[0].replace("-job", "");
          // Display the details for the transcription on the UI.
          // 'displayTranscriptionDetails()' is in 'helper.js'.
          displayTranscriptionDetails(
            i + 1,
            outputJSONTime,
            objectParams.Key,
            outputJSON
          );
        } catch (err) {
          // Objects that are not transcription JSON, such as the folder placeholder, are skipped.
          console.log("Skipping object that is not a transcript", objectParams.Key);
        }
      }
    } catch (err) {
      console.log("Error", err);
    }
  } catch (err) {
    console.log("Error", err);
  }
};

// Convert readable streams.
async function* yieldUint8Chunks(data) {
  const reader = data.getReader();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) return;
      yield value;
    }
  } finally {
    reader.releaseLock();
  }
}
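The two steps in the loop above, turning byte chunks into a string and pulling the displayed fields out of the transcription JSON, can be sketched in isolation. This is a minimal sketch, not the tutorial's code: decodeChunks and parseTranscript are hypothetical names, and the sample object contains only the two fields the tutorial reads (jobName and results.transcripts[0].transcript). The sketch uses TextDecoder instead of String.fromCharCode so that a multi-byte UTF-8 character split across two chunks is still decoded correctly.

```javascript
// Hypothetical helper: decode UTF-8 byte chunks into one string.
// `{ stream: true }` lets the decoder carry an incomplete multi-byte
// character over to the next chunk.
function decodeChunks(chunks) {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode(); // Flush any bytes still buffered.
}

// Hypothetical helper: extract the label and transcript shown in the UI.
// Returns null when the text is not a transcription result, for example
// the empty body of the folder placeholder object.
function parseTranscript(jsonText) {
  let parsed;
  try {
    parsed = JSON.parse(jsonText);
  } catch {
    return null;
  }
  const transcript = parsed?.results?.transcripts?.[0]?.transcript;
  if (typeof transcript !== "string" || typeof parsed.jobName !== "string") {
    return null;
  }
  return {
    label: parsed.jobName.split("/")[0].replace("-job", ""),
    transcript,
  };
}

// Sample data (made up), split in the middle of the two-byte character "é".
const sample = JSON.stringify({
  jobName: "2024-1-5-time-9-3-7-job",
  results: { transcripts: [{ transcript: "Héllo world." }] },
});
const bytes = new TextEncoder().encode(sample);
const splitAt = bytes.indexOf(0xc3) + 1; // 0xC3 is the first byte of "é".
const chunks = [bytes.slice(0, splitAt), bytes.slice(splitAt)];

console.log(parseTranscript(decodeChunks(chunks)));
```

Returning null for anything that is not a transcript lets the caller skip such objects instead of failing on the first one.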

When the user records a voice message for transcription, the upload function uploads the recording to the Amazon S3 bucket. This function is called from the recorder.js file.

// Upload recordings to Amazon S3 bucket
window.upload = async function (blob, userName) {
  // Set the parameters for the recording.
  const Key = `${userName}/test-object-${Math.ceil(Math.random() * 10 ** 10)}`;
  let signedUrl;

  // Create a presigned URL to upload the recording to the Amazon S3 bucket.
  try {
    // Create an Amazon S3RequestPresigner object.
    const signer = new S3RequestPresigner({ ...s3Client.config });
    // Create the request.
    const request = await createRequest(
      s3Client,
      new PutObjectCommand({ Key, Bucket: params.Bucket })
    );
    // Define the duration until expiration of the presigned URL.
    const expiration = new Date(Date.now() + 60 * 60 * 1000);
    // Create and format the presigned URL.
    signedUrl = formatUrl(await signer.presign(request, expiration));
    console.log(`\nPutting "${Key}"`);
  } catch (err) {
    console.log("Error creating presigned URL", err);
    // Without a presigned URL there is nothing to upload to.
    return;
  }
  try {
    // Upload the object to the Amazon S3 bucket using a presigned URL.
    const response = await fetch(signedUrl, {
      method: "PUT",
      headers: {
        "content-type": "application/octet-stream",
      },
      body: blob,
    });
    // Create the transcription job name. In this case, it's the current date and time.
    const today = new Date();
    const date =
      today.getFullYear() +
      "-" +
      (today.getMonth() + 1) +
      "-" +
      today.getDate();
    const time =
      today.getHours() + "-" + today.getMinutes() + "-" + today.getSeconds();
    const jobName = date + "-time-" + time;

    // Call the "createTranscriptionJob()" function.
    createTranscriptionJob(
      "s3://" + params.Bucket + "/" + Key,
      jobName,
      params.Bucket,
      Key
    );
  } catch (err) {
    console.log("Error uploading object", err);
  }
};

// Create the Amazon Transcribe transcription job.
const createTranscriptionJob = async (recording, jobName, bucket, key) => {
  // Set the parameters for the transcription job.
  const jobParams = {
    TranscriptionJobName: jobName + "-job",
    LanguageCode: "en-US", // For example, 'en-US'.
    OutputBucketName: bucket,
    OutputKey: key,
    Media: {
      MediaFileUri: recording,
      // For example, "https://transcribe-demo.s3-REGION.amazonaws.com/hello_world.wav"
    },
  };
  try {
    // Start the transcription job.
    const data = await client.send(new StartTranscriptionJobCommand(jobParams));
    console.log("Success - transcription submitted", data);
  } catch (err) {
    console.log("Error", err);
  }
};
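The job name built above concatenates unpadded date and time parts, so names vary in length and do not sort chronologically. The following is a hedged sketch of a zero-padded variant; buildJobName and isValidJobName are hypothetical helpers, not part of the tutorial. The validity check assumes that Amazon Transcribe job names are limited to 200 characters drawn from letters, digits, periods, underscores, and hyphens; confirm this against the StartTranscriptionJob API reference before relying on it.

```javascript
// Pad a number to two digits, for example 7 -> "07".
const pad = (n) => String(n).padStart(2, "0");

// Hypothetical helper: build a job name from a date, using local time
// as the tutorial code does, and append the "-job" suffix.
function buildJobName(now = new Date()) {
  const date = `${now.getFullYear()}-${pad(now.getMonth() + 1)}-${pad(now.getDate())}`;
  const time = `${pad(now.getHours())}-${pad(now.getMinutes())}-${pad(now.getSeconds())}`;
  return `${date}-time-${time}-job`;
}

// Hypothetical helper: check a name against the assumed naming constraint.
function isValidJobName(name) {
  return name.length >= 1 && name.length <= 200 && /^[0-9a-zA-Z._-]+$/.test(name);
}

const exampleName = buildJobName(new Date(2024, 0, 5, 9, 3, 7));
console.log(exampleName, isValidJobName(exampleName));
```

Because the name has a resolution of one second, two recordings submitted within the same second would produce the same name; appending a random suffix is one way to reduce that risk.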

deleteJSON deletes an existing transcription from the Amazon S3 bucket, and deleteRow deletes the corresponding row from the user interface. Both are triggered by the Delete button on the user interface.

// Delete a transcription from the Amazon S3 bucket.
window.deleteJSON = async (jsonFileName) => {
  try {
    await s3Client.send(
      new DeleteObjectCommand({
        Bucket: params.Bucket,
        Key: jsonFileName,
      })
    );
    console.log("Success - JSON deleted");
  } catch (err) {
    console.log("Error", err);
  }
};

// Delete a row from the user interface.
window.deleteRow = function (rowid) {
  const row = document.getElementById(rowid);
  row.parentNode.removeChild(row);
};

Finally, run the following at the command prompt to bundle the JavaScript for this example in a file named main.js:

webpack index.js --mode development --target web --devtool false -o main.js

Note

For information about installing webpack, see Bundle applications with webpack.

Run the app

You can view the app at a URL with the following format.

DOMAIN/login?client_id=APP_CLIENT_ID&response_type=token&scope=aws.cognito.signin.user.admin+email+openid+phone+profile&redirect_uri=REDIRECT_URL

Amazon Cognito makes it easy to run the app by providing a link in the AWS Management Console. Navigate to the App client settings of your Amazon Cognito user pool, and choose Launch Hosted UI.
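If you prefer to assemble the URL yourself, the following sketch builds it from its parts. buildLoginUrl is a hypothetical helper, and the domain, client ID, and redirect URL shown are placeholders. URLSearchParams encodes the spaces between scopes as "+", which matches the format shown above, and percent-encodes the redirect URL.

```javascript
// The scopes requested in the URL format shown above.
const SCOPES = ["aws.cognito.signin.user.admin", "email", "openid", "phone", "profile"];

// Hypothetical helper: build the Hosted UI login URL.
// `domain` must include the scheme, for example "https://...".
function buildLoginUrl({ domain, clientId, redirectUri }) {
  const url = new URL("/login", domain);
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("response_type", "token");
  url.searchParams.set("scope", SCOPES.join(" "));
  url.searchParams.set("redirect_uri", redirectUri);
  return url.toString();
}

// Placeholder values only.
const loginUrl = buildLoginUrl({
  domain: "https://example.auth.eu-west-1.amazoncognito.com",
  clientId: "APP_CLIENT_ID",
  redirectUri: "https://example.com/index.html",
});
console.log(loginUrl);
```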

Important

The Hosted UI defaults to a response type of 'code'. However, this tutorial is designed for the 'token' response type, so you have to change response_type=code to response_type=token in the URL.
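The response type matters because, with 'token', Amazon Cognito returns the tokens in the fragment of the redirect URL (the part after "#"), which is where the app reads them. The following is a hypothetical sketch of that extraction; it is not the code of getToken or getAccessToken in helper.js, and the fragment values are made up.

```javascript
// Hypothetical helper: read the ID and access tokens from a URL fragment
// such as window.location.hash.
function parseTokensFromHash(hash) {
  const fragment = hash.startsWith("#") ? hash.slice(1) : hash;
  const fields = new URLSearchParams(fragment);
  return {
    idToken: fields.get("id_token"), // null if absent
    accessToken: fields.get("access_token"), // null if absent
  };
}

// Made-up fragment in the shape returned for the 'token' response type.
const tokens = parseTokensFromHash(
  "#id_token=EXAMPLE.ID.TOKEN&access_token=EXAMPLE.ACCESS.TOKEN&expires_in=3600&token_type=Bearer"
);
console.log(tokens.idToken !== null && tokens.accessToken !== null);
```

With the 'code' response type the redirect URL carries an authorization code instead, so both fields in this sketch would be null.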

Delete the AWS resources

When you finish the tutorial, you should delete the resources so you do not incur any unnecessary charges. Because you added content to both Amazon S3 buckets, you must delete them manually. Then you can delete the remaining resources using either the AWS Management Console or the AWS CLI. For instructions on how to modify the stack, or to delete the stack and its associated resources when you have finished the tutorial, see here on GitHub.