Getting started (AWS CLI)
In this exercise, you use the AWS Command Line Interface (AWS CLI) to explore Amazon Personalize. You create a campaign that returns movie recommendations for a given user ID.
Before you start this exercise, do the following:
- Review the Getting started prerequisites.
- Set up the AWS CLI, as specified in Setting up the AWS CLI.
When you finish the getting started exercise, to avoid incurring unnecessary charges, follow the steps in Cleaning up resources to delete the resources you created.
Note
The AWS CLI commands in this exercise were tested on Linux. For information about using the AWS CLI commands on Windows, see Specifying parameter values for the AWS Command Line Interface in the AWS Command Line Interface User Guide.
Follow the steps to create a dataset group, add a dataset to the group, and then populate the dataset using the movie ratings data.
- Create a dataset group by running the following command. You can encrypt the dataset group by passing an AWS Key Management Service (AWS KMS) key ARN and the ARN of an IAM role that has access permissions to that key as input parameters. For more information about the API, see CreateDatasetGroup.
aws personalize create-dataset-group \
  --name MovieRatingDatasetGroup \
  --kms-key-arn arn:aws:kms:us-west-2:01234567890:key/1682a1e7-a94d-4d92-bbdf-837d3b62315e \
  --role-arn arn:aws:iam::01234567890:role/KMS-key-access
The dataset group ARN is displayed, for example:
{ "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieRatingDatasetGroup" }
Use the describe-dataset-group command to display the dataset group you created, specifying the returned dataset group ARN.
aws personalize describe-dataset-group \
  --dataset-group-arn arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieRatingDatasetGroup
The dataset group and its properties are displayed, for example:
{ "datasetGroup": { "name": "MovieRatingDatasetGroup", "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieRatingDatasetGroup", "status": "ACTIVE", "creationDateTime": 1542392161.262, "lastUpdatedDateTime": 1542396513.377 } }
Note
Wait until the dataset group's status shows as ACTIVE before creating a dataset in the group. This operation is usually quick.
If you don't remember the dataset group ARN, use the list-dataset-groups command to display all the dataset groups that you created, along with their ARNs.
aws personalize list-dataset-groups
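When an account holds several dataset groups, the full listing can be filtered on the client side. The following is a minimal sketch, not part of the original exercise: it uses the AWS CLI global --query option (a JMESPath expression) together with --output text. The function name find_dataset_group_arn is hypothetical, and the sketch assumes the response contains a datasetGroups array whose entries have name and datasetGroupArn fields.

```shell
# Hypothetical helper: print the ARN of the dataset group with the given name.
# Assumes the AWS CLI is configured for the Region that holds the dataset group.
find_dataset_group_arn() {
  local name="$1"
  # The JMESPath filter keeps only entries whose name matches, then selects the ARN.
  aws personalize list-dataset-groups \
    --query "datasetGroups[?name=='${name}'].datasetGroupArn" \
    --output text
}

# Usage:
#   find_dataset_group_arn MovieRatingDatasetGroup
```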
Note
The describe-object and list-objects commands are available for most Amazon Personalize objects, even where the remainder of this exercise does not show them.
- Create a schema file in JSON format by saving the following code to a file named MovieRatingSchema.json. The schema matches the headers you previously added to ratings.csv. The schema name is Interactions, which matches one of the types of datasets recognized by Amazon Personalize. For more information, see Schemas.
{ "type": "record", "name": "Interactions", "namespace": "com.amazonaws.personalize.schema", "fields": [ { "name": "USER_ID", "type": "string" }, { "name": "ITEM_ID", "type": "string" }, { "name": "TIMESTAMP", "type": "long" } ], "version": "1.0" }
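One way to save the schema without opening an editor is a shell here-document. This is only a sketch of that approach; it writes the same Interactions schema shown above to MovieRatingSchema.json in the current folder, overwriting any existing file of that name.

```shell
# Write the Interactions schema to MovieRatingSchema.json in the current folder.
# The quoted 'EOF' delimiter stops the shell from expanding anything inside the JSON.
cat > MovieRatingSchema.json <<'EOF'
{
  "type": "record",
  "name": "Interactions",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "USER_ID", "type": "string" },
    { "name": "ITEM_ID", "type": "string" },
    { "name": "TIMESTAMP", "type": "long" }
  ],
  "version": "1.0"
}
EOF
```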
- Create a schema by running the following command. Specify the file you saved in the previous step. The example shows the file as belonging to the current folder. For more information about the API, see CreateSchema.
aws personalize create-schema \
  --name MovieRatingSchema \
  --schema file://MovieRatingSchema.json
The schema Amazon Resource Name (ARN) is displayed, for example:
{ "schemaArn": "arn:aws:personalize:us-west-2:acct-id:schema/MovieRatingSchema" }
- Create an empty dataset by running the following command. Provide the dataset group ARN and schema ARN that were returned in the previous steps. The dataset-type must match the schema name (Interactions) defined in the schema file. For more information about the API, see CreateDataset.
aws personalize create-dataset \
  --name MovieRatingDataset \
  --dataset-group-arn arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieRatingDatasetGroup \
  --dataset-type Interactions \
  --schema-arn arn:aws:personalize:us-west-2:acct-id:schema/MovieRatingSchema
The dataset ARN is displayed, for example:
{ "datasetArn": "arn:aws:personalize:us-west-2:acct-id:dataset/MovieRatingDatasetGroup/INTERACTIONS" }
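Because each step feeds an ARN into the next one, it can be convenient to capture the ARNs in shell variables instead of copying them by hand. The sketch below assumes the same resource names as this exercise and relies on the AWS CLI global --query and --output text options to print only the ARN field. The function names create_schema_arn and create_dataset_arn are hypothetical.

```shell
# Hypothetical helpers: each prints only the ARN of the resource it creates.

create_schema_arn() {
  # Assumes MovieRatingSchema.json is in the current folder.
  aws personalize create-schema \
    --name MovieRatingSchema \
    --schema file://MovieRatingSchema.json \
    --query schemaArn --output text
}

create_dataset_arn() {
  # $1: dataset group ARN, $2: schema ARN
  aws personalize create-dataset \
    --name MovieRatingDataset \
    --dataset-group-arn "$1" \
    --dataset-type Interactions \
    --schema-arn "$2" \
    --query datasetArn --output text
}

# Usage (run after the dataset group is ACTIVE):
#   schema_arn="$(create_schema_arn)"
#   dataset_arn="$(create_dataset_arn "$dataset_group_arn" "$schema_arn")"
```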
- Add the training data to the dataset.
- Create a dataset import job by running the following command. Provide the dataset ARN that was returned in the previous step and the name of the Amazon S3 bucket that holds ratings.csv. Supply the AWS Identity and Access Management (IAM) role ARN you created in Creating an IAM role for Amazon Personalize. For more information about the API, see CreateDatasetImportJob.
aws personalize create-dataset-import-job \
  --job-name MovieRatingImportJob \
  --dataset-arn arn:aws:personalize:us-west-2:acct-id:dataset/MovieRatingDatasetGroup/INTERACTIONS \
  --data-source dataLocation=s3://bucketname/ratings.csv \
  --role-arn roleArn
The dataset import job ARN is displayed, for example:
{ "datasetImportJobArn": "arn:aws:personalize:us-west-2:acct-id:dataset-import-job/MovieRatingImportJob" }
- Check the status by using the describe-dataset-import-job command. Provide the dataset import job ARN that was returned in the previous step. For more information about the API, see DescribeDatasetImportJob.
aws personalize describe-dataset-import-job \
  --dataset-import-job-arn arn:aws:personalize:us-west-2:acct-id:dataset-import-job/MovieRatingImportJob
The properties of the dataset import job, including its status, are displayed. Initially, the status shows as CREATE PENDING, for example:
{ "datasetImportJob": { "jobName": "MovieRatingImportJob", "datasetImportJobArn": "arn:aws:personalize:us-west-2:acct-id:dataset-import-job/MovieRatingImportJob", "datasetArn": "arn:aws:personalize:us-west-2:acct-id:dataset/MovieRatingDatasetGroup/INTERACTIONS", "dataSource": { "dataLocation": "s3://<bucketname>/ratings.csv" }, "roleArn": "role-arn", "status": "CREATE PENDING", "creationDateTime": 1542392161.837, "lastUpdatedDateTime": 1542393013.377 } }
The dataset import is complete when the status shows as ACTIVE. Then you are ready to train the model using the specified dataset.
Note
Importing takes time. Wait until the dataset import is complete before training the model using the dataset.
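Rather than rerunning describe-dataset-import-job by hand, the wait can be scripted. The following is a minimal polling sketch, not an official waiter: the function name wait_for_import_job and the 60-second default interval are arbitrary choices, and the CREATE FAILED status is assumed to be the terminal failure value reported by the service.

```shell
# Hypothetical helper: poll a dataset import job until it finishes.
# $1: dataset import job ARN
# $2: seconds between polls (default 60, an arbitrary choice)
wait_for_import_job() {
  local arn="$1" interval="${2:-60}" status
  while true; do
    # --query/--output text reduce the response to the bare status string.
    status="$(aws personalize describe-dataset-import-job \
      --dataset-import-job-arn "$arn" \
      --query datasetImportJob.status --output text)" || return 1
    echo "status: ${status}"
    case "$status" in
      ACTIVE) return 0 ;;            # import complete
      "CREATE FAILED") return 1 ;;   # assumed terminal failure status
    esac
    sleep "$interval"
  done
}

# Usage:
#   wait_for_import_job arn:aws:personalize:us-west-2:acct-id:dataset-import-job/MovieRatingImportJob
```

The same loop shape could be adapted to the other describe commands in this exercise by changing the subcommand, the ARN option, and the query path.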
-
Two steps are required to initially train a model. First, you create the configuration for training the model using the CreateSolution operation. Second, you train the model using the CreateSolutionVersion operation.
You train a model using a recipe and your training data. Amazon Personalize provides a set of predefined recipes. For more information, see Choosing a recipe. For this exercise, you use the User-Personalization recipe.
- Create the configuration for training a model by running the following command.
aws personalize create-solution \
  --name MovieSolution \
  --dataset-group-arn arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieRatingDatasetGroup \
  --recipe-arn arn:aws:personalize:::recipe/aws-user-personalization
The solution ARN is displayed, for example:
{ "solutionArn": "arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution" }
- Check the create status using the describe-solution command. Provide the solution ARN that was returned in the previous step. For more information about the API, see DescribeSolution.
aws personalize describe-solution \
  --solution-arn arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution
The properties of the solution and the create status are displayed. Initially, the status shows as CREATE PENDING. The following example shows the response after the solution becomes ACTIVE:
{ "solution": { "name": "MovieSolution", "solutionArn": "arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution", "performHPO": false, "performAutoML": false, "recipeArn": "arn:aws:personalize:::recipe/aws-user-personalization", "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieRatingDatasetGroup", "solutionConfig": {}, "status": "ACTIVE", "creationDateTime": "2021-05-12T16:27:59.819000-07:00", "lastUpdatedDateTime": "2021-05-12T16:27:59.819000-07:00" } }
- When the solution is ACTIVE, train the model by running the following command.
aws personalize create-solution-version \
  --solution-arn arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution
The solution version ARN is displayed, for example:
{ "solutionVersionArn": "arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution/<version-id>" }
Check the training status of the solution version by using the describe-solution-version command. Provide the solution version ARN that was returned in the previous step. For more information about the API, see DescribeSolutionVersion.
aws personalize describe-solution-version \
  --solution-version-arn arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution/version-id
The properties of the solution version and the training status are displayed. Initially, the status shows as CREATE PENDING, for example:
{ "solutionVersion": { "solutionVersionArn": "arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution/<version-id>", ..., "status": "CREATE PENDING" } }
- When the solution version status is ACTIVE, the training is complete. Now you can review training metrics and create a campaign using the solution version.
Note
Training takes time. Wait until training is complete (the training status of the solution version shows as ACTIVE) before using this version of the solution in a campaign.
- You can validate the performance of the solution version by reviewing its metrics. Get the metrics for the solution version by running the following command. Provide the solution version ARN that was returned previously. For more information about the API, see GetSolutionMetrics.
aws personalize get-solution-metrics \
  --solution-version-arn arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution/version-id
A sample response is shown:
{ "solutionVersionArn": "arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution/<version-id>", "metrics": { "coverage": 0.0485, "mean_reciprocal_rank_at_25": 0.0381, "normalized_discounted_cumulative_gain_at_10": 0.0363, "normalized_discounted_cumulative_gain_at_25": 0.0984, "normalized_discounted_cumulative_gain_at_5": 0.0175, "precision_at_10": 0.0107, "precision_at_25": 0.0207, "precision_at_5": 0.0107 } }
Before you can get recommendations, you must deploy a solution version. Deploying a solution is also known as creating a campaign. Once you've created your campaign, your client application can get recommendations using the GetRecommendations API.
- Create a campaign by running the following command. Provide the solution version ARN that was returned in the previous step. For more information about the API, see CreateCampaign.
aws personalize create-campaign \
  --name MovieRecommendationCampaign \
  --solution-version-arn arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution/version-id \
  --min-provisioned-tps 1
A sample response is shown:
{ "campaignArn": "arn:aws:personalize:us-west-2:acct-id:campaign/MovieRecommendationCampaign" }
- Check the deployment status by running the following command. Provide the campaign ARN that was returned in the previous step. For more information about the API, see DescribeCampaign.
aws personalize describe-campaign \
  --campaign-arn arn:aws:personalize:us-west-2:acct-id:campaign/MovieRecommendationCampaign
A sample response is shown:
{ "campaign": { "name": "MovieRecommendationCampaign", "campaignArn": "arn:aws:personalize:us-west-2:acct-id:campaign/MovieRecommendationCampaign", "solutionVersionArn": "arn:aws:personalize:us-west-2:acct-id:solution/MovieSolution/<version-id>", "minProvisionedTPS": "1", "creationDateTime": 1543864775.923, "lastUpdatedDateTime": 1543864791.923, "status": "CREATE IN_PROGRESS" } }
Note
Wait until the status shows as ACTIVE before getting recommendations from the campaign.
Get recommendations by running the get-recommendations command. Provide the campaign ARN that was returned in the previous step. In the request, you specify a user ID from the movie ratings dataset. For more information about the API, see GetRecommendations.
Note
Not all recipes support the GetRecommendations API. For more information, see Choosing a recipe.
The AWS CLI command you call in this step, personalize-runtime, is different from the one used in previous steps.
aws personalize-runtime get-recommendations \
  --campaign-arn arn:aws:personalize:us-west-2:acct-id:campaign/MovieRecommendationCampaign \
  --user-id 123
In response, the campaign returns a list of item recommendations (movie IDs) the user might like. The list is sorted in descending order of relevance for the user.
{ "itemList": [ { "itemId": "14" }, { "itemId": "15" }, { "itemId": "275" }, { "itemId": "283" }, { "itemId": "273" }, ... ] }
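A client script usually needs only the item IDs, not the surrounding JSON. The sketch below shows one way to extract them, assuming the itemList response shape shown above. The function name recommend_item_ids and the default of 10 results are hypothetical choices. The --num-results option limits how many items the campaign returns.

```shell
# Hypothetical helper: print recommended item IDs for a user, one per line.
# $1: campaign ARN
# $2: user ID
# $3: number of results to request (default 10, an arbitrary choice)
recommend_item_ids() {
  # --output text prints the selected IDs tab-separated on one line;
  # tr turns the tabs into newlines so each ID is on its own line.
  aws personalize-runtime get-recommendations \
    --campaign-arn "$1" \
    --user-id "$2" \
    --num-results "${3:-10}" \
    --query 'itemList[].itemId' --output text | tr '\t' '\n'
}

# Usage:
#   recommend_item_ids arn:aws:personalize:us-west-2:acct-id:campaign/MovieRecommendationCampaign 123 5
```

The IDs arrive in the order the campaign returned them, so the first line is the most relevant item.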