Amazon Personalize
Developer Guide

This is prerelease documentation for a service in preview release. It is subject to change.

We made breaking changes to the Amazon Personalize API and service model on 3/26/19. To continue using Amazon Personalize with the AWS Command Line Interface or AWS SDK for Python (Boto 3), update your service JSON files by doing steps 3-6 of Setting Up the AWS CLI.

Recording Events

Amazon Personalize can make recommendations based purely on historical data that you import, as demonstrated in the Getting Started guides. Using the Amazon Personalize event ingestion SDK, it can also make recommendations based purely on real-time clickstream data, or on a combination of both.

Unlike historical data, event data that is recorded after a campaign is created is used automatically when you get recommendations from that campaign.

Note

A minimum of 1000 records of combined interaction data and at least 25 unique users are required to train a model.
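
The minimums in the note above can be checked locally before you train. The following is a minimal sketch, assuming the interaction records are available as (user_id, item_id, timestamp) tuples. The function name meets_training_minimums and the tuple layout are illustrative and are not part of the Amazon Personalize API.

```python
MIN_RECORDS = 1000      # minimum combined interaction records (from the note above)
MIN_UNIQUE_USERS = 25   # minimum distinct users (from the note above)


def meets_training_minimums(interactions):
    """Return True if the records satisfy both documented minimums.

    `interactions` is an iterable of (user_id, item_id, timestamp) tuples.
    This layout is an assumption made for the example.
    """
    records = list(interactions)
    unique_users = {user_id for user_id, _item_id, _timestamp in records}
    return len(records) >= MIN_RECORDS and len(unique_users) >= MIN_UNIQUE_USERS
```

The check counts historical and recorded interactions together, so pass in the combined records if you use both sources.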

The event ingestion SDK includes a JavaScript library for recording events from web client applications. The SDK also includes a library for recording events in server code.

To record events, you need the following:

  • A dataset group, which can be empty.

  • An event tracker with the appropriate AWS Identity and Access Management (IAM) permissions. The IAM role allows Amazon CloudWatch to calculate the clickthrough rate for the events.

  • A call to the PutEvents operation.

Creating a Dataset Group

If you went through the Getting Started guide, you can use the dataset group that you created there, or you can create a new dataset group as shown below. The dataset group can be empty or can contain any of the user-defined datasets. For more information, see Datasets and Dataset Groups.

Python

    import boto3

    personalize = boto3.client('personalize')

    response = personalize.create_dataset_group(name='MovieClickGroup')
    print(response['datasetGroupArn'])

AWS CLI

    aws personalize create-dataset-group --name MovieClickGroup

The dataset group Amazon Resource Name (ARN) is displayed, for example:

    {
      "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup"
    }

Getting a Tracking ID

A tracking ID associates an event with a dataset group and authorizes you to send data to Amazon Personalize. You generate a tracking ID by calling the CreateEventTracker API. You supply the dataset group ARN and the ARN of the IAM role that you created in Creating an IAM Role.

Note

Only one event tracker can be associated with a dataset group. You will get an error if you call CreateEventTracker using the same dataset group as an existing event tracker.

Python

    import boto3

    personalize = boto3.client('personalize')

    response = personalize.create_event_tracker(
        name='MovieClickTracker',
        datasetGroupArn='arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup',
        roleArn='role-arn'
    )
    print(response['eventTrackerArn'])
    print(response['trackingId'])

AWS CLI

    aws personalize create-event-tracker \
        --name MovieClickTracker \
        --dataset-group-arn arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup \
        --role-arn role-arn

The event tracker ARN and tracking ID are displayed, for example:

    {
      "eventTrackerArn": "arn:aws:personalize:us-west-2:acct-id:event-tracker/MovieClickTracker",
      "trackingId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    }
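
Because only one event tracker can be associated with a dataset group, it can help to look for an existing tracker before calling CreateEventTracker. The following is a minimal sketch of that check, not a definitive implementation. It assumes that the list_event_trackers and describe_event_tracker operations of the boto3 personalize client are available in your service model and return the eventTrackers and eventTracker keys used here. The helper name get_or_create_event_tracker is hypothetical.

```python
def get_or_create_event_tracker(personalize, name, dataset_group_arn, role_arn):
    """Return (event_tracker_arn, tracking_id) for the dataset group.

    `personalize` is a client such as boto3.client('personalize').
    """
    # Look for an event tracker already associated with the dataset group.
    listed = personalize.list_event_trackers(datasetGroupArn=dataset_group_arn)
    existing = listed.get('eventTrackers', [])
    if existing:
        arn = existing[0]['eventTrackerArn']
        # Assumption: the tracking ID is returned by describe_event_tracker.
        described = personalize.describe_event_tracker(eventTrackerArn=arn)
        return arn, described['eventTracker']['trackingId']

    # No tracker yet: create one with the same parameters as the example above.
    created = personalize.create_event_tracker(
        name=name,
        datasetGroupArn=dataset_group_arn,
        roleArn=role_arn,
    )
    return created['eventTrackerArn'], created['trackingId']
```

To use the sketch, pass a client created with boto3.client('personalize') together with the dataset group ARN and the role ARN from the preceding steps.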

Event-Interactions Dataset

When Amazon Personalize creates an event tracker, it also creates an event-interactions dataset in the dataset group associated with the event tracker. The event-interactions dataset stores the event data from the PutEvents call. The contents of the dataset are not available to the user.

To view the properties of the dataset, call the ListDatasets API, supplying the dataset group ARN. For additional information about the dataset, use the dataset ARN for the EVENT_INTERACTIONS dataset to call the DescribeDataset API. The following is an example response from ListDatasets:

    {
      "datasets": [
        {
          "name": "ratings-dsgroup/EVENT_INTERACTIONS",
          "datasetArn": "arn:aws:personalize:us-west-2:acct-id:dataset/MovieClickGroup/EVENT_INTERACTIONS",
          "datasetType": "EVENT_INTERACTIONS",
          "status": "ACTIVE",
          "creationDateTime": 1554304597.806,
          "lastUpdatedDateTime": 1554304597.806
        },
        {
          "name": "ratings-dataset",
          "datasetArn": "arn:aws:personalize:us-west-2:acct-id:dataset/MovieClickGroup/INTERACTIONS",
          "datasetType": "INTERACTIONS",
          "status": "ACTIVE",
          "creationDateTime": 1554299406.53,
          "lastUpdatedDateTime": 1554299406.53
        }
      ],
      "nextToken": "..."
    }
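
To pick the event-interactions dataset out of a ListDatasets response, you can filter on the datasetType field. The following is a minimal sketch that works on a response dictionary shaped like the example above. The function name find_event_interactions_arn is hypothetical, and the sample dictionary is a trimmed copy of the example response.

```python
def find_event_interactions_arn(list_datasets_response):
    """Return the ARN of the EVENT_INTERACTIONS dataset, or None if absent."""
    for dataset in list_datasets_response.get('datasets', []):
        if dataset.get('datasetType') == 'EVENT_INTERACTIONS':
            return dataset['datasetArn']
    return None


# Trimmed copy of the example ListDatasets response.
example_response = {
    'datasets': [
        {
            'name': 'ratings-dsgroup/EVENT_INTERACTIONS',
            'datasetArn': 'arn:aws:personalize:us-west-2:acct-id:dataset/MovieClickGroup/EVENT_INTERACTIONS',
            'datasetType': 'EVENT_INTERACTIONS',
        },
        {
            'name': 'ratings-dataset',
            'datasetArn': 'arn:aws:personalize:us-west-2:acct-id:dataset/MovieClickGroup/INTERACTIONS',
            'datasetType': 'INTERACTIONS',
        },
    ]
}

event_dataset_arn = find_event_interactions_arn(example_response)
```

The returned ARN is the value to supply to DescribeDataset. With boto3, the response would come from personalize.list_datasets(datasetGroupArn=...), and the follow-up call would be personalize.describe_dataset(datasetArn=event_dataset_arn).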

PutEvents Operation

To record events, you call the PutEvents operation. The following example shows a PutEvents call that passes one event that contains the minimum required information. The corresponding Interactions schema is shown, along with an example row from the Interactions dataset.

The session ID is custom to your application. The event list is an array of Event objects. The properties key is a string map (key-value pairs) of event-specific data. In this case, just the item ID is specified.

The userId, itemId, and sentAt parameters map to the USER_ID, ITEM_ID, and TIMESTAMP fields of a corresponding historical Interactions dataset. For more information, see Datasets and Schemas.
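
The mapping described above can be sketched as a small helper that turns one Interactions row into the keyword arguments for put_events. This is an illustration under stated assumptions: the row is a dictionary keyed by the schema field names, the TIMESTAMP value is in Unix epoch seconds, and the function name row_to_put_events_args is hypothetical.

```python
import json


def row_to_put_events_args(tracking_id, session_id, event_type, row):
    """Build keyword arguments for put_events from one Interactions row.

    `row` is a dict with the schema fields USER_ID, ITEM_ID, and TIMESTAMP.
    """
    return {
        'trackingId': tracking_id,
        'userId': row['USER_ID'],        # USER_ID maps to userId
        'sessionId': session_id,         # custom to your application
        'eventList': [{
            'sentAt': row['TIMESTAMP'],  # TIMESTAMP maps to sentAt
            'eventType': event_type,
            # properties is a JSON string; ITEM_ID maps to the itemId key.
            'properties': json.dumps({'itemId': row['ITEM_ID']}),
        }],
    }
```

With a personalize-events client, the result can be passed directly, for example personalize_events.put_events(**row_to_put_events_args(...)).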

Note

You can also use AWS Amplify to send event data to Amazon Personalize. For more information, see Analytics.

Interactions schema: USER_ID, ITEM_ID, TIMESTAMP
Interactions dataset: user123, item-xyz, 1543631760

Python

    import boto3

    personalize_events = boto3.client(service_name='personalize-events')

    # TIMESTAMP is a placeholder: replace it with the event time in Unix
    # epoch seconds. The other uppercase and lowercase strings are
    # placeholders for your own values.
    personalize_events.put_events(
        trackingId='tracking_id',
        userId='USER_ID',
        sessionId='session_id',
        eventList=[{
            'sentAt': TIMESTAMP,
            'eventType': 'EVENT_TYPE',
            'properties': "{\"itemId\": \"ITEM_ID\"}"
        }]
    )

AWS CLI

    aws personalize-events put-events \
        --tracking-id tracking_id \
        --user-id USER_ID \
        --session-id session_id \
        --event-list '[{
            "sentAt": "TIMESTAMP",
            "eventType": "EVENT_TYPE",
            "properties": "{\"itemId\": \"ITEM_ID\"}"
          }]'

In the previous example, a model would be trained on the fact that an event occurred, not on a value associated with the event, because no event value was included. The next example shows how to submit data so that training uses the event value. It also passes multiple events of different types ('like' and 'rating'). In this case, you must specify the event type to train on in the CreateSolution operation (see Creating an Events Solution). The example also records an extra property that certain recipes use as metadata.

Interactions schema: USER_ID, ITEM_ID, TIMESTAMP, EVENT_TYPE, EVENT_VALUE, NUM_RATINGS
Interactions dataset:
    user123, movie_xyz, 1543531139, rating, 5, 12
    user321, choc-ghana, 1543531760, like, true
    user111, choc-fake, 1543557118, like, false

Python

    import boto3
    import json

    personalize_events = boto3.client(service_name='personalize-events')

    personalize_events.put_events(
        trackingId='tracking_id',
        userId='user555',
        sessionId='session1',
        eventList=[{
            'eventId': 'event1',
            'sentAt': '1553631760',
            'eventType': 'like',
            'properties': json.dumps({
                'itemId': 'choc-panama',
                'eventValue': 'true'
            })
        }, {
            'eventId': 'event2',
            'sentAt': '1553631782',
            'eventType': 'rating',
            'properties': json.dumps({
                'itemId': 'movie_ten',
                'eventValue': '4',
                'numRatings': '13'
            })
        }]
    )

AWS CLI

    aws personalize-events put-events \
        --tracking-id tracking_id \
        --user-id user555 \
        --session-id session1 \
        --event-list '[{
            "eventId": "event1",
            "sentAt": "1553631760",
            "eventType": "like",
            "properties": "{\"itemId\": \"choc-panama\", \"eventValue\": \"true\"}"
          }, {
            "eventId": "event2",
            "sentAt": "1553631782",
            "eventType": "rating",
            "properties": "{\"itemId\": \"movie_ten\", \"eventValue\": \"4\", \"numRatings\": \"13\"}"
          }]'

Note

The properties keys use camel case names that match the fields in the Interactions schema. For example, if the fields 'ITEM_ID', 'EVENT_VALUE', and 'NUM_RATINGS' are defined in the Interactions schema, the property keys should be itemId, eventValue, and numRatings.
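
The naming rule in the note above can be expressed as a short conversion. The following is a minimal sketch, assuming schema field names are uppercase words separated by underscores. The function name schema_field_to_property_key is hypothetical and is not part of any SDK.

```python
def schema_field_to_property_key(field_name):
    """Convert an Interactions schema field name to a camel case property key."""
    first, *rest = field_name.lower().split('_')
    # Keep the first word lowercase and capitalize each following word.
    return first + ''.join(word.capitalize() for word in rest)


fields = ('ITEM_ID', 'EVENT_VALUE', 'NUM_RATINGS')
keys = [schema_field_to_property_key(field) for field in fields]
# keys == ['itemId', 'eventValue', 'numRatings']
```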

Event Metrics

To monitor the type and number of events sent to Amazon Personalize, use Amazon CloudWatch metrics. For more information, see CloudWatch Metrics for Amazon Personalize.

Creating an Events Solution

When training a model that uses event data, two parameters of the CreateSolution operation are relevant. The eventType parameter must be specified when multiple event types are recorded. The eventType indicates which type of event Amazon Personalize uses for model training.

The eventValueThreshold parameter of the SolutionConfig object creates an event filter. When this parameter is specified, only events with a value greater than or equal to the threshold are used for training the model. You must specify the event type when using eventValueThreshold.
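
The two parameters can be combined when you build the CreateSolution request. The following is a minimal sketch that assembles keyword arguments for the boto3 create_solution call and enforces the rule that eventType accompanies eventValueThreshold. The helper name build_create_solution_args is hypothetical, and passing the threshold as a string inside solutionConfig is an assumption based on the current boto3 parameter shape, which might differ in the preview service model.

```python
def build_create_solution_args(name, dataset_group_arn, recipe_arn,
                               event_type=None, event_value_threshold=None):
    """Build keyword arguments for create_solution with optional event settings."""
    if event_value_threshold is not None and event_type is None:
        # The event type must be specified whenever a threshold is used.
        raise ValueError('eventType is required when eventValueThreshold is set')

    args = {
        'name': name,
        'datasetGroupArn': dataset_group_arn,
        'recipeArn': recipe_arn,
    }
    if event_type is not None:
        args['eventType'] = event_type
    if event_value_threshold is not None:
        # Assumption: the threshold is sent as a string in solutionConfig.
        args['solutionConfig'] = {
            'eventValueThreshold': str(event_value_threshold)
        }
    return args
```

For example, build_create_solution_args('MovieSolution', 'dataset-group-arn', 'recipe-arn', event_type='rating', event_value_threshold=4) would restrict training to 'rating' events with a value of 4 or greater. The solution name and ARNs shown here are placeholders.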

Lambda Setup (Preview)

To call Amazon Personalize from your AWS Lambda function, follow these instructions.

Add the service JSON files that you downloaded in Setting Up the AWS CLI to a models folder under the Lambda package root folder.

Add the following code to your Lambda handler before your Amazon Personalize code.

    import os

    lambda_root = os.environ.get('LAMBDA_TASK_ROOT')
    models_path = os.path.join(lambda_root, 'models')
    aws_data_path = set(os.environ.get('AWS_DATA_PATH', '').split(os.pathsep))
    aws_data_path.add(models_path)
    os.environ.update({
        'AWS_DATA_PATH': os.pathsep.join(aws_data_path)
    })