Amazon Personalize
Developer Guide

This is prerelease documentation for a service in preview release. It is subject to change.

We made breaking changes to the Amazon Personalize API and service model on 02/20/19. To continue using Amazon Personalize with the AWS Command Line Interface or AWS SDK for Python (Boto 3), update your service JSON files by doing steps 3-6 of Setting Up the AWS CLI.

Recording Events

Amazon Personalize can make recommendations based purely on historical data, as demonstrated in the Getting Started guides. Using the Amazon Personalize event ingestion SDK, it can also make recommendations based purely on real-time clickstream data, or on a combination of both.

Note

A minimum of 1000 records of combined interaction data is required to train a model.

The event ingestion SDK includes a JavaScript library for recording events from web client applications. The SDK also includes a library for recording events in server code.

To record events, you need the following:

  • A dataset group, which can be empty.

  • An event tracker with the appropriate AWS Identity and Access Management (IAM) permissions. The IAM role allows Amazon CloudWatch to calculate the clickthrough rate for the events.

Creating a Dataset Group

If you went through the Getting Started guide, you can use the dataset group that you created there, or you can create a new dataset group as shown below. The dataset group can be empty or it can contain any of the user-defined datasets. For more information, see Datasets and Dataset Groups.

Python
import boto3

personalize = boto3.client('personalize')

response = personalize.create_dataset_group(name='MovieClickGroup')
print(response['datasetGroupArn'])
AWS CLI
aws personalize create-dataset-group --name MovieClickGroup

The dataset group Amazon Resource Name (ARN) is displayed, for example:

{
    "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup"
}

Getting a Tracking ID

A tracking ID associates an event with a dataset group and authorizes you to send data to Amazon Personalize. You generate a tracking ID by calling the CreateEventTracker API. You supply the dataset group ARN and the ARN of the IAM role that you created in Creating an IAM Role.

Python
import boto3

personalize = boto3.client('personalize')

response = personalize.create_event_tracker(
    name='MovieClickTracker',
    datasetGroupArn='arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup',
    roleArn='role-arn'
)
print(response['eventTrackerArn'])
print(response['trackingId'])
AWS CLI
aws personalize create-event-tracker \
    --name MovieClickTracker \
    --dataset-group-arn arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup \
    --role-arn role-arn

The event tracker ARN and tracking ID are displayed, for example:

{
    "eventTrackerArn": "arn:aws:personalize:us-west-2:acct-id:event-tracker/MovieClickTracker",
    "trackingId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

Event-Interactions Dataset

When Amazon Personalize creates an event tracker, it also creates a new event-interactions dataset in the dataset group associated with the event tracker. The event-interactions dataset stores the event data from the Record call. The contents of the dataset are not available to the user.

Note

A dataset group can contain only one user-item interactions dataset and one event-interactions dataset. You will get an error if you try to call CreateEventTracker using the same dataset group as an existing event tracker, even if you use a different event tracker name.
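
Because a dataset group accepts only one event tracker, it can help to look for an existing tracker before calling CreateEventTracker. The following is a minimal sketch, not part of the SDK. It assumes that the ListEventTrackers API (list_event_trackers in Boto 3) is available in your SDK version and that it returns an eventTrackers list. The helper name find_event_tracker is hypothetical.

```python
def find_event_tracker(personalize, dataset_group_arn):
    """Return the ARN of the dataset group's event tracker, or None.

    `personalize` is assumed to be a boto3.client('personalize') object.
    The response shape (an 'eventTrackers' list whose entries carry an
    'eventTrackerArn' key) is an assumption about ListEventTrackers.
    """
    response = personalize.list_event_trackers(datasetGroupArn=dataset_group_arn)
    trackers = response.get('eventTrackers', [])
    if trackers:
        # A dataset group holds at most one tracker, so take the first entry.
        return trackers[0]['eventTrackerArn']
    return None
```

If the helper returns None, the dataset group has no tracker yet and CreateEventTracker should not fail for this reason.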

To view the properties of the new dataset, call the ListDatasets API, supplying the dataset group ARN. Use the dataset ARN for the EVENT_INTERACTIONS dataset to call the DescribeDataset API. The following is an example response:

{
    "dataset": {
        "datasetArn": "arn:aws:personalize:us-west-2:acct-id:dataset/dataset-group-name/EVENT_INTERACTIONS",
        "datasetGroupArn": "arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup",
        "datasetType": "EVENT_INTERACTIONS",
        "schemaArn": "arn:aws:personalize:us-west-2:acct-id:schema/event-interactions-schema",
        "status": "ACTIVE",
        "creationDateTime": 1545694802.016
    }
}
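
The ListDatasets and DescribeDataset calls described above can be chained in Python. The following is a minimal sketch that assumes the ListDatasets response contains a datasets list whose entries carry datasetArn and datasetType keys. The helper name describe_event_interactions_dataset is hypothetical.

```python
def describe_event_interactions_dataset(personalize, dataset_group_arn):
    """Return the properties of the EVENT_INTERACTIONS dataset, or None.

    `personalize` is assumed to be a boto3.client('personalize') object.
    """
    response = personalize.list_datasets(datasetGroupArn=dataset_group_arn)
    for summary in response.get('datasets', []):
        # Pick out the dataset that the event tracker created.
        if summary.get('datasetType') == 'EVENT_INTERACTIONS':
            detail = personalize.describe_dataset(datasetArn=summary['datasetArn'])
            return detail['dataset']
    return None
```

The schemaArn value in the returned properties is the ARN to pass to DescribeSchema.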

Event-Interactions Schema

To view the schema corresponding to the event-interactions dataset, call the DescribeSchema API, supplying the schema ARN from the previous listing. An example response follows. The event_type field corresponds to the eventName parameter of the Record API. The item_id and event_value fields correspond to the id and value keys of the properties parameter of the Record API.

{
    "schema": {
        "name": "event-interactions-schema",
        "schemaArn": "arn:aws:personalize:us-west-2:acct-id:schema/event-interactions-schema",
        "schema": {
            "type": "record",
            "name": "Interactions",
            "namespace": "com.amazonaws.personalize.schema",
            "fields": [
                { "name": "user_id", "type": "string" },
                { "name": "session_id", "type": "string" },
                { "name": "timestamp", "type": "long" },
                { "name": "event_type", "type": "string" },
                { "name": "item_id", "type": "string" },
                { "name": "event_value", "type": "string" }
            ],
            "version": "1.0"
        }
    }
}
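
To make the field correspondence concrete, the following sketch maps the parameters of a single Record event onto the schema fields listed above. It only illustrates the mapping described in this guide (eventName to event_type, sentAt to timestamp, and the id and value properties to item_id and event_value). It is not a description of how Amazon Personalize stores events internally, and the function name event_to_schema_row is hypothetical.

```python
import json


def event_to_schema_row(user_id, session_id, event):
    """Map one Record event onto the event-interactions schema fields."""
    # The properties parameter is a JSON string, so decode it first.
    properties = json.loads(event['properties'])
    return {
        'user_id': user_id,
        'session_id': session_id,
        'timestamp': event['sentAt'],
        'event_type': event['eventName'],
        'item_id': properties['id'],
        'event_value': properties['value'],
    }


row = event_to_schema_row('user-id', 'session-id', {
    'eventId': 'event1',
    'sentAt': 1545694248,
    'eventName': 'rating',
    'properties': '{"id": "101", "value": "4"}',
})
```

Here row['item_id'] is '101' and row['event_value'] is '4', both strings, matching the string types in the schema.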

Calling the Record API

The following shows an example of a Record call. You supply the tracking ID created in the previous step. The user ID and session ID are custom to your application.

The event list is an array of Event types. The event ID is custom to your application. The sentAt and eventName keys correspond to the timestamp and event_type fields of the event-interactions schema (described above).

The properties key is a string map (key-value pairs) of event-specific data. The id and value keys correspond to the item_id and event_value fields of the event-interactions schema.

Python
import boto3

personalize_events = boto3.client('personalize-events')

personalize_events.record(
    trackingId='tracking-id',
    userId='user-id',
    sessionId='session-id',
    eventList=[
        {
            "eventId": "event1",
            "sentAt": 1545694248,
            "eventName": "rating",
            "properties": '{"id": "101", "value": "4"}'
        },
        {
            "eventId": "event2",
            "sentAt": 1545694251,
            "eventName": "rating",
            "properties": '{"id": "311", "value": "2"}'
        }
    ]
)
AWS CLI
aws personalize-events record \
    --tracking-id tracking-id \
    --user-id user-id \
    --session-id session-id \
    --event-list [event1, event2, ...]
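
Writing the properties string by hand is error-prone, so you might prefer to build each event with json.dumps. The following is a minimal sketch. The helper name build_event and its parameters are hypothetical, and the choice to convert the item ID and value to strings is an assumption based on the string types in the event-interactions schema.

```python
import json
import time


def build_event(event_id, event_name, item_id, value, sent_at=None):
    """Build one entry for the eventList parameter of the Record API."""
    return {
        'eventId': event_id,
        # Default to the current time in epoch seconds when none is given.
        'sentAt': int(time.time()) if sent_at is None else sent_at,
        'eventName': event_name,
        # properties must be a JSON-encoded string, not a dict.
        'properties': json.dumps({'id': str(item_id), 'value': str(value)}),
    }


event_list = [
    build_event('event1', 'rating', 101, 4, sent_at=1545694248),
    build_event('event2', 'rating', 311, 2, sent_at=1545694251),
]
```

The resulting event_list can be passed as the eventList argument of personalize_events.record in the Python example above.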

Event Types

Amazon Personalize recognizes any string value (subject to length constraints) as a valid event type. There are two event types that Amazon Personalize treats as special when using the JavaScript SDK:

  • ListView

    For the ListView event type, you use the items key with an array of objects as the value (each object having one or more key-value pairs). For example,

    {items: [{"id": "item1", "value":"value1"}, {"id": "item2", "value":"value2"}, {...}]}

  • MediaAutoTrack

Creating an Events Solution

When training a model that uses event data, two parameters of the CreateSolution API are relevant. The eventType parameter must be specified when multiple event types are recorded. The eventType indicates which type of event Amazon Personalize uses for model training.

The eventValueThreshold parameter of the SolutionConfig object creates an event filter. When this parameter is specified, only events with a value greater than or equal to the threshold are used for training the model. You must specify eventType when using eventValueThreshold.
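
The rule that eventValueThreshold requires eventType can be checked before the request is sent. The following is a minimal sketch that assembles keyword arguments for CreateSolution. The helper name build_solution_request is hypothetical, and passing the threshold as a string is an assumption about the SolutionConfig object. A real request also needs other settings, such as a recipe, which are omitted here.

```python
def build_solution_request(name, dataset_group_arn, event_type=None,
                           event_value_threshold=None):
    """Assemble CreateSolution keyword arguments for event-based training."""
    if event_value_threshold is not None and event_type is None:
        # Mirrors the rule stated above: a threshold needs an event type.
        raise ValueError('eventType is required when eventValueThreshold is set')
    request = {'name': name, 'datasetGroupArn': dataset_group_arn}
    if event_type is not None:
        request['eventType'] = event_type
    if event_value_threshold is not None:
        # Assumption: the threshold is sent as a string value.
        request['solutionConfig'] = {
            'eventValueThreshold': str(event_value_threshold)
        }
    return request


request = build_solution_request(
    'MovieClickSolution',
    'arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup',
    event_type='rating',
    event_value_threshold=3,
)
```

With these arguments, only rating events with a value of 3 or higher would be used for training. The request could then be passed to the client as personalize.create_solution(**request), along with the omitted settings.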