Recording events - Amazon Personalize

Recording events

Amazon Personalize can make recommendations based on real-time event data only, historical event data only (see Importing bulk records), or a mixture of both. Record events in real-time so Amazon Personalize can learn from your user’s most recent activity and update recommendations as they use your application. This keeps your interactions data fresh and improves the relevance of Amazon Personalize recommendations.

You can record real-time events using the AWS SDKs, AWS Amplify or AWS Command Line Interface (AWS CLI). When you record events, Amazon Personalize appends the event data to the Interactions dataset in your dataset group.

Note

AWS Amplify includes a JavaScript library for recording events from web client applications, and a library for recording events in server code. For more information, see Amplify - analytics

Requirements for recording events and training a model

To record events, you need the following:

  • A dataset group that includes an Interactions dataset, which can be empty. If you went through the Getting started guide, you can use the same dataset group and dataset that you created. For information on creating a dataset group and a dataset, see Preparing and importing data.

  • An event tracker.

  • A call to the PutEvents operation.

You can start out with an empty Interactions dataset and, when you have recorded enough data, train the model using only new recorded events. The minimum data requirements to train a model are:

  • 1000 records of combined interaction data (after filtering by eventType and eventValueThreshold, if provided)

  • 25 unique users with at least 2 interactions each

How real-time events influence recommendations

Once you create a campaign, Amazon Personalize automatically uses new recorded event data for existing items (items you included in the data you used to train the latest model) when generating recommendations for the user. This does not require retraining the model (unless you are using the SIMS or Popularity-Count recipes).

Instead, Amazon Personalize adds the new recorded event data to the user's history. Amazon Personalize then uses the modified data when generating recommendations for the user (and this user only).

  • For recorded events for new items (items you did not include in the data you used to train the model), if you trained your model (solution version) with the User-Personalization recipe, Amazon Personalize automatically updates the model every two hours, and after each update the new items influence recommendations. See User-Personalization recipe.

    For any other recipe, you must re-train the model for the new records to influence recommendations. Amazon Personalize stores recorded events for new items and, once you create a new solution version (train a new model), this new data will influence Amazon Personalize recommendations for the user.

  • For recorded events for new users (users that were not included in the data you used to train the model), recommendations will initially be for popular items only. Recommendations will be more relevant as you record more events for the user. Amazon Personalize stores the new user data, so you can also retrain the model for more relevant recommendations.

    For new, anonymous users (users without a userId), Amazon Personalize uses the sessionId you pass in the PutEvents operation to associate events with the user before they log in. This creates a continuous event history that includes events that occurred when the user was anonymous.

Creating an event tracker

An event tracker specifies a destination dataset group for new event data. To create an event tracker you call the CreateEventTracker API operation and pass as a parameter the Amazon Resource Name (ARN) of the dataset group that contains the target Interactions dataset. For instructions on creating an event tracker using the Amazon Personalize console, see Creating an event tracker (console).

When you create an event tracker, the response includes a tracking ID, which you pass as a parameter when you use the PutEvents operation. Amazon Personalize then appends the new event data to the Interactions dataset of the dataset group you specify in your event tracker.

Note

You can create only one event tracker for a dataset group.

Python
import boto3 personalize = boto3.client('personalize') response = personalize.create_event_tracker( name='MovieClickTracker', datasetGroupArn='arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup' ) print(response['eventTrackerArn']) print(response['trackingId'])
AWS CLI
aws personalize create-event-tracker \ --name MovieClickTracker \ --dataset-group-arn arn:aws:personalize:us-west-2:acct-id:dataset-group/MovieClickGroup

The event tracker ARN and tracking ID display, for example:

{ "eventTrackerArn": "arn:aws:personalize:us-west-2:acct-id:event-tracker/MovieClickTracker", "trackingId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" }

PutEvents operation

To record events, you call the PutEvents operation. The following example shows a PutEvents operation that passes one event that contains the minimum required information. The corresponding Interactions schema is shown, along with an example row from the Interactions dataset.

Your application generates the sessionId when a user first visits your website or uses your application. Amazon Personalize uses the sessionId to associate events with the user before they log in (is anonymous). After the user logs in and you send an event including the userId, Amazon Personalize associates the previously anonymous historical event data with their userId by matching the sessionId. This creates a continuous event history that includes events that occurred when the user was anonymous.

The event list is an array of Event objects. An eventType is required for each event, but in this example, eventType data is not used in training because it is not included in the schema. You can provide a placeholder value to satisfy the requirement. The properties key is a string map (key-value pairs) of event-specific data. In this case, just the item ID is specified.

The userId, itemId, and sentAt parameters map to the USER_ID, ITEM_ID, and TIMESTAMP fields of a corresponding historical Interactions dataset. For more information, see Datasets and schemas.

Corresponding interactions schema

Interactions schema: USER_ID, ITEM_ID, TIMESTAMP Interactions dataset: user123, item-xyz, 1543631760

Code example

Python
import boto3 personalize_events = boto3.client(service_name='personalize-events') personalize_events.put_events( trackingId = 'tracking_id', userId= 'USER_ID', sessionId = 'session_id', eventList = [{ 'sentAt': TIMESTAMP, 'eventType': 'EVENT_TYPE', 'properties': "{\"itemId\": \"ITEM_ID\"}" }] )
AWS CLI
aws personalize-events put-events \ --tracking-id tracking_id \ --user-id USER_ID \ --session-id session_id \ --event-list '[{ "sentAt": "TIMESTAMP", "eventType": "EVENT_TYPE", "properties": "{\"itemId\": \"ITEM_ID\"}" }]'

After this example, you would proceed to train a model using only the required properties.

The next example shows how to submit data that would train on the event value. It also demonstrates the passing of multiple events of different types ('like' and 'rating'). In this case, you must specify the event type to train on in the CreateSolution operation (see Events and Solutions below). The example also shows the recording of an extra property, numRatings, that is used as metadata by certain recipes.

Interactions schema: USER_ID, ITEM_ID, TIMESTAMP, EVENT_TYPE, EVENT_VALUE, NUM_RATINGS Interactions dataset: user123, movie_xyz, 1543531139, rating, 5, 12 user321, choc-ghana, 1543531760, like, 4 user111, choc-fake, 1543557118, like, 3
Python
import boto3 import json personalize_events = boto3.client(service_name='personalize-events') personalize_events.put_events( trackingId = 'tracking_id', userId= 'user555', sessionId = 'session1', eventList = [{ 'eventId': 'event1', 'sentAt': '1553631760', 'eventType': 'like', 'properties': json.dumps({ 'itemId': 'choc-panama', 'eventValue': 4, 'numRatings': 0 }) }, { 'eventId': 'event2', 'sentAt': '1553631782', 'eventType': 'rating', 'properties': json.dumps({ 'itemId': 'movie_ten', 'eventValue': 3, 'numRatings': 13 }) }] )
AWS CLI
aws personalize-events put-events \ --tracking-id tracking_id \ --user-id user555 \ --session-id session1 \ --event-list '[{ "eventId": "event1", "sentAt": "1553631760", "eventType": "like", "properties": "{\"itemId\": \"choc-panama\", \"eventValue\": \"true\"}" }, { "eventId": "event2", "sentAt": "1553631782", "eventType": "rating", "properties": "{\"itemId\": \"movie_ten\", \"eventValue\": \"4\", \"numRatings\": \"13\"}" }]'
Note

The properties keys use camel case names that match the fields in the Interactions schema. For example, if the fields 'ITEM_ID', 'EVENT_VALUE', and 'NUM_RATINGS,' are defined in the Interactions schema, the property keys should be itemId, eventValue, and numRatings.

Recording impressions data

If you use the User-Personalization recipe, you can record impressions data in your PutEvents operation. Impressions are lists of items that were visible to a user when they interacted with (for example, clicked or watched) a particular item. Amazon Personalize uses impressions data to guide exploration, where recommendations include items with less interactions data or relevance. For information on the implicit and explicit impressions Amazon Personalize can model, see Impressions data.

Important

If you provide conflicting implicit and explicit impression data in your PutEvents requests, Amazon Personalize uses the explicit impressions by default.

To record the Amazon Personalize recommendations you show your user as impressions data, include the recommendationId in your PutEvents request and Amazon Personalize derives the implicit impressions based on your recommendation data.

To manually record impressions data for an event, list the impressions in the PutEvents command's impression input parameter. The following code sample shows how to include a recommendationId and an impression in a PutEvents operation. If you include both, Amazon Personalize uses the explicit impressions by default.

import boto3 personalize_events = boto3.client(service_name='personalize-events') personalize_events.put_events( trackingId = 'tracking_id', userId= 'userId', sessionId = 'sessionId', eventList = [{ 'eventId': 'event1', 'eventType': 'rating', 'sentAt': 1553631760, 'itemId': 'item id', 'recommendationId': 'recommendation id', 'impression': ['itemId1', 'itemId2', 'itemId3'] }] )

Event metrics

To monitor the type and number of events sent to Amazon Personalize, use Amazon CloudWatch metrics. For more information, see Monitoring Amazon Personalize.

Events and solutions

When training a model that uses event data, two parameters of the CreateSolution operation are relevant. The eventType parameter must be specified when multiple event types are recorded. The eventType indicates which type of event Amazon Personalize uses as the label for model training.

The eventValueThreshold parameter of the SolutionConfig object creates an event filter. When this parameter is specified, only events with a value greater than or equal to the threshold are used for training the model. You must specify the event type when using eventValueThreshold.

Sample Jupyter notebook

For a sample Jupyter notebook that shows how to use Amazon Personalize to react to real-time behavior of users using an event tracker and the PutEvents operation, see 2.View_Campaign_And_Interactions.ipynb in the getting_started folder of the amazon-personalize-samples GitHub repository.