Getting user segments with a batch segment job
If you used a USER_SEGMENTATION recipe, you can create batch segment jobs to get user segments with your solution version. Each user segment is sorted in descending order based on the probability that each user will interact with items in your inventory. Depending on the recipe, your input data must be a list of items (Item-Affinity recipe) or item attributes (Item-Attribute-Affinity recipe) in JSON format. You can create a batch segment job with the Amazon Personalize console, the AWS Command Line Interface (AWS CLI), or AWS SDKs.
When you create a batch segment job, you specify the Amazon S3 paths to your input and output locations.
Amazon S3 is prefix based. If you provide a prefix for the input data location, Amazon Personalize uses all files matching that prefix as input data. For example, if you provide s3://amzn-s3-demo-bucket/folderName and your bucket also has a folder with a path of s3://amzn-s3-demo-bucket/folderName_test, Amazon Personalize uses all files in both folders as input data. To use only the files within a specific folder as input data, end the Amazon S3 path with a prefix delimiter, such as /: s3://amzn-s3-demo-bucket/folderName/
For more information about how Amazon S3 organizes objects, see Organizing, listing, and working with your objects.
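The prefix behavior described above can be illustrated without calling Amazon S3 at all. The following is a minimal sketch that treats object keys as plain strings. The helper name keys_matching_prefix and the example keys are hypothetical, and the sketch only models prefix matching, not how Amazon Personalize reads the files.

```python
def keys_matching_prefix(keys, prefix):
    """Return the object keys that a plain prefix match would select."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical object keys inside amzn-s3-demo-bucket.
keys = [
    "folderName/input.json",
    "folderName_test/input.json",
    "otherFolder/input.json",
]

# Without a trailing delimiter, both folders match the prefix.
without_delimiter = keys_matching_prefix(keys, "folderName")

# With a trailing "/", only the intended folder matches.
with_delimiter = keys_matching_prefix(keys, "folderName/")

print(without_delimiter)
print(with_delimiter)
```

Ending the path with / is what keeps folderName_test out of the input set.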
Creating a batch segment job (console)
After you have completed Preparing input data for batch recommendations, you are ready to create a batch segment job. This procedure assumes that you have already created a solution and a solution version (trained model) with a USER_SEGMENTATION recipe.
To create a batch segment job (console)
-
Open the Amazon Personalize console at https://console.aws.amazon.com/personalize/home and sign in to your account.
-
On the Dataset groups page, choose your dataset group.
-
Choose Batch segment jobs in the navigation pane, then choose Create batch segment job.
-
In Batch segment job details, for Batch segment job name, specify a name for your batch segment job.
-
For Solution, choose the solution and then choose the Solution version ID that you want to use to generate the user segments. You can create batch segment jobs only if you used a USER_SEGMENTATION recipe.
-
For Number of users, optionally specify the number of users Amazon Personalize generates for each user segment. The default is 25. The maximum is 5 million.
-
For Input source, specify the Amazon S3 path to your input file or use Browse S3 to choose your Amazon S3 bucket.
Use the following syntax:
s3://amzn-s3-demo-bucket/<folder name>/<input JSON file name>.json
Your input data must be in the correct format for the recipe your solution uses. For input data examples, see Batch segment job input and output JSON examples.
-
For Output destination, specify the path to your output location or use Browse S3 to choose your Amazon S3 bucket. We recommend using a different location for your output data (either a folder or a different Amazon S3 bucket).
Use the following syntax:
s3://amzn-s3-demo-bucket/<output folder name>/
-
For IAM role, choose one of the following:
-
Choose Create and use new service role and enter the Service role name to create a new role, or
-
If you've already created a role with the correct permissions, choose Use an existing service role and choose the IAM role.
The role you use must have read and write access to your input and output Amazon S3 buckets respectively.
-
For Filter configuration, optionally choose a filter to apply to the user segments. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Providing filter values in your input JSON.
-
For Tags, optionally add any tags. For more information about tagging Amazon Personalize resources, see Tagging Amazon Personalize resources.
-
Choose Create batch segment job. Batch segment job creation starts and the Batch segment jobs page appears with the Batch segment job detail section displayed.
-
When the batch segment job's status changes to Active, you can retrieve the job's output from the designated output Amazon S3 bucket. The output file's name will be of the format input-name.out.
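Before you submit a job, it can help to check that your paths follow the syntax shown in the procedure above. The following is a minimal sketch of such a check. The function check_batch_segment_paths is a hypothetical helper, not part of any AWS SDK, and it only inspects the path strings. It does not verify that the objects or buckets exist.

```python
def check_batch_segment_paths(input_path, output_path):
    """Return a list of problems found in the two S3 paths (empty if none)."""
    problems = []
    for label, path in (("input", input_path), ("output", output_path)):
        if not path.startswith("s3://"):
            problems.append(f"{label} path must start with s3://")
    # The documented input syntax names a single .json file.
    if not input_path.endswith(".json"):
        problems.append("input path should name a .json file")
    # The documented output syntax ends with the / prefix delimiter.
    if not output_path.endswith("/"):
        problems.append("output path should end with /")
    # Recommended: keep output out of the folder that holds the input file.
    input_folder = input_path.rsplit("/", 1)[0] + "/"
    if input_folder == output_path:
        problems.append("output location should differ from the input folder")
    return problems


print(check_batch_segment_paths(
    "s3://amzn-s3-demo-bucket/input/users.json",
    "s3://amzn-s3-demo-bucket/input/",
))
```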
Creating a batch segment job (AWS CLI)
After you have completed Preparing input data for batch recommendations, you are ready to create a batch segment job using the following create-batch-segment-job code.
Specify a job name, replace Solution version ARN with the Amazon Resource Name (ARN) of your solution version, and replace the IAM service role ARN with the ARN of the IAM service role you created for Amazon Personalize during set up. This role must have read and write access to your input and output Amazon S3 buckets respectively. For num-results specify the number of users you want Amazon Personalize to predict for each line of input data. The default is 25. The maximum is 5 million. Optionally provide a filter-arn to filter user segments. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Filtering batch recommendations and user segments (custom resources).
Replace S3 input path and S3 output path with the Amazon S3 path to your input file and output locations. We recommend using a different location for your output data (either a folder or a different Amazon S3 bucket). Use the following syntax for input and output locations: s3://amzn-s3-demo-bucket/<folder name>/<input JSON file name>.json and s3://amzn-s3-demo-bucket/<output folder name>/.
aws personalize create-batch-segment-job \
    --job-name Job name \
    --solution-version-arn Solution version ARN \
    --num-results The number of predicted users \
    --filter-arn Filter ARN \
    --job-input s3DataSource={path=s3://S3 input path} \
    --job-output s3DataDestination={path=s3://S3 output path} \
    --role-arn IAM service role ARN

{
    "batchSegmentJobArn": "arn:aws:personalize:us-west-2:acct-id:batch-segment-job/batchSegmentJobName"
}
Creating a batch segment job (AWS SDKs)
After you have completed Preparing input data for batch recommendations, you are ready to create a batch segment job with the CreateBatchSegmentJob operation. The following code shows how to create a batch segment job. Give the job a name, specify the Amazon Resource Name (ARN) of the solution version to use, specify the ARN for your Amazon Personalize IAM role, and specify the Amazon S3 path to your input file and output locations. Your IAM service role must have read and write access to your input and output Amazon S3 buckets respectively.
We recommend using a different location for your output data (either a folder or a different Amazon S3 bucket). Use the following syntax for input and output locations: s3://amzn-s3-demo-bucket/<folder name>/<input JSON file name>.json and s3://amzn-s3-demo-bucket/<output folder name>/.
For numResults, specify the number of users you want Amazon Personalize to predict for each line of input data. The default is 25. The maximum is 5 million. Optionally provide a filterArn to filter user segments. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Filtering batch recommendations and user segments (custom resources).
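As an illustration, the following is a minimal sketch of the call using the AWS SDK for Python (Boto3). It assumes that boto3 is installed and that credentials and a Region are configured. The helper names and all ARN and path values are hypothetical placeholders. The request is assembled in a separate function so that you can inspect it before sending anything to Amazon Personalize.

```python
def build_batch_segment_job_request(job_name, solution_version_arn, role_arn,
                                    input_path, output_path,
                                    num_results=None, filter_arn=None):
    """Assemble keyword arguments for the CreateBatchSegmentJob operation."""
    request = {
        "jobName": job_name,
        "solutionVersionArn": solution_version_arn,
        "roleArn": role_arn,
        "jobInput": {"s3DataSource": {"path": input_path}},
        "jobOutput": {"s3DataDestination": {"path": output_path}},
    }
    # Optional fields are left out so the service defaults apply.
    if num_results is not None:
        request["numResults"] = num_results
    if filter_arn is not None:
        request["filterArn"] = filter_arn
    return request


def create_batch_segment_job(**kwargs):
    """Send the request. Requires boto3 and configured AWS credentials."""
    import boto3  # imported here so the builder works without boto3 installed

    personalize = boto3.client("personalize")
    response = personalize.create_batch_segment_job(
        **build_batch_segment_job_request(**kwargs)
    )
    return response["batchSegmentJobArn"]


# Placeholder values only; replace them with your own ARNs and paths.
example_request = build_batch_segment_job_request(
    job_name="Job name",
    solution_version_arn="Solution version ARN",
    role_arn="IAM service role ARN",
    input_path="s3://amzn-s3-demo-bucket/<folder name>/<input JSON file name>.json",
    output_path="s3://amzn-s3-demo-bucket/<output folder name>/",
    num_results=25,
)
print(example_request)
```

Calling create_batch_segment_job with the same keyword arguments submits the job and returns the new job's ARN.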
Processing the batch job might take a while to complete. You can check a job's status by calling DescribeBatchSegmentJob and passing a batchSegmentJobArn as the input parameter. You can also list all Amazon Personalize batch segment jobs in your AWS environment by calling ListBatchSegmentJobs.
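Checking the status as described above can be wrapped in a simple polling loop. The following is a minimal sketch in Python. The function wait_for_batch_segment_job is a hypothetical helper, and the status strings ACTIVE and CREATE FAILED are assumptions based on the statuses Amazon Personalize commonly reports, so verify them against the DescribeBatchSegmentJob reference. The describe call is passed in as a parameter so that the loop can be tried with a stand-in function.

```python
import time


def wait_for_batch_segment_job(describe, job_arn, poll_seconds=60, max_polls=120):
    """Poll a describe callable until the job is ACTIVE or has failed.

    `describe` is expected to behave like the Boto3 client method
    describe_batch_segment_job (assumed response shape below).
    """
    for _ in range(max_polls):
        job = describe(batchSegmentJobArn=job_arn)["batchSegmentJob"]
        status = job["status"]
        if status == "ACTIVE":
            return status
        if status == "CREATE FAILED":
            raise RuntimeError(job.get("failureReason", "batch segment job failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_arn} did not finish after {max_polls} polls")


# Stand-in for the real API call, used here only to exercise the loop.
_statuses = iter(["PENDING", "IN PROGRESS", "ACTIVE"])


def fake_describe(batchSegmentJobArn):
    return {"batchSegmentJob": {"status": next(_statuses)}}


final_status = wait_for_batch_segment_job(fake_describe, "example-arn", poll_seconds=0)
print(final_status)
```

With Boto3, you would pass boto3.client("personalize").describe_batch_segment_job as the describe argument and keep a polling interval of a minute or more.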