Creating a batch inference job
Create a batch inference job to get batch item recommendations for users based on input data from Amazon S3. The input data can be a list of users or items (or both) in JSON format. You can create a batch inference job with the Amazon Personalize console, the AWS Command Line Interface (AWS CLI), or AWS SDKs.
For more information about the batch workflow in Amazon Personalize, including permissions requirements, recommendation scoring, and preparing and importing input data, see Getting batch recommendations and user segments.
Creating a batch inference job (console)
After you have completed Preparing and importing batch input data, you are ready to create a batch inference job. This procedure assumes that you have already created a solution and a solution version (trained model).
To create a batch inference job (console)
-
Open the Amazon Personalize console at https://console.aws.amazon.com/personalize/home and sign in to your account.
-
On the Dataset groups page, choose your dataset group.
-
Choose Batch inference jobs in the navigation pane, then choose Create batch inference job.
-
In Batch inference job details, in Batch inference job name, specify a name for your batch inference job.
-
For IAM service role, choose the IAM service role you created for Amazon Personalize during setup. This role must have read access to your input Amazon S3 bucket and write access to your output Amazon S3 bucket.
-
For Solution, choose the solution and then choose the Solution version ID that you want to use to generate the recommendations.
-
For Number of results, optionally specify the number of recommendations for each line of input data. The default is 25.
-
For Input data configuration, specify the Amazon S3 path to your input file.
Use the following syntax:
s3://<name of your S3 bucket>/<folder name>/<input JSON file name>
Your input data must be in the correct format for the recipe your solution uses. For input data examples, see Input and output JSON examples.
-
For Output data configuration, specify the path to your output location. We recommend using a different location for your output data (either a folder or a different Amazon S3 bucket).
Use the following syntax:
s3://<name of your S3 bucket>/<output folder name>/
-
For Filter configuration, optionally choose a filter to apply to the batch recommendations. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Providing filter values in your input JSON.
-
For Tags, optionally add any tags. For more information about tagging Amazon Personalize resources, see Tagging Amazon Personalize resources.
-
Choose Create batch inference job. Batch inference job creation starts and the Batch inference jobs page appears with the Batch inference job detail section displayed.
-
When the batch inference job's status changes to Active, you can retrieve the job's output from the designated output Amazon S3 bucket. The output file's name has the format input-name.out.
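The naming rule above can be sketched in Python to work out where to look for results once a job is Active. This is an illustration only: the bucket and file names are hypothetical, the helper `expected_output_uri` is not part of any AWS SDK, and it assumes that "input-name" means the full input file name, extension included.

```python
from urllib.parse import urlparse


def expected_output_uri(input_uri: str, output_uri: str) -> str:
    """Return the S3 URI where the job's output file is expected to appear.

    Assumes the output file is named <input file name>.out and is written
    directly into the output folder.
    """
    parsed_in = urlparse(input_uri)
    parsed_out = urlparse(output_uri)
    if parsed_in.scheme != "s3" or parsed_out.scheme != "s3":
        raise ValueError("Both locations must be s3:// URIs")

    # The input location must point at a file, not a folder.
    input_name = parsed_in.path.rsplit("/", 1)[-1]
    if not input_name:
        raise ValueError("Input URI must end with a file name")

    # The output location must be a folder (trailing slash).
    if not output_uri.endswith("/"):
        raise ValueError("Output URI must end with '/'")

    return f"{output_uri}{input_name}.out"


# Hypothetical bucket and file names, used only for illustration.
print(expected_output_uri(
    "s3://amzn-s3-demo-bucket/input/users.json",
    "s3://amzn-s3-demo-bucket/output/",
))
```

The same checks also catch the two most common path mistakes: an input path that points at a folder, and an output path without a trailing slash.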
Creating a batch inference job (AWS CLI)
After you have completed Preparing and importing batch input data, you are ready to create a batch inference job using the following create-batch-inference-job command.
Specify a job name, replace Solution version ARN with the Amazon Resource Name (ARN) of your solution version, and replace IAM service role ARN with the ARN of the IAM service role you created for Amazon Personalize during setup. This role must have read access to your input Amazon S3 bucket and write access to your output Amazon S3 bucket. Optionally provide a filter ARN to filter recommendations. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Filtering batch recommendations and user segments.
Replace S3 input path and S3 output path with the Amazon S3 paths to your input file and output location. We recommend using a different location for your output data (either a folder or a different Amazon S3 bucket). Use the following syntax for input and output locations: s3://<name of your S3 bucket>/<folder name>/<input JSON file name> and s3://<name of your S3 bucket>/<output folder name>/.
The example includes the optional itemExplorationConfig hyperparameters that are specific to the User-Personalization recipe: explorationWeight and explorationItemAgeCutOff. Include these values only if you want to configure exploration. For more information, see User-Personalization recipe.
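Because the itemExplorationConfig values are passed to the CLI as an escaped JSON string, it can be less error-prone to generate that string than to type it by hand. The following is a minimal Python sketch, not an official tool; the variable names are illustrative, and the values 0.3 and 30 are simply the ones used in this page's CLI example.

```python
import json
import shlex

# Hyperparameter values are given as strings, as in the CLI example.
config = {
    "itemExplorationConfig": {
        "explorationWeight": "0.3",
        "explorationItemAgeCutOff": "30",
    }
}

# JSON text for the --batch-inference-job-config argument.
config_json = json.dumps(config)

# shlex.quote wraps the JSON so a POSIX shell passes it through unchanged,
# which avoids escaping each double quote manually.
shell_arg = shlex.quote(config_json)

print(f"--batch-inference-job-config {shell_arg}")
```

The quoting shown here targets POSIX shells; other shells, such as Windows Command Prompt or PowerShell, have different quoting rules.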
aws personalize create-batch-inference-job \
  --job-name Batch job name \
  --solution-version-arn Solution version ARN \
  --filter-arn Filter ARN \
  --job-input s3DataSource={path=s3://S3 input path} \
  --job-output s3DataDestination={path=s3://S3 output path} \
  --role-arn IAM service role ARN \
  --batch-inference-job-config "{\"itemExplorationConfig\":{\"explorationWeight\":\"0.3\",\"explorationItemAgeCutOff\":\"30\"}}"

{
  "batchInferenceJobArn": "arn:aws:personalize:us-west-2:acct-id:batch-inference-job/batchInferenceJobName"
}
Creating a batch inference job (AWS SDKs)
After you have completed Preparing and importing batch input data, you are ready to create a batch inference job with the CreateBatchInferenceJob operation.
To create a batch inference job, specify a job name, the Amazon Resource Name (ARN) of your solution version, and the ARN of the IAM service role you created for Amazon Personalize during setup. This role must have read access to your input Amazon S3 bucket and write access to your output Amazon S3 bucket.
We recommend using a different location for your output data (either a folder or a different Amazon S3 bucket). Use the following syntax for input and output locations: s3://<name of your S3 bucket>/<folder name>/<input JSON file name> and s3://<name of your S3 bucket>/<output folder name>/.
For numResults, specify the number of items you want Amazon Personalize to predict for each line of input data.
Optionally provide a filter ARN to filter recommendations. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Filtering batch recommendations and user segments.
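The parameters described above map onto the CreateBatchInferenceJob request roughly as follows. This is a minimal sketch that assumes the AWS SDK for Python (Boto3); the helper names `build_job_request` and `create_batch_inference_job` are illustrative wrappers, not SDK functions. The client is passed in as an argument so the request-building logic can be checked without AWS credentials.

```python
def build_job_request(job_name, solution_version_arn, role_arn,
                      input_path, output_path,
                      num_results=None, filter_arn=None):
    """Build keyword arguments for the CreateBatchInferenceJob operation."""
    request = {
        "jobName": job_name,
        "solutionVersionArn": solution_version_arn,
        "roleArn": role_arn,
        # input_path: s3://<bucket>/<folder name>/<input JSON file name>
        "jobInput": {"s3DataSource": {"path": input_path}},
        # output_path: s3://<bucket>/<output folder name>/
        "jobOutput": {"s3DataDestination": {"path": output_path}},
    }
    # Optional parameters are sent only when the caller provides them.
    if num_results is not None:
        request["numResults"] = num_results
    if filter_arn is not None:
        request["filterArn"] = filter_arn
    return request


def create_batch_inference_job(client, **job_settings):
    """Start the job and return its ARN.

    `client` is expected to be a Boto3 Amazon Personalize client,
    for example boto3.client("personalize").
    """
    response = client.create_batch_inference_job(
        **build_job_request(**job_settings)
    )
    return response["batchInferenceJobArn"]


# Usage (requires boto3 and AWS credentials; all values are placeholders):
#   import boto3
#   personalize = boto3.client("personalize")
#   job_arn = create_batch_inference_job(
#       personalize,
#       job_name="Batch job name",
#       solution_version_arn="Solution version ARN",
#       role_arn="IAM service role ARN",
#       input_path="s3://<name of your S3 bucket>/<folder name>/<input JSON file name>",
#       output_path="s3://<name of your S3 bucket>/<output folder name>/",
#       num_results=25,
#   )
```

Omitting numResults and filterArn from the request when they are not set lets the service apply its own defaults.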
Processing the batch job might take a while to complete. You can check a job's status by calling DescribeBatchInferenceJob and passing the job's batchInferenceJobArn as the input parameter. You can also list all Amazon Personalize batch inference jobs in your AWS environment by calling ListBatchInferenceJobs.
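The status check described above can be wrapped in a simple polling loop. The following is a minimal sketch that assumes a Boto3 Amazon Personalize client; the helper name, polling interval, and attempt limit are arbitrary illustrative choices. It also assumes that the job status eventually becomes ACTIVE on success or a value ending in FAILED on failure.

```python
import time


def wait_for_batch_inference_job(client, job_arn,
                                 poll_seconds=60, max_polls=120,
                                 sleep=time.sleep):
    """Poll DescribeBatchInferenceJob until the job finishes.

    Returns the final status string. Raises TimeoutError if the job is
    still in progress after max_polls checks.
    """
    for _ in range(max_polls):
        response = client.describe_batch_inference_job(
            batchInferenceJobArn=job_arn
        )
        status = response["batchInferenceJob"]["status"]
        # Assumed terminal states: ACTIVE (success) or any *FAILED status.
        if status == "ACTIVE" or status.endswith("FAILED"):
            return status
        sleep(poll_seconds)
    raise TimeoutError(
        f"Job {job_arn} did not finish after {max_polls} status checks"
    )


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   personalize = boto3.client("personalize")
#   final_status = wait_for_batch_inference_job(personalize, job_arn)
```

The `sleep` parameter exists only so the wait can be replaced in tests; in normal use the default `time.sleep` is fine.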