创建批量推理作业 (AWS SDK) - Amazon Personalize

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

创建批量推理作业 (AWS SDK)

完成为批量建议准备输入数据后,就可以通过 CreateBatchInferenceJob 操作创建批量推理作业了。

创建批量推理作业

可以使用以下代码创建批量推理作业。指定作业名称、解决方案版本的 Amazon 资源名称 (ARN),以及您在设置期间为 Amazon Personalize 创建的 IAM 服务角色的 ARN。此角色必须对您的输入和输出 Amazon S3 存储桶具有读写权限。

我们建议使用不同的输出数据位置(文件夹或其他 Amazon S3 存储桶)。对输入和输出位置使用以下语法:s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.jsons3://<name of your S3 bucket>/<output folder name>/

对于 numResults,指定您希望 Amazon Personalize 为每行输入数据预测的物品数量。(可选)提供筛选器 ARN 来筛选建议。如果您的筛选器使用占位符参数,请确保这些参数的值包含在您的输入 JSON 中。有关更多信息,请参阅 筛选批量建议和用户细分(自定义资源)

SDK for Python (Boto3)

该示例包括可选的 User-Personalization 食谱特定的 itemExplorationConfig 超参数:explorationWeightexplorationItemAgeCutOff。(可选)包括 explorationWeightexplorationItemAgeCutOff 值以配置浏览。有关更多信息,请参阅User-Personalization 食谱

import boto3 personalize_rec = boto3.client(service_name='personalize') personalize_rec.create_batch_inference_job ( solutionVersionArn = "Solution version ARN", jobName = "Batch job name", roleArn = "IAM service role ARN", filterArn = "Filter ARN", batchInferenceJobConfig = { # optional USER_PERSONALIZATION recipe hyperparameters "itemExplorationConfig": { "explorationWeight": "0.3", "explorationItemAgeCutOff": "30" } }, jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}}, jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}} )
SDK for Java 2.x

该示例包括可选的 User-Personalization 食谱特定的 itemExplorationConfig 字段:explorationWeightexplorationItemAgeCutOff。(可选)包括 explorationWeightexplorationItemAgeCutOff 值以配置浏览。有关更多信息,请参阅User-Personalization 食谱

public static String createPersonalizeBatchInferenceJob(PersonalizeClient personalizeClient, String solutionVersionArn, String jobName, String filterArn, String s3InputDataSourcePath, String s3DataDestinationPath, String roleArn, String explorationWeight, String explorationItemAgeCutOff) { long waitInMilliseconds = 60 * 1000; String status; String batchInferenceJobArn; try { // Set up data input and output parameters. S3DataConfig inputSource = S3DataConfig.builder() .path(s3InputDataSourcePath) .build(); S3DataConfig outputDestination = S3DataConfig.builder() .path(s3DataDestinationPath) .build(); BatchInferenceJobInput jobInput = BatchInferenceJobInput.builder() .s3DataSource(inputSource) .build(); BatchInferenceJobOutput jobOutputLocation = BatchInferenceJobOutput.builder() .s3DataDestination(outputDestination) .build(); // Optional code to build the User-Personalization specific item exploration config. HashMap<String, String> explorationConfig = new HashMap<>(); explorationConfig.put("explorationWeight", explorationWeight); explorationConfig.put("explorationItemAgeCutOff", explorationItemAgeCutOff); BatchInferenceJobConfig jobConfig = BatchInferenceJobConfig.builder() .itemExplorationConfig(explorationConfig) .build(); // End optional User-Personalization recipe specific code. CreateBatchInferenceJobRequest createBatchInferenceJobRequest = CreateBatchInferenceJobRequest.builder() .solutionVersionArn(solutionVersionArn) .jobInput(jobInput) .jobOutput(jobOutputLocation) .jobName(jobName) .filterArn(filterArn) .roleArn(roleArn) .batchInferenceJobConfig(jobConfig) // Optional .build(); batchInferenceJobArn = personalizeClient.createBatchInferenceJob(createBatchInferenceJobRequest) .batchInferenceJobArn(); DescribeBatchInferenceJobRequest describeBatchInferenceJobRequest = DescribeBatchInferenceJobRequest.builder() .batchInferenceJobArn(batchInferenceJobArn) .build(); long maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60; // wait until the batch inference job is complete. while (Instant.now().getEpochSecond() < maxTime) { BatchInferenceJob batchInferenceJob = personalizeClient .describeBatchInferenceJob(describeBatchInferenceJobRequest) .batchInferenceJob(); status = batchInferenceJob.status(); System.out.println("Batch inference job status: " + status); if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) { break; } try { Thread.sleep(waitInMilliseconds); } catch (InterruptedException e) { System.out.println(e.getMessage()); } } return batchInferenceJobArn; } catch (PersonalizeException e) { System.out.println(e.awsErrorDetails().errorMessage()); } return ""; }
SDK for JavaScript v3
// Get service clients module and commands using ES6 syntax. import { CreateBatchInferenceJobCommand } from "@aws-sdk/client-personalize"; import { personalizeClient } from "./libs/personalizeClients.js"; // Or, create the client here. // const personalizeClient = new PersonalizeClient({ region: "REGION"}); // Set the batch inference job's parameters. export const createBatchInferenceJobParam = { jobName: 'JOB_NAME', jobInput: { /* required */ s3DataSource: { /* required */ path: 'INPUT_PATH', /* required */ // kmsKeyArn: 'INPUT_KMS_KEY_ARN' /* optional */' } }, jobOutput: { /* required */ s3DataDestination: { /* required */ path: 'OUTPUT_PATH', /* required */ // kmsKeyArn: 'OUTPUT_KMS_KEY_ARN' /* optional */' } }, roleArn: 'ROLE_ARN', /* required */ solutionVersionArn: 'SOLUTION_VERSION_ARN', /* required */ numResults: 20 /* optional integer*/ }; export const run = async () => { try { const response = await personalizeClient.send(new CreateBatchInferenceJobCommand(createBatchInferenceJobParam)); console.log("Success", response); return response; // For unit tests. } catch (err) { console.log("Error", err); } }; run();

处理批处理作业可能需要一段时间才能完成。您可以通过调用 DescribeBatchInferenceJob 和传递 batchRecommendationsJobArn 作为输入参数来检查作业的状态。您也可以通过调ListBatchInferenceJobs用列出您 AWS 环境中的所有 Amazon Personalize 批量推理作业。

创建生成主题的批量推理作业

要为相似的物品生成主题,您必须使用 Similar-Items 配方,并且您的物品数据集必须有一个文本字段和一个物品名称数据列。有关带有主题的建议的更多信息,请参阅内容生成器中带有主题的批量建议

以下代码创建了一个批量推理作业,该作业可生成带有主题的建议。将 batchInferenceJobMode 保留设置为 "THEME_GENERATION"。将 COLUMNN_NAME 替换为存储物品名称数据的列的名称。

import boto3 personalize_rec = boto3.client(service_name='personalize') personalize_rec.create_batch_inference_job ( solutionVersionArn = "Solution version ARN", jobName = "Batch job name", roleArn = "IAM service role ARN", filterArn = "Filter ARN", batchInferenceJobMode = "THEME_GENERATION", themeGenerationConfig = { "fieldsForThemeGeneration": { "itemName": "COLUMN_NAME" } }, jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}}, jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}} )