Creating a batch inference job (AWS SDKs)
When you have finished preparing your input data for batch recommendations, you are ready to create a batch inference job with the CreateBatchInferenceJob operation.
Creating a batch inference job
You can use the following code to create a batch inference job. Specify a job name, the Amazon Resource Name (ARN) of your solution version, and the ARN of the IAM service role that you created for Amazon Personalize during setup. This role must have read and write access to your input and output Amazon S3 buckets.
We recommend using a different location for your output data (either a different folder or a different Amazon S3 bucket). Use the following syntax for the input and output locations: s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json and s3://<name of your S3 bucket>/<output folder name>/.
For numResults, specify the number of items you want Amazon Personalize to predict for each line of input data. Optionally provide a filter ARN to filter recommendations. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Filtering batch recommendations and user segments (custom resources).
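As a sketch of what such input data can look like, the snippet below writes a JSON Lines input file for a user-based job whose filter takes a placeholder parameter. The file name, user IDs, and the GENRE parameter are hypothetical examples; string values for filter parameters are passed per line in filterValues as JSON-encoded strings.

```python
import json

# Each line of the batch input file is a standalone JSON object.
# For a filter with a placeholder parameter (e.g. $GENRE), the value is
# supplied per line in "filterValues"; string values are JSON-encoded.
input_rows = [
    {"userId": "105", "filterValues": {"GENRE": "\"Comedy\""}},
    {"userId": "106", "filterValues": {"GENRE": "\"Drama\""}},
]

with open("batch_input.json", "w") as f:
    for row in input_rows:
        f.write(json.dumps(row) + "\n")
```

Upload a file like this to the input S3 location before you create the job.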
- SDK for Python (Boto3)
-
The example includes the optional itemExplorationConfig hyperparameters specific to the User-Personalization recipe: explorationWeight and explorationItemAgeCutOff. Optionally include explorationWeight and explorationItemAgeCutOff values to configure exploration. For more information, see User-Personalization recipe.
import boto3

personalize_rec = boto3.client(service_name='personalize')

personalize_rec.create_batch_inference_job(
    solutionVersionArn = "Solution version ARN",
    jobName = "Batch job name",
    roleArn = "IAM service role ARN",
    filterArn = "Filter ARN",
    batchInferenceJobConfig = {
        # optional USER_PERSONALIZATION recipe hyperparameters
        "itemExplorationConfig": {
            "explorationWeight": "0.3",
            "explorationItemAgeCutOff": "30"
        }
    },
    jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}},
    jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}}
)
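Batch inference jobs run asynchronously, so in Python you typically poll the job until it reaches a terminal status, much like the Java example below does. The helper here is a minimal sketch (not part of the official example): the describe call is injected as a function so that the waiting logic itself can be exercised without AWS credentials.

```python
import time

def wait_for_batch_job(describe_fn, poll_seconds=60, max_seconds=3 * 60 * 60):
    """Poll describe_fn() until the job reaches a terminal status.

    describe_fn should return the job's current status string, for example by
    wrapping personalize.describe_batch_inference_job(batchInferenceJobArn=arn)
    and extracting response["batchInferenceJob"]["status"].
    """
    deadline = time.time() + max_seconds
    while time.time() < deadline:
        status = describe_fn()
        if status in ("ACTIVE", "CREATE FAILED"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("Batch inference job did not finish in time")
```

With boto3, describe_fn could be `lambda: personalize_rec.describe_batch_inference_job(batchInferenceJobArn=arn)["batchInferenceJob"]["status"]`.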
- SDK for Java 2.x
-
The example includes the optional itemExplorationConfig fields specific to the User-Personalization recipe: explorationWeight and explorationItemAgeCutOff. Optionally include explorationWeight and explorationItemAgeCutOff values to configure exploration. For more information, see User-Personalization recipe.
public static String createPersonalizeBatchInferenceJob(PersonalizeClient personalizeClient,
                                                        String solutionVersionArn,
                                                        String jobName,
                                                        String filterArn,
                                                        String s3InputDataSourcePath,
                                                        String s3DataDestinationPath,
                                                        String roleArn,
                                                        String explorationWeight,
                                                        String explorationItemAgeCutOff) {

    long waitInMilliseconds = 60 * 1000;
    String status;
    String batchInferenceJobArn;

    try {
        // Set up data input and output parameters.
        S3DataConfig inputSource = S3DataConfig.builder()
                .path(s3InputDataSourcePath)
                .build();
        S3DataConfig outputDestination = S3DataConfig.builder()
                .path(s3DataDestinationPath)
                .build();

        BatchInferenceJobInput jobInput = BatchInferenceJobInput.builder()
                .s3DataSource(inputSource)
                .build();
        BatchInferenceJobOutput jobOutputLocation = BatchInferenceJobOutput.builder()
                .s3DataDestination(outputDestination)
                .build();

        // Optional code to build the User-Personalization specific item exploration config.
        HashMap<String, String> explorationConfig = new HashMap<>();
        explorationConfig.put("explorationWeight", explorationWeight);
        explorationConfig.put("explorationItemAgeCutOff", explorationItemAgeCutOff);

        BatchInferenceJobConfig jobConfig = BatchInferenceJobConfig.builder()
                .itemExplorationConfig(explorationConfig)
                .build();
        // End optional User-Personalization recipe specific code.

        CreateBatchInferenceJobRequest createBatchInferenceJobRequest = CreateBatchInferenceJobRequest.builder()
                .solutionVersionArn(solutionVersionArn)
                .jobInput(jobInput)
                .jobOutput(jobOutputLocation)
                .jobName(jobName)
                .filterArn(filterArn)
                .roleArn(roleArn)
                .batchInferenceJobConfig(jobConfig) // Optional
                .build();

        batchInferenceJobArn = personalizeClient.createBatchInferenceJob(createBatchInferenceJobRequest)
                .batchInferenceJobArn();

        DescribeBatchInferenceJobRequest describeBatchInferenceJobRequest = DescribeBatchInferenceJobRequest.builder()
                .batchInferenceJobArn(batchInferenceJobArn)
                .build();

        long maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60;

        // Wait until the batch inference job is complete.
        while (Instant.now().getEpochSecond() < maxTime) {

            BatchInferenceJob batchInferenceJob = personalizeClient
                    .describeBatchInferenceJob(describeBatchInferenceJobRequest)
                    .batchInferenceJob();

            status = batchInferenceJob.status();
            System.out.println("Batch inference job status: " + status);

            if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) {
                break;
            }
            try {
                Thread.sleep(waitInMilliseconds);
            } catch (InterruptedException e) {
                System.out.println(e.getMessage());
            }
        }
        return batchInferenceJobArn;

    } catch (PersonalizeException e) {
        System.out.println(e.awsErrorDetails().errorMessage());
    }
    return "";
}
- SDK for JavaScript v3
// Get service clients module and commands using ES6 syntax.
import { CreateBatchInferenceJobCommand } from "@aws-sdk/client-personalize";
import { personalizeClient } from "./libs/personalizeClients.js";
// Or, create the client here.
// const personalizeClient = new PersonalizeClient({ region: "REGION"});

// Set the batch inference job's parameters.
export const createBatchInferenceJobParam = {
  jobName: 'JOB_NAME',
  jobInput: {           /* required */
    s3DataSource: {     /* required */
      path: 'INPUT_PATH', /* required */
      // kmsKeyArn: 'INPUT_KMS_KEY_ARN' /* optional */
    }
  },
  jobOutput: {          /* required */
    s3DataDestination: { /* required */
      path: 'OUTPUT_PATH', /* required */
      // kmsKeyArn: 'OUTPUT_KMS_KEY_ARN' /* optional */
    }
  },
  roleArn: 'ROLE_ARN', /* required */
  solutionVersionArn: 'SOLUTION_VERSION_ARN', /* required */
  numResults: 20 /* optional integer */
};

export const run = async () => {
  try {
    const response = await personalizeClient.send(
      new CreateBatchInferenceJobCommand(createBatchInferenceJobParam)
    );
    console.log("Success", response);
    return response; // For unit tests.
  } catch (err) {
    console.log("Error", err);
  }
};
run();
Processing the batch job might take a while to complete. You can check a job's status by calling DescribeBatchInferenceJob and passing the job's batchInferenceJobArn as an input parameter. You can also list all of the Amazon Personalize batch inference jobs in your AWS environment by calling ListBatchInferenceJobs.
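When a job reaches ACTIVE, its results appear as JSON Lines files in the output S3 location. As a sketch, assuming the standard batch recommendations output shape where each line carries input, output, and error fields (the sample line below is made up for illustration), the following collects the recommended item IDs per user:

```python
import json

def parse_batch_output(lines):
    """Map each input userId to its list of recommended item IDs.

    Assumes the standard batch recommendations output shape:
    {"input": {...}, "output": {"recommendedItems": [...], ...}, "error": ...}
    """
    results = {}
    for line in lines:
        record = json.loads(line)
        if record.get("error"):
            continue  # skip rows that failed
        user_id = record["input"]["userId"]
        results[user_id] = record["output"]["recommendedItems"]
    return results

# A made-up sample output line in that shape:
sample = ['{"input": {"userId": "105"}, '
          '"output": {"recommendedItems": ["i1", "i2"], "scores": [0.8, 0.7]}, '
          '"error": null}']
```

In practice you would download the output files from S3 (for example with boto3's S3 client) and feed their lines to a parser like this.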
Creating a batch inference job that generates themes
To generate themes for similar items, you must use the Similar-Items recipe, and your Items dataset must have a textual field and a column of item name data. For more information about recommendations with themes, see Batch recommendations with themes from Content Generator.
The following code creates a batch inference job that generates recommendations with themes. Leave batchInferenceJobMode set to "THEME_GENERATION". Replace COLUMN_NAME with the name of the column that stores your item name data.
import boto3

personalize_rec = boto3.client(service_name='personalize')

personalize_rec.create_batch_inference_job(
    solutionVersionArn = "Solution version ARN",
    jobName = "Batch job name",
    roleArn = "IAM service role ARN",
    filterArn = "Filter ARN",
    batchInferenceJobMode = "THEME_GENERATION",
    themeGenerationConfig = {
        "fieldsForThemeGeneration": {
            "itemName": "COLUMN_NAME"
        }
    },
    jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}},
    jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}}
)