Creating a batch inference job (AWS SDK) - Amazon Personalize

This is a machine-translated version of the English original. In case of any discrepancy or ambiguity, the English version takes precedence.

Creating a batch inference job (AWS SDK)

After you finish preparing the input data for batch recommendations, you can create a batch inference job with the CreateBatchInferenceJob operation.

Creating a batch inference job

You can create a batch inference job with the following code. Specify a job name, the Amazon Resource Name (ARN) of your solution version, and the ARN of the IAM service role that you created for Amazon Personalize during setup. This role must have read and write access to your input and output Amazon S3 buckets.

We recommend using a different location for your output data (either a folder or a different Amazon S3 bucket). Use the following syntax for the input and output locations:

Input: s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json
Output: s3://<name of your S3 bucket>/<output folder name>/

For numResults, specify the number of items you want Amazon Personalize to predict for each line of input data. Optionally provide a filter ARN to filter recommendations. If your filter uses placeholder parameters, make sure the values for the parameters are included in your input JSON. For more information, see Filtering batch recommendations and user segments (custom resources).
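As a sketch of the point about placeholder parameters, the helper below builds batch input lines, attaching per-line filter values under a "filterValues" key. The parameter name GENRE, the quoting style of its value, and the make_input_line helper itself are illustrative assumptions; check your filter expression for the exact parameter names it expects.

```python
import json

# Each line of the batch input JSON file is one independent JSON object.
# When a filter uses a placeholder parameter (e.g. $GENRE), the value is
# supplied per line under the "filterValues" key. GENRE and its quoted
# value below are illustrative, not taken from a real filter.
def make_input_line(user_id, filter_values=None):
    record = {"userId": str(user_id)}
    if filter_values:
        record["filterValues"] = filter_values
    return json.dumps(record)

lines = [
    make_input_line("105"),
    make_input_line("106", {"GENRE": "\"comedy\""}),
]
print("\n".join(lines))
```

Each string in `lines` becomes one line of the input JSON file uploaded to your input S3 location.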

SDK for Python (Boto3)

This example includes the optional itemExplorationConfig hyperparameters specific to the User-Personalization recipe: explorationWeight and explorationItemAgeCutOff. Optionally include these values to configure exploration. For more information, see User-Personalization recipe.

import boto3

personalize_rec = boto3.client(service_name='personalize')

personalize_rec.create_batch_inference_job(
    solutionVersionArn = "Solution version ARN",
    jobName = "Batch job name",
    roleArn = "IAM service role ARN",
    filterArn = "Filter ARN",
    batchInferenceJobConfig = {
        # optional USER_PERSONALIZATION recipe hyperparameters
        "itemExplorationConfig": {
            "explorationWeight": "0.3",
            "explorationItemAgeCutOff": "30"
        }
    },
    jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}},
    jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}}
)
SDK for Java 2.x

This example includes the optional itemExplorationConfig fields specific to the User-Personalization recipe: explorationWeight and explorationItemAgeCutOff. Optionally include these values to configure exploration. For more information, see User-Personalization recipe.

public static String createPersonalizeBatchInferenceJob(PersonalizeClient personalizeClient,
                                                        String solutionVersionArn,
                                                        String jobName,
                                                        String filterArn,
                                                        String s3InputDataSourcePath,
                                                        String s3DataDestinationPath,
                                                        String roleArn,
                                                        String explorationWeight,
                                                        String explorationItemAgeCutOff) {

    long waitInMilliseconds = 60 * 1000;
    String status;
    String batchInferenceJobArn;

    try {
        // Set up data input and output parameters.
        S3DataConfig inputSource = S3DataConfig.builder()
            .path(s3InputDataSourcePath)
            .build();

        S3DataConfig outputDestination = S3DataConfig.builder()
            .path(s3DataDestinationPath)
            .build();

        BatchInferenceJobInput jobInput = BatchInferenceJobInput.builder()
            .s3DataSource(inputSource)
            .build();

        BatchInferenceJobOutput jobOutputLocation = BatchInferenceJobOutput.builder()
            .s3DataDestination(outputDestination)
            .build();

        // Optional code to build the User-Personalization specific item exploration config.
        HashMap<String, String> explorationConfig = new HashMap<>();
        explorationConfig.put("explorationWeight", explorationWeight);
        explorationConfig.put("explorationItemAgeCutOff", explorationItemAgeCutOff);

        BatchInferenceJobConfig jobConfig = BatchInferenceJobConfig.builder()
            .itemExplorationConfig(explorationConfig)
            .build();
        // End optional User-Personalization recipe specific code.

        CreateBatchInferenceJobRequest createBatchInferenceJobRequest = CreateBatchInferenceJobRequest.builder()
            .solutionVersionArn(solutionVersionArn)
            .jobInput(jobInput)
            .jobOutput(jobOutputLocation)
            .jobName(jobName)
            .filterArn(filterArn)
            .roleArn(roleArn)
            .batchInferenceJobConfig(jobConfig) // Optional
            .build();

        batchInferenceJobArn = personalizeClient.createBatchInferenceJob(createBatchInferenceJobRequest)
            .batchInferenceJobArn();

        DescribeBatchInferenceJobRequest describeBatchInferenceJobRequest = DescribeBatchInferenceJobRequest.builder()
            .batchInferenceJobArn(batchInferenceJobArn)
            .build();

        long maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60;

        // Wait until the batch inference job is complete.
        while (Instant.now().getEpochSecond() < maxTime) {
            BatchInferenceJob batchInferenceJob = personalizeClient
                .describeBatchInferenceJob(describeBatchInferenceJobRequest)
                .batchInferenceJob();
            status = batchInferenceJob.status();
            System.out.println("Batch inference job status: " + status);

            if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) {
                break;
            }
            try {
                Thread.sleep(waitInMilliseconds);
            } catch (InterruptedException e) {
                System.out.println(e.getMessage());
            }
        }
        return batchInferenceJobArn;
    } catch (PersonalizeException e) {
        System.out.println(e.awsErrorDetails().errorMessage());
    }
    return "";
}
SDK for JavaScript v3
// Get service clients module and commands using ES6 syntax.
import { CreateBatchInferenceJobCommand } from "@aws-sdk/client-personalize";
import { personalizeClient } from "./libs/personalizeClients.js";
// Or, create the client here.
// const personalizeClient = new PersonalizeClient({ region: "REGION"});

// Set the batch inference job's parameters.
export const createBatchInferenceJobParam = {
  jobName: 'JOB_NAME',
  jobInput: {            /* required */
    s3DataSource: {      /* required */
      path: 'INPUT_PATH' /* required */
      // kmsKeyArn: 'INPUT_KMS_KEY_ARN' /* optional */
    }
  },
  jobOutput: {              /* required */
    s3DataDestination: {    /* required */
      path: 'OUTPUT_PATH'   /* required */
      // kmsKeyArn: 'OUTPUT_KMS_KEY_ARN' /* optional */
    }
  },
  roleArn: 'ROLE_ARN', /* required */
  solutionVersionArn: 'SOLUTION_VERSION_ARN', /* required */
  numResults: 20 /* optional integer */
};

export const run = async () => {
  try {
    const response = await personalizeClient.send(
      new CreateBatchInferenceJobCommand(createBatchInferenceJobParam)
    );
    console.log("Success", response);
    return response; // For unit tests.
  } catch (err) {
    console.log("Error", err);
  }
};
run();

Processing the batch job can take a while to complete. You can check a job's status by calling DescribeBatchInferenceJob and passing a batchInferenceJobArn as the input parameter. You can also list all of the Amazon Personalize batch inference jobs in your AWS environment by calling ListBatchInferenceJobs.
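The status-polling loop can be sketched as a small helper, shown here decoupled from boto3 so the retry logic is clear. The wait_for_batch_job name and the injected fetch_status callable are illustrative; with a real client, fetch_status would wrap a call to personalize_rec.describe_batch_inference_job and extract the job's status string. The terminal statuses "ACTIVE" and "CREATE FAILED" match the ones the Java example above checks for.

```python
import time

# Terminal states for a batch inference job (as checked in the Java example).
TERMINAL_STATUSES = {"ACTIVE", "CREATE FAILED"}

def wait_for_batch_job(fetch_status, poll_seconds=60, max_polls=180):
    """Poll fetch_status() until the job reaches a terminal state.

    fetch_status is any zero-argument callable returning the job's status
    string. With boto3 it could be wired up (hypothetically) as:
        lambda: personalize_rec.describe_batch_inference_job(
            batchInferenceJobArn=job_arn)["batchInferenceJob"]["status"]
    Returns the final status, or None if max_polls is exhausted.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    return None

# Example with a stubbed status sequence (no AWS calls made).
statuses = iter(["CREATE PENDING", "CREATE IN PROGRESS", "ACTIVE"])
print(wait_for_batch_job(lambda: next(statuses), poll_seconds=0))  # ACTIVE
```

Injecting the status fetcher keeps the loop testable without AWS credentials and makes the poll interval and timeout explicit.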

Creating a batch inference job that generates themes

To generate themes for similar items, you must use the Similar-Items recipe, and your Items dataset must have a textual field and an item name column. For more information about recommendations with themes, see Batch recommendations with themes from Content Generator.

The following code creates a batch inference job that generates recommendations with themes. Leave the batchInferenceJobMode set to "THEME_GENERATION". Replace COLUMN_NAME with the name of the column that stores your item name data.

import boto3

personalize_rec = boto3.client(service_name='personalize')

personalize_rec.create_batch_inference_job(
    solutionVersionArn = "Solution version ARN",
    jobName = "Batch job name",
    roleArn = "IAM service role ARN",
    filterArn = "Filter ARN",
    batchInferenceJobMode = "THEME_GENERATION",
    themeGenerationConfig = {
        "fieldsForThemeGeneration": {
            "itemName": "COLUMN_NAME"
        }
    },
    jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}},
    jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}}
)
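Once the job completes, each line of the output file in your output S3 location is an independent JSON object. The parser below is a sketch: the "input", "output", and "recommendedItems" keys follow the general batch output shape, while the "theme" key for theme-generation jobs is an assumption here, so the code reads it tolerantly with .get(). The sample line is fabricated for illustration, not real job output.

```python
import json

def parse_batch_output(lines):
    """Parse batch inference output lines into (input, items, theme) tuples.

    The "theme" key is an assumption for theme-generation jobs; .get() is
    used so the parser also works on plain batch inference output where
    no theme is present.
    """
    results = []
    for line in lines:
        record = json.loads(line)
        output = record.get("output", {})
        results.append((
            record.get("input"),
            output.get("recommendedItems", []),
            output.get("theme"),
        ))
    return results

# Hypothetical sample line for illustration (not real job output).
sample = ['{"input": {"itemId": "105"}, "output": {"recommendedItems": ["106", "107"], "theme": "Cozy mysteries"}, "error": null}']
print(parse_batch_output(sample))
```

Reading the file line by line (rather than as one JSON document) matches the one-object-per-line format of the input file shown earlier.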