Preparing input data for batch recommendations - Amazon Personalize

Preparing input data for batch recommendations

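A batch inference job imports your batch input JSON data from an Amazon S3 bucket, uses your custom solution version to generate recommendations, and then exports the item recommendations to an Amazon S3 bucket. Before you can get batch recommendations, you must prepare your JSON file and upload it to an Amazon S3 bucket. We recommend that you create an output folder in your Amazon S3 bucket or use a separate Amazon S3 bucket for output. You can then run multiple batch inference jobs against the same input data location.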

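If you use a filter with placeholder parameters, such as $GENRE, you must provide the values for those parameters in a filterValues object in your input JSON. For more information, see Providing filter values in your input JSON.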
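
As a rough sketch of the shape this takes, the following Python snippet builds one input line that carries a filterValues object. The helper name `build_input_line`, the key `GENRE` (the placeholder name without the leading `$`), and the quoted-string encoding of the value are assumptions for illustration only; the topic referenced above is the authority on the exact value format.

```python
import json


def build_input_line(user_id, filter_values=None):
    """Return one JSON line for a batch input file (hypothetical helper)."""
    record = {"userId": str(user_id)}
    if filter_values:
        # Assumption: keys are the placeholder names without the leading "$".
        record["filterValues"] = dict(filter_values)
    return json.dumps(record)


# Assumption: the value is passed through as an already-encoded string.
line = build_input_line("4638", {"GENRE": "\"drama\""})
print(line)
```

Keeping the value as an opaque string means the helper does not need to change if the encoding rules for filter values differ from what is assumed here.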

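To prepare and import data
  1. Format your batch input data according to your recipe. You can't get batch recommendations with the Trending-Now recipe.

    • For USER_PERSONALIZATION recipes and the Popularity-Count recipe, your input data is a JSON file that contains a list of userIds

    • For RELATED_ITEMS recipes, your input data is a list of itemIds

    • For PERSONALIZED_RANKING recipes, your input data is a list of userIds, each paired with a collection of itemIds

    Put each record on its own line. For examples of input data, see Batch inference job input and output JSON examples.

  2. Upload your input JSON to an input folder in your Amazon S3 bucket. For more information, see Uploading files and folders by using drag and drop in the Amazon Simple Storage Service User Guide.

  3. Create a separate location for your output data, either a folder or a different Amazon S3 bucket. With a separate location for the output JSON, you can run multiple batch inference jobs against the same input data location.

  4. Create a batch inference job. Amazon Personalize writes the recommendations from your solution version to your output data location.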
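
The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the helper names, the job name, and the bracketed ARN and bucket placeholders are all hypothetical. The `submit` function relies on the boto3 calls `upload_file` and `create_batch_inference_job`, and it needs the third-party boto3 package plus suitable credentials, so it is defined here but not called.

```python
import json


def user_records(user_ids):
    """Records for USER_PERSONALIZATION and Popularity-Count input."""
    return [{"userId": str(u)} for u in user_ids]


def item_records(item_ids):
    """Records for RELATED_ITEMS input."""
    return [{"itemId": str(i)} for i in item_ids]


def ranking_records(items_by_user):
    """Records for PERSONALIZED_RANKING input: each user with items to rank."""
    return [
        {"userId": str(u), "itemList": [str(i) for i in items]}
        for u, items in items_by_user.items()
    ]


def to_json_lines(records):
    """Serialize records as one JSON object per line (step 1)."""
    return "".join(json.dumps(r) + "\n" for r in records)


def build_job_request(job_name, solution_version_arn, role_arn,
                      input_path, output_path):
    """Build the request with separate input and output locations (step 3)."""
    return {
        "jobName": job_name,
        "solutionVersionArn": solution_version_arn,
        "roleArn": role_arn,
        "jobInput": {"s3DataSource": {"path": input_path}},
        "jobOutput": {"s3DataDestination": {"path": output_path}},
    }


def submit(local_path, bucket, key, request):
    """Upload the input file (step 2) and create the job (step 4)."""
    import boto3  # third-party AWS SDK; requires credentials and permissions
    boto3.client("s3").upload_file(local_path, bucket, key)
    return boto3.client("personalize").create_batch_inference_job(**request)


print(to_json_lines(user_records(["4638", "663", "3384"])), end="")
request = build_job_request(
    job_name="example-batch-job",                  # hypothetical name
    solution_version_arn="<solution-version-arn>",  # placeholder
    role_arn="<iam-role-arn>",                      # placeholder
    input_path="s3://<bucket>/input/users.json",    # placeholder
    output_path="s3://<bucket>/output/",            # placeholder
)
```

Because the output path differs from the input path, the same input file can back several jobs, each writing to its own output location.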

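Batch inference job input and output JSON examples

How you format your input data depends on the recipe you use. If you use a filter with placeholder parameters, such as $GENRE, you must provide the values for those parameters in a filterValues object in your input JSON. For more information, see Providing filter values in your input JSON.

The following sections list correctly formatted JSON input and output examples for batch inference jobs. You can't get batch recommendations with the Trending-Now recipe.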

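USER_PERSONALIZATION recipes

The following shows correctly formatted JSON input and output examples for USER_PERSONALIZATION recipes. If you use User-Personalization-v2, each recommended item includes a list of reasons why the item was included in the recommendations. This list can be empty. For information about the possible reasons, see Recommendation reasons (User-Personalization-v2).

Input

Put each userId on its own line, as follows.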

{"userId": "4638"}
{"userId": "663"}
{"userId": "3384"}
...
Output
{"input":{"userId":"4638"},"output":{"recommendedItems":["63992","115149","110102","148626","148888","31685","102445","69526","92535","143355","62374","7451","56171","122882","66097","91542","142488","139385","40583","71530","39292","111360","34048","47099","135137"],"scores":[0.0152238,0.0069081,0.0068222,0.006394,0.0059746,0.0055851,0.0049357,0.0044644,0.0042968,0.004015,0.0038805,0.0037476,0.0036563,0.0036178,0.00341,0.0033467,0.0033258,0.0032454,0.0032076,0.0031996,0.0029558,0.0029021,0.0029007,0.0028837,0.0028316]},"error":null}
{"input":{"userId":"663"},"output":{"recommendedItems":["368","377","25","780","1610","648","1270","6","165","1196","1097","300","1183","608","104","474","736","293","141","2987","1265","2716","223","733","2028"],"scores":[0.0406197,0.0372557,0.0254077,0.0151975,0.014991,0.0127175,0.0124547,0.0116712,0.0091098,0.0085492,0.0079035,0.0078995,0.0075598,0.0074876,0.0072006,0.0071775,0.0068923,0.0066552,0.0066232,0.0062504,0.0062386,0.0061121,0.0060942,0.0060781,0.0059263]},"error":null}
{"input":{"userId":"3384"},"output":{"recommendedItems":["597","21","223","2144","208","2424","594","595","920","104","520","367","2081","39","1035","2054","160","1370","48","1092","158","2671","500","474","1907"],"scores":[0.0241061,0.0119394,0.0118012,0.010662,0.0086972,0.0079428,0.0073218,0.0071438,0.0069602,0.0056961,0.0055999,0.005577,0.0054387,0.0051787,0.0051412,0.0050493,0.0047126,0.0045393,0.0042159,0.0042098,0.004205,0.0042029,0.0040778,0.0038897,0.0038809]},"error":null}
...
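
To consume output of this shape, one option is to pair each recommended item with its score. The sketch below is illustrative only: `parse_output_line` is a hypothetical helper, the sample line is abridged from the first record above to two items, and the `or {}` guard assumes that a failed row may carry a null output, which the example does not confirm.

```python
import json


def parse_output_line(line):
    """Return (userId, [(itemId, score), ...], error) for one output line."""
    record = json.loads(line)
    # Assumption: "output" may be null when "error" is set.
    output = record.get("output") or {}
    items = output.get("recommendedItems", [])
    scores = output.get("scores", [])
    return record["input"]["userId"], list(zip(items, scores)), record.get("error")


# Abridged sample: only the first two items and scores of the first record.
sample = (
    '{"input":{"userId":"4638"},"output":{"recommendedItems":'
    '["63992","115149"],"scores":[0.0152238,0.0069081]},"error":null}'
)
user_id, ranked, error = parse_output_line(sample)
print(user_id, ranked, error)
```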

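The following shows correctly formatted JSON input and output examples for the Popularity-Count recipe. You can't get batch recommendations with the Trending-Now recipe.

Input

Put each userId on its own line, as follows.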

{"userId": "12"}
{"userId": "105"}
{"userId": "41"}
...
Output
{"input": {"userId": "12"}, "output": {"recommendedItems": ["105", "106", "441"]}}
{"input": {"userId": "105"}, "output": {"recommendedItems": ["105", "106", "441"]}}
{"input": {"userId": "41"}, "output": {"recommendedItems": ["105", "106", "441"]}}
...

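PERSONALIZED_RANKING recipes

The following shows correctly formatted JSON input and output examples for PERSONALIZED_RANKING recipes.

Input

Put each userId, together with the list of itemIds to rank, on its own line, as follows.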

{"userId": "891", "itemList": ["27", "886", "101"]}
{"userId": "445", "itemList": ["527", "55", "901"]}
{"userId": "71", "itemList": ["29", "351", "199"]}
...
Output
{"input":{"userId":"891","itemList":["27","886","101"]},"output":{"recommendedItems":["27","101","886"],"scores":[0.48421,0.28133,0.23446]}}
{"input":{"userId":"445","itemList":["527","55","901"]},"output":{"recommendedItems":["901","527","55"],"scores":[0.46972,0.31011,0.22017]}}
{"input":{"userId":"71","itemList":["29","351","199"]},"output":{"recommendedItems":["351","29","199"],"scores":[0.68937,0.24829,0.06232]}}
...
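
In the examples above, each output row is a reordering of the submitted itemList with scores in descending order. A small sanity check along those lines might look like the sketch below. `check_ranking` is a hypothetical helper, and the check encodes what the examples show rather than a documented guarantee; for instance, a filter could plausibly remove items from the result.

```python
import json


def check_ranking(line):
    """Return True if the row reorders its input items with descending scores."""
    record = json.loads(line)
    requested = record["input"]["itemList"]
    ranked = record["output"]["recommendedItems"]
    scores = record["output"]["scores"]
    same_items = sorted(requested) == sorted(ranked)
    descending = all(a >= b for a, b in zip(scores, scores[1:]))
    return same_items and descending


# First output record from the example above.
row = (
    '{"input":{"userId":"891","itemList":["27","886","101"]},'
    '"output":{"recommendedItems":["27","101","886"],'
    '"scores":[0.48421,0.28133,0.23446]}}'
)
print(check_ranking(row))
```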

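RELATED_ITEMS recipes

The following shows correctly formatted JSON input and output examples for RELATED_ITEMS recipes.

Input

Put each itemId on its own line, as follows.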

{"itemId": "105"}
{"itemId": "106"}
{"itemId": "441"}
...
Output
{"input": {"itemId": "105"}, "output": {"recommendedItems": ["106", "107", "49"]}}
{"input": {"itemId": "106"}, "output": {"recommendedItems": ["105", "107", "49"]}}
{"input": {"itemId": "441"}, "output": {"recommendedItems": ["2", "442", "435"]}}
...
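
Output like this lends itself to a lookup table keyed by the source item. The following is a minimal sketch under that assumption; `related_lookup` is a hypothetical helper, and the rows are the three example records shown above.

```python
import json


def related_lookup(lines):
    """Map each input itemId to its list of recommended itemIds."""
    lookup = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        lookup[record["input"]["itemId"]] = record["output"]["recommendedItems"]
    return lookup


rows = [
    '{"input": {"itemId": "105"}, "output": {"recommendedItems": ["106", "107", "49"]}}',
    '{"input": {"itemId": "106"}, "output": {"recommendedItems": ["105", "107", "49"]}}',
    '{"input": {"itemId": "441"}, "output": {"recommendedItems": ["2", "442", "435"]}}',
]
lookup = related_lookup(rows)
print(lookup["441"])
```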

The following shows correctly formatted JSON input and output examples for the Similar-Items recipe with themes.

Input

Put each itemId on its own line, as follows.

{"itemId": "40"}
{"itemId": "43"}
...
Output
{"input":{"itemId":"40"},"output":{"recommendedItems":["36","50","44","22","21","29","3","1","2","39"],"theme":"Movies with a strong female lead","itemsThemeRelevanceScores":[0.19994527,0.183059963,0.17478035,0.1618133,0.1574806,0.15468733,0.1499242,0.14353688,0.13531424,0.10291852]}}
{"input":{"itemId":"43"},"output":{"recommendedItems":["50","21","36","3","17","2","39","1","10","5"],"theme":"The best movies of 1995","itemsThemeRelevanceScores":[0.184988,0.1795761,0.11143453,0.0989443,0.08258403,0.07952615,0.07115086,0.0621634,-0.138913,-0.188913]}}
...
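
Because the relevance scores in the second record drop below zero, a consumer may want to keep only items above some cutoff. The sketch below illustrates one way to do that. The helper names and the cutoff of 0.0 are hypothetical choices, not documented guidance, and the sample row is abridged from the second record above to four items with their original item and score pairing preserved.

```python
import json


def parse_theme_line(line):
    """Return (itemId, theme, [(recommendedItemId, relevance), ...])."""
    record = json.loads(line)
    out = record["output"]
    pairs = list(zip(out["recommendedItems"], out["itemsThemeRelevanceScores"]))
    return record["input"]["itemId"], out["theme"], pairs


def relevant_items(pairs, cutoff=0.0):
    """Keep item IDs whose relevance exceeds the (hypothetical) cutoff."""
    return [item for item, score in pairs if score > cutoff]


# Abridged sample: first two and last two items of the second record.
row = (
    '{"input":{"itemId":"43"},"output":{"recommendedItems":["50","21","10","5"],'
    '"theme":"The best movies of 1995",'
    '"itemsThemeRelevanceScores":[0.184988,0.1795761,-0.138913,-0.188913]}}'
)
item_id, theme, pairs = parse_theme_line(row)
print(item_id, theme, relevant_items(pairs))
```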