Detecting labels in a video
Amazon Rekognition Video can detect labels (objects and concepts) in a video, along with the times that a label is detected. For SDK code examples, see Analyzing a video stored in an Amazon S3 bucket with Java or Python (SDK). For an AWS CLI example, see Analyzing a video with the AWS Command Line Interface.
Amazon Rekognition Video label detection is an asynchronous operation. To start the detection of labels in a video, call StartLabelDetection.
Amazon Rekognition Video publishes the completion status of the video analysis to an Amazon Simple Notification Service topic. If the video analysis succeeds, call GetLabelDetection to get the detected labels. For information about calling the video analysis API operations, see Calling Amazon Rekognition Video operations.
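For orientation, here is a minimal sketch of this asynchronous flow using the AWS SDK for Python (Boto3). It polls GetLabelDetection for the job status rather than subscribing to the Amazon SNS topic, and the bucket and video names are placeholders.

import time

import boto3

rekognition = boto3.client("rekognition")

# Start the asynchronous label detection job. The bucket and object
# name are placeholders for a video you have stored in Amazon S3.
job = rekognition.start_label_detection(
    Video={"S3Object": {"Bucket": "bucket", "Name": "video.mp4"}},
    MinConfidence=75,
    Features=["GENERAL_LABELS"],
)

# Poll for completion. In production, use the SNS notification channel
# instead of polling.
while True:
    result = rekognition.get_label_detection(JobId=job["JobId"])
    if result["JobStatus"] != "IN_PROGRESS":
        break
    time.sleep(5)

for detection in result["Labels"]:
    print(detection["Timestamp"], detection["Label"]["Name"])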
StartLabelDetection request
The following example is a request for the StartLabelDetection operation. You provide the StartLabelDetection operation with a video stored in an Amazon S3 bucket. The example request JSON specifies the Amazon S3 bucket and video name, along with MinConfidence, Features, Settings, and NotificationChannel.
MinConfidence is the minimum confidence that Amazon Rekognition Video must have in the accuracy of the detected label, or in the instance bounding box (if detected), for it to be returned in the response.
With Features, you specify that you want GENERAL_LABELS returned as part of the response.
With Settings, you can filter the items that GENERAL_LABELS returns. For labels, you can use inclusion and exclusion filters, applied to individual labels or to label categories:
- LabelInclusionFilters – Used to specify which labels to include in the response.
- LabelExclusionFilters – Used to specify which labels to exclude from the response.
- LabelCategoryInclusionFilters – Used to specify which label categories to include in the response.
- LabelCategoryExclusionFilters – Used to specify which label categories to exclude from the response.
You can also combine inclusion and exclusion filters as needed, excluding some labels or categories while including others.
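As an illustration of combining filters, the following hypothetical Settings value returns every label in the "Animals and Pets" category except the individual label "Tiger":

# Hypothetical Settings value: include a whole category while excluding
# one individual label from it.
settings = {
    "GeneralLabels": {
        "LabelCategoryInclusionFilters": ["Animals and Pets"],
        "LabelExclusionFilters": ["Tiger"],
    }
}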
NotificationChannel is the ARN of the Amazon SNS topic to which you want Amazon Rekognition Video to publish the completion status of the label detection operation. If you are using the AmazonRekognitionServiceRole permissions policy, the name of the Amazon SNS topic must begin with Rekognition.
The following is an example StartLabelDetection request in JSON format, including filters:
{ "ClientRequestToken": "5a6e690e-c750-460a-9d59-c992e0ec8638", "JobTag": "5a6e690e-c750-460a-9d59-c992e0ec8638", "Video": { "S3Object": { "Bucket": "bucket", "Name": "video.mp4" } }, "Features": ["GENERAL_LABELS"], "MinConfidence": 75, "Settings": { "GeneralLabels": { "LabelInclusionFilters": ["Cat", "Dog"], "LabelExclusionFilters": ["Tiger"], "LabelCategoryInclusionFilters": ["Animals and Pets"], "LabelCategoryExclusionFilters": ["Popular Landmark"] } }, "NotificationChannel": { "RoleArn": "arn:aws:iam::012345678910:role/SNSAccessRole", "SNSTopicArn": "arn:aws:sns:us-east-1:012345678910:notification-topic", } }
GetLabelDetection operation response
GetLabelDetection returns an array (Labels) that contains information about the labels detected in the video. The array can be sorted by time or by detected label, depending on what you specify in the SortBy parameter. You can also select how response items are aggregated by using the AggregateBy parameter.
The following example is the JSON response of GetLabelDetection. In the response, note the following:
- Sort order – The array of labels returned is sorted by time. To sort by label, specify NAME in the SortBy input parameter for GetLabelDetection. If the label appears multiple times in the video, there are multiple instances of the LabelDetection element. The default sort order is TIMESTAMP, and the secondary sort order is NAME.
- Label information – The LabelDetection array element contains a Label object, which in turn contains the label name and the confidence Amazon Rekognition has in the accuracy of the detected label. A Label object also includes the hierarchical taxonomy for the label and bounding box information for common labels. Timestamp is the time, in milliseconds from the start of the video, at which the label was detected. Information about any categories or aliases associated with a label is also returned. For results aggregated by video SEGMENTS, the StartTimestampMillis, EndTimestampMillis, and DurationMillis structures are returned, which define the start time, end time, and duration of a segment, respectively.
- Aggregation – Specifies how results are aggregated when returned. The default is to aggregate by TIMESTAMPS. You can also choose to aggregate by SEGMENTS, which aggregates results over a time window. When aggregating by SEGMENTS, information about detected instances with bounding boxes is not returned; only the labels detected during the segments are returned.
- Pagination information – The example shows one page of label detection information. You can specify how many LabelDetection objects to return in the MaxResults input parameter for GetLabelDetection. If more results than MaxResults exist, GetLabelDetection returns a token (NextToken) that you use to get the next page of results (see the pagination sketch after this list). For more information, see Getting Amazon Rekognition Video analysis results.
- Video information – The response includes information about the video format (VideoMetadata) in each page of information returned by GetLabelDetection.
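As a sketch of the pagination described above, the following loop collects every LabelDetection object across pages; job_id is assumed to come from a completed StartLabelDetection job.

import boto3

rekognition = boto3.client("rekognition")

def get_all_labels(job_id, max_results=1000):
    """Collect all LabelDetection objects across pages of GetLabelDetection."""
    labels = []
    next_token = None
    while True:
        kwargs = {
            "JobId": job_id,
            "MaxResults": max_results,
            "SortBy": "TIMESTAMP",        # or "NAME" to sort by label
            "AggregateBy": "TIMESTAMPS",  # or "SEGMENTS"
        }
        if next_token:
            kwargs["NextToken"] = next_token
        response = rekognition.get_label_detection(**kwargs)
        labels.extend(response["Labels"])
        next_token = response.get("NextToken")
        if not next_token:
            return labels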
The following is an example GetLabelDetection response in JSON format, with aggregation by TIMESTAMPS:
{ "JobStatus": "SUCCEEDED", "LabelModelVersion": "3.0", "Labels": [ { "Timestamp": 1000, "Label": { "Name": "Car", "Categories": [ { "Name": "Vehicles and Automotive" } ], "Aliases": [ { "Name": "Automobile" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875, // Classification confidence "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 // Detection confidence } ] } }, { "Timestamp": 1000, "Label": { "Name": "Cup", "Categories": [ { "Name": "Kitchen and Dining" } ], "Aliases": [ { "Name": "Mug" } ], "Parents": [], "Confidence": 99.9364013671875, // Classification confidence "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 // Detection confidence } ] } }, { "Timestamp": 2000, "Label": { "Name": "Kangaroo", "Categories": [ { "Name": "Animals and Pets" } ], "Aliases": [ { "Name": "Wallaby" } ], "Parents": [ { "Name": "Mammal" } ], "Confidence": 99.9364013671875, "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567, }, "Confidence": 99.9364013671875 } ] } }, { "Timestamp": 4000, "Label": { "Name": "Bicycle", "Categories": [ { "Name": "Hobbies and Interests" } ], "Aliases": [ { "Name": "Bike" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875, "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 } ] } } ], "VideoMetadata": { "ColorRange": "FULL", "DurationMillis": 5000, "Format": "MP4", "FrameWidth": 1280, "FrameHeight": 720, "FrameRate": 24 } }
The following is an example GetLabelDetection response in JSON format, with aggregation by SEGMENTS:
{ "JobStatus": "SUCCEEDED", "LabelModelVersion": "3.0", "Labels": [ { "StartTimestampMillis": 225, "EndTimestampMillis": 3578, "DurationMillis": 3353, "Label": { "Name": "Car", "Categories": [ { "Name": "Vehicles and Automotive" } ], "Aliases": [ { "Name": "Automobile" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875 // Maximum confidence score for Segment mode } }, { "StartTimestampMillis": 7578, "EndTimestampMillis": 12371, "DurationMillis": 4793, "Label": { "Name": "Kangaroo", "Categories": [ { "Name": "Animals and Pets" } ], "Aliases": [ { "Name": "Wallaby" } ], "Parents": [ { "Name": "Mammal" } ], "Confidence": 99.9364013671875 } }, { "StartTimestampMillis": 22225, "EndTimestampMillis": 22578, "DurationMillis": 2353, "Label": { "Name": "Bicycle", "Categories": [ { "Name": "Hobbies and Interests" } ], "Aliases": [ { "Name": "Bike" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875 } } ], "VideoMetadata": { "ColorRange": "FULL", "DurationMillis": 5000, "Format": "MP4", "FrameWidth": 1280, "FrameHeight": 720, "FrameRate": 24 } }
Transforming the GetLabelDetection response
When you retrieve results with the GetLabelDetection API operation, you might need the response structure to mimic the older API response structure, in which both primary labels and aliases were contained in the same list.
The example JSON response in the previous section shows the current form of the API response from GetLabelDetection.
The following example shows the previous response from the GetLabelDetection API:
{ "Labels": [ { "Timestamp": 0, "Label": { "Instances": [], "Confidence": 60.51791763305664, "Parents": [], "Name": "Leaf" } }, { "Timestamp": 0, "Label": { "Instances": [], "Confidence": 99.53411102294922, "Parents": [], "Name": "Human" } }, { "Timestamp": 0, "Label": { "Instances": [ { "BoundingBox": { "Width": 0.11109819263219833, "Top": 0.08098889887332916, "Left": 0.8881205320358276, "Height": 0.9073750972747803 }, "Confidence": 99.5831298828125 }, { "BoundingBox": { "Width": 0.1268676072359085, "Top": 0.14018426835536957, "Left": 0.0003282368124928324, "Height": 0.7993982434272766 }, "Confidence": 99.46029663085938 } ], "Confidence": 99.63411102294922, "Parents": [], "Name": "Person" } }, . . . { "Timestamp": 166, "Label": { "Instances": [], "Confidence": 73.6471176147461, "Parents": [ { "Name": "Clothing" } ], "Name": "Sleeve" } } ], "LabelModelVersion": "2.0", "JobStatus": "SUCCEEDED", "VideoMetadata": { "Format": "QuickTime / MOV", "FrameRate": 23.976024627685547, "Codec": "h264", "DurationMillis": 5005, "FrameHeight": 674, "FrameWidth": 1280 } }
If needed, you can transform the current response to follow the older format. You can use the following sample code to transform the latest API response into the previous API response structure:
from copy import deepcopy

VIDEO_LABEL_KEY = "Labels"
LABEL_KEY = "Label"
ALIASES_KEY = "Aliases"
INSTANCE_KEY = "Instances"
NAME_KEY = "Name"

# Latest API response sample for AggregateBy SEGMENTS
EXAMPLE_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800,
        },
    ]
}

# Expected output after the transformation for AggregateBy SEGMENTS
EXPECTED_EXPANDED_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800,
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
    ]
}

# Latest API response sample for AggregateBy TIMESTAMPS
EXAMPLE_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095,
                        },
                        "Confidence": 97.530106,
                    },
                ],
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
        },
    ]
}

# Expected output after the transformation for AggregateBy TIMESTAMPS
EXPECTED_EXPANDED_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095,
                        },
                        "Confidence": 97.530106,
                    },
                ],
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [{"Name": "Person Description"}],
            },
        },
    ]
}

def expand_aliases(inferenceOutputsWithAliases):
    """Append a copy of each label entry, renamed to each of its aliases,
    so that primary labels and aliases appear in the same list (the
    previous API response structure)."""
    if VIDEO_LABEL_KEY in inferenceOutputsWithAliases:
        expandInferenceOutputs = []
        for segmentLabelDict in inferenceOutputsWithAliases[VIDEO_LABEL_KEY]:
            primaryLabelDict = segmentLabelDict[LABEL_KEY]
            if ALIASES_KEY in primaryLabelDict:
                for alias in primaryLabelDict[ALIASES_KEY]:
                    aliasLabelDict = deepcopy(segmentLabelDict)
                    aliasLabelDict[LABEL_KEY][NAME_KEY] = alias[NAME_KEY]
                    # Alias entries in the previous structure carry neither
                    # an Aliases list nor Instances.
                    del aliasLabelDict[LABEL_KEY][ALIASES_KEY]
                    if INSTANCE_KEY in aliasLabelDict[LABEL_KEY]:
                        del aliasLabelDict[LABEL_KEY][INSTANCE_KEY]
                    expandInferenceOutputs.append(aliasLabelDict)
        inferenceOutputsWithAliases[VIDEO_LABEL_KEY].extend(expandInferenceOutputs)
    return inferenceOutputsWithAliases

if __name__ == "__main__":
    segmentOutputWithExpandAliases = expand_aliases(EXAMPLE_SEGMENT_OUTPUT)
    assert segmentOutputWithExpandAliases == EXPECTED_EXPANDED_SEGMENT_OUTPUT

    timestampOutputWithExpandAliases = expand_aliases(EXAMPLE_TIMESTAMP_OUTPUT)
    assert timestampOutputWithExpandAliases == EXPECTED_EXPANDED_TIMESTAMP_OUTPUT