Detecting labels in a video - Amazon Rekognition

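Detecting labels in a video

Amazon Rekognition Video can detect labels (objects and concepts) in a video, along with the time at which each label is detected. For an SDK code example, see Analyzing a video stored in an Amazon S3 bucket with Java or Python (SDK). For an AWS CLI example, see Analyzing a video with the AWS Command Line Interface.

Amazon Rekognition Video label detection is an asynchronous operation. To start detecting labels in a video, call StartLabelDetection.

Amazon Rekognition Video publishes the completion status of the video analysis to an Amazon Simple Notification Service topic. If the video analysis succeeds, call GetLabelDetection to get the detected labels. For information about calling the video analysis API operations, see Calling Amazon Rekognition Video operations.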
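The asynchronous flow described above can be sketched in Python. This is a minimal illustration, not the documented integration pattern: it polls the JobStatus field returned by GetLabelDetection instead of subscribing to the Amazon SNS topic, which is an assumption made to keep the example short. The helper name wait_for_label_job and its parameters are hypothetical; the client argument is assumed to behave like a boto3 Rekognition client.

```python
import time


def wait_for_label_job(client, job_id, poll_seconds=5.0, max_polls=120,
                       sleep=time.sleep):
    """Poll GetLabelDetection until the job is no longer IN_PROGRESS.

    `client` is assumed to expose get_label_detection(JobId=...) like a
    boto3 Rekognition client. Returns the first page of the response
    once JobStatus is SUCCEEDED.
    """
    for _ in range(max_polls):
        response = client.get_label_detection(JobId=job_id)
        status = response["JobStatus"]
        if status == "SUCCEEDED":
            return response
        if status == "FAILED":
            raise RuntimeError(
                response.get("StatusMessage", "label detection failed"))
        sleep(poll_seconds)  # still IN_PROGRESS; wait before the next poll
    raise TimeoutError(f"job {job_id} still in progress after {max_polls} polls")


# Usage (requires boto3 and AWS credentials; not executed here):
# import boto3
# client = boto3.client("rekognition")
# job = client.start_label_detection(
#     Video={"S3Object": {"Bucket": "bucket", "Name": "video.mp4"}})
# first_page = wait_for_label_job(client, job["JobId"])
```

For production workloads, reacting to the Amazon SNS completion message avoids repeated polling calls.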

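StartLabelDetection request

The following is an example of a StartLabelDetection operation request. You provide the StartLabelDetection operation with a video stored in an Amazon S3 bucket. The example request JSON specifies the Amazon S3 bucket and video name, along with MinConfidence, Features, Settings, and NotificationChannel.

MinConfidence is the minimum confidence that Amazon Rekognition Video must have in the accuracy of a detected label, or of an instance bounding box (if detected), for it to be returned in the response.

With Features, you can specify that you want GENERAL_LABELS returned as part of the response.

With Settings, you can filter the items returned for GENERAL_LABELS. For labels, you can use inclusive and exclusive filters. You can filter by specific individual labels or by label category:

  • LabelInclusionFilters: Specifies the labels to include in the response.

  • LabelExclusionFilters: Specifies the labels to exclude from the response.

  • LabelCategoryInclusionFilters: Specifies the label categories to include in the response.

  • LabelCategoryExclusionFilters: Specifies the label categories to exclude from the response.

You can also combine inclusive and exclusive filters as needed, excluding some labels or categories while including others.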
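As an illustration of combining the four filters, the following sketch builds a Settings structure that contains only the filters that are actually set. The function name build_general_label_settings is hypothetical, and the check that rejects a label listed in both the inclusion and exclusion filters is a client-side sanity check assumed for this example, not documented service behavior.

```python
def build_general_label_settings(include_labels=(), exclude_labels=(),
                                 include_categories=(), exclude_categories=()):
    """Return a Settings dict for StartLabelDetection with non-empty filters."""
    # Assumed sanity check: a label should not be both included and excluded.
    overlap = set(include_labels) & set(exclude_labels)
    if overlap:
        raise ValueError(f"labels both included and excluded: {sorted(overlap)}")

    filters = {
        "LabelInclusionFilters": list(include_labels),
        "LabelExclusionFilters": list(exclude_labels),
        "LabelCategoryInclusionFilters": list(include_categories),
        "LabelCategoryExclusionFilters": list(exclude_categories),
    }
    # Drop empty filters so the request carries only what was asked for.
    return {"GeneralLabels": {k: v for k, v in filters.items() if v}}


settings = build_general_label_settings(
    include_labels=["Cat", "Dog"],
    exclude_categories=["Popular Landmark"],
)
```

The resulting dictionary can be passed as the Settings value of a StartLabelDetection request.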

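NotificationChannel specifies the ARN of the Amazon SNS topic (SNSTopicArn) to which you want Amazon Rekognition Video to publish the completion status of the label detection operation, together with the ARN of the IAM role (RoleArn) used to publish to that topic. If you're using the AmazonRekognitionServiceRole permissions policy, the Amazon SNS topic must have a topic name that begins with Rekognition.

The following is a sample StartLabelDetection request in JSON format, including filters: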

{
    "ClientRequestToken": "5a6e690e-c750-460a-9d59-c992e0ec8638",
    "JobTag": "5a6e690e-c750-460a-9d59-c992e0ec8638",
    "Video": {
        "S3Object": {
            "Bucket": "bucket",
            "Name": "video.mp4"
        }
    },
    "Features": ["GENERAL_LABELS"],
    "MinConfidence": 75,
    "Settings": {
        "GeneralLabels": {
            "LabelInclusionFilters": ["Cat", "Dog"],
            "LabelExclusionFilters": ["Tiger"],
            "LabelCategoryInclusionFilters": ["Animals and Pets"],
            "LabelCategoryExclusionFilters": ["Popular Landmark"]
        }
    },
    "NotificationChannel": {
        "RoleArn": "arn:aws:iam::012345678910:role/SNSAccessRole",
        "SNSTopicArn": "arn:aws:sns:us-east-1:012345678910:notification-topic"
    }
}

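GetLabelDetection operation response

GetLabelDetection returns an array (Labels) that contains information about the labels detected in the video. The array can be sorted by time or by detected label, depending on the SortBy parameter. You can also use the AggregateBy parameter to select how response items are aggregated.

The following example is the JSON response of GetLabelDetection. In the response, note the following:

  • Sort order: By default, the returned array of labels is sorted by time (SortBy TIMESTAMP), with NAME as the secondary sort order. To sort by label instead, specify NAME in the SortBy input parameter of GetLabelDetection. If a label appears multiple times in the video, the array contains multiple LabelDetection elements for it.

  • Label information: Each LabelDetection array element contains a Label object, which includes the label name and the confidence that Amazon Rekognition has in the accuracy of the detected label. A Label object also includes a hierarchical taxonomy of labels and bounding box information for common labels. Timestamp is the time at which the label was detected, in milliseconds from the start of the video.

    Information about any categories or aliases associated with a label is also returned. For results aggregated by video SEGMENTS, the StartTimestampMillis, EndTimestampMillis, and DurationMillis fields are returned, which define the start time, end time, and duration of a segment, respectively.

  • Aggregation: Specifies how results are aggregated when returned. The default is to aggregate by TIMESTAMPS. You can also choose to aggregate by SEGMENTS, which aggregates results over a time window. When aggregating by SEGMENTS, information about detected instances with bounding boxes is not returned. Only labels detected during the segments are returned.

  • Paging information: The example shows one page of label detection information. You can specify how many LabelDetection objects to return in the MaxResults input parameter of GetLabelDetection. If more results than MaxResults exist, GetLabelDetection returns a token (NextToken) that is used to get the next page of results. For more information, see Getting Amazon Rekognition Video analysis results.

  • Video information: The response includes information about the video format (VideoMetadata) in each page of information returned by GetLabelDetection.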
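The paging behavior described above can be sketched as a generator that follows NextToken until no further page is returned. The function name iter_label_detections and the default page size are assumptions for illustration; the client argument is assumed to behave like a boto3 Rekognition client whose get_label_detection call accepts JobId, MaxResults, NextToken, SortBy, and AggregateBy.

```python
def iter_label_detections(client, job_id, page_size=100,
                          sort_by="TIMESTAMP", aggregate_by="TIMESTAMPS"):
    """Yield every LabelDetection item of a finished job, page by page."""
    token = None
    while True:
        kwargs = {
            "JobId": job_id,
            "MaxResults": page_size,
            "SortBy": sort_by,            # TIMESTAMP or NAME
            "AggregateBy": aggregate_by,  # TIMESTAMPS or SEGMENTS
        }
        if token:
            kwargs["NextToken"] = token   # continue from the previous page
        page = client.get_label_detection(**kwargs)
        yield from page.get("Labels", [])
        token = page.get("NextToken")
        if not token:                     # no token means this was the last page
            break


# Usage (requires boto3 and AWS credentials; not executed here):
# import boto3
# client = boto3.client("rekognition")
# for detection in iter_label_detections(client, job_id, sort_by="NAME"):
#     print(detection["Label"]["Name"])
```

Because the function is a generator, callers can stop early without fetching the remaining pages.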

The following is a sample GetLabelDetection response in JSON format, aggregated by TIMESTAMPS:

{
    "JobStatus": "SUCCEEDED",
    "LabelModelVersion": "3.0",
    "Labels": [
        {
            "Timestamp": 1000,
            "Label": {
                "Name": "Car",
                "Categories": [{ "Name": "Vehicles and Automotive" }],
                "Aliases": [{ "Name": "Automobile" }],
                "Parents": [{ "Name": "Vehicle" }],
                "Confidence": 99.9364013671875, // Classification confidence
                "Instances": [
                    {
                        "BoundingBox": {
                            "Width": 0.26779675483703613,
                            "Height": 0.8562285900115967,
                            "Left": 0.3604024350643158,
                            "Top": 0.09245597571134567
                        },
                        "Confidence": 99.9364013671875 // Detection confidence
                    }
                ]
            }
        },
        {
            "Timestamp": 1000,
            "Label": {
                "Name": "Cup",
                "Categories": [{ "Name": "Kitchen and Dining" }],
                "Aliases": [{ "Name": "Mug" }],
                "Parents": [],
                "Confidence": 99.9364013671875, // Classification confidence
                "Instances": [
                    {
                        "BoundingBox": {
                            "Width": 0.26779675483703613,
                            "Height": 0.8562285900115967,
                            "Left": 0.3604024350643158,
                            "Top": 0.09245597571134567
                        },
                        "Confidence": 99.9364013671875 // Detection confidence
                    }
                ]
            }
        },
        {
            "Timestamp": 2000,
            "Label": {
                "Name": "Kangaroo",
                "Categories": [{ "Name": "Animals and Pets" }],
                "Aliases": [{ "Name": "Wallaby" }],
                "Parents": [{ "Name": "Mammal" }],
                "Confidence": 99.9364013671875,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Width": 0.26779675483703613,
                            "Height": 0.8562285900115967,
                            "Left": 0.3604024350643158,
                            "Top": 0.09245597571134567
                        },
                        "Confidence": 99.9364013671875
                    }
                ]
            }
        },
        {
            "Timestamp": 4000,
            "Label": {
                "Name": "Bicycle",
                "Categories": [{ "Name": "Hobbies and Interests" }],
                "Aliases": [{ "Name": "Bike" }],
                "Parents": [{ "Name": "Vehicle" }],
                "Confidence": 99.9364013671875,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Width": 0.26779675483703613,
                            "Height": 0.8562285900115967,
                            "Left": 0.3604024350643158,
                            "Top": 0.09245597571134567
                        },
                        "Confidence": 99.9364013671875
                    }
                ]
            }
        }
    ],
    "VideoMetadata": {
        "ColorRange": "FULL",
        "DurationMillis": 5000,
        "Format": "MP4",
        "FrameWidth": 1280,
        "FrameHeight": 720,
        "FrameRate": 24
    }
}
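To draw the returned instances on a frame, the bounding boxes usually need to be converted to pixels. The sketch below assumes that BoundingBox values are ratios of the frame dimensions, as in the Rekognition BoundingBox data type, and uses FrameWidth and FrameHeight from VideoMetadata. The helper names box_to_pixels and instances_in_pixels are hypothetical, and the sample response is made up for the example.

```python
def box_to_pixels(box, frame_width, frame_height):
    """Convert a ratio-based BoundingBox to integer pixel coordinates."""
    return {
        "left": round(box["Left"] * frame_width),
        "top": round(box["Top"] * frame_height),
        "width": round(box["Width"] * frame_width),
        "height": round(box["Height"] * frame_height),
    }


def instances_in_pixels(response):
    """Return (timestamp_ms, label_name, pixel_box) for every instance."""
    meta = response["VideoMetadata"]
    frame_width, frame_height = meta["FrameWidth"], meta["FrameHeight"]
    rows = []
    for detection in response["Labels"]:
        label = detection["Label"]
        # Labels without bounding boxes have no Instances or an empty list.
        for instance in label.get("Instances", []):
            pixel_box = box_to_pixels(instance["BoundingBox"],
                                      frame_width, frame_height)
            rows.append((detection["Timestamp"], label["Name"], pixel_box))
    return rows


# Made-up response fragment in the TIMESTAMPS-aggregated shape.
sample = {
    "Labels": [{
        "Timestamp": 1000,
        "Label": {
            "Name": "Car",
            "Instances": [{
                "BoundingBox": {"Left": 0.25, "Top": 0.5,
                                "Width": 0.5, "Height": 0.25},
                "Confidence": 99.9,
            }],
        },
    }],
    "VideoMetadata": {"FrameWidth": 1280, "FrameHeight": 720},
}
rows = instances_in_pixels(sample)
```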

The following is a sample GetLabelDetection response in JSON format, aggregated by SEGMENTS:

{
    "JobStatus": "SUCCEEDED",
    "LabelModelVersion": "3.0",
    "Labels": [
        {
            "StartTimestampMillis": 225,
            "EndTimestampMillis": 3578,
            "DurationMillis": 3353,
            "Label": {
                "Name": "Car",
                "Categories": [{ "Name": "Vehicles and Automotive" }],
                "Aliases": [{ "Name": "Automobile" }],
                "Parents": [{ "Name": "Vehicle" }],
                "Confidence": 99.9364013671875 // Maximum confidence score for Segment mode
            }
        },
        {
            "StartTimestampMillis": 7578,
            "EndTimestampMillis": 12371,
            "DurationMillis": 4793,
            "Label": {
                "Name": "Kangaroo",
                "Categories": [{ "Name": "Animals and Pets" }],
                "Aliases": [{ "Name": "Wallaby" }],
                "Parents": [{ "Name": "Mammal" }],
                "Confidence": 99.9364013671875
            }
        },
        {
            "StartTimestampMillis": 22225,
            "EndTimestampMillis": 22578,
            "DurationMillis": 2353,
            "Label": {
                "Name": "Bicycle",
                "Categories": [{ "Name": "Hobbies and Interests" }],
                "Aliases": [{ "Name": "Bike" }],
                "Parents": [{ "Name": "Vehicle" }],
                "Confidence": 99.9364013671875
            }
        }
    ],
    "VideoMetadata": {
        "ColorRange": "FULL",
        "DurationMillis": 5000,
        "Format": "MP4",
        "FrameWidth": 1280,
        "FrameHeight": 720,
        "FrameRate": 24
    }
}
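A segment-aggregated response lends itself to simple summaries, such as how long each label was on screen. The following sketch adds up the reported DurationMillis per label name. The function name total_duration_by_label is hypothetical and the segment data is made up for the example.

```python
from collections import defaultdict


def total_duration_by_label(response):
    """Sum the reported DurationMillis of all segments for each label name."""
    totals = defaultdict(int)
    for segment in response["Labels"]:
        totals[segment["Label"]["Name"]] += segment["DurationMillis"]
    return dict(totals)


# Made-up segments in the SEGMENTS-aggregated shape.
segments = {
    "Labels": [
        {"StartTimestampMillis": 225, "EndTimestampMillis": 3578,
         "DurationMillis": 3353, "Label": {"Name": "Car"}},
        {"StartTimestampMillis": 5000, "EndTimestampMillis": 6000,
         "DurationMillis": 1000, "Label": {"Name": "Car"}},
        {"StartTimestampMillis": 7578, "EndTimestampMillis": 12371,
         "DurationMillis": 4793, "Label": {"Name": "Kangaroo"}},
    ]
}
totals = total_duration_by_label(segments)
```

The sketch trusts the DurationMillis value in each segment rather than recomputing it from the start and end timestamps.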

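Transforming the GetLabelDetection response

When retrieving results with the GetLabelDetection API operation, you might need the response structure to mimic the older API response structure, in which primary labels and aliases were contained in the same list.

The example JSON responses in the preceding section show the current format of the GetLabelDetection API response.

The following example shows the previous response format of the GetLabelDetection API: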

{
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Instances": [],
                "Confidence": 60.51791763305664,
                "Parents": [],
                "Name": "Leaf"
            }
        },
        {
            "Timestamp": 0,
            "Label": {
                "Instances": [],
                "Confidence": 99.53411102294922,
                "Parents": [],
                "Name": "Human"
            }
        },
        {
            "Timestamp": 0,
            "Label": {
                "Instances": [
                    {
                        "BoundingBox": {
                            "Width": 0.11109819263219833,
                            "Top": 0.08098889887332916,
                            "Left": 0.8881205320358276,
                            "Height": 0.9073750972747803
                        },
                        "Confidence": 99.5831298828125
                    },
                    {
                        "BoundingBox": {
                            "Width": 0.1268676072359085,
                            "Top": 0.14018426835536957,
                            "Left": 0.0003282368124928324,
                            "Height": 0.7993982434272766
                        },
                        "Confidence": 99.46029663085938
                    }
                ],
                "Confidence": 99.63411102294922,
                "Parents": [],
                "Name": "Person"
            }
        },
        .
        .
        .
        {
            "Timestamp": 166,
            "Label": {
                "Instances": [],
                "Confidence": 73.6471176147461,
                "Parents": [{ "Name": "Clothing" }],
                "Name": "Sleeve"
            }
        }
    ],
    "LabelModelVersion": "2.0",
    "JobStatus": "SUCCEEDED",
    "VideoMetadata": {
        "Format": "QuickTime / MOV",
        "FrameRate": 23.976024627685547,
        "Codec": "h264",
        "DurationMillis": 5005,
        "FrameHeight": 674,
        "FrameWidth": 1280
    }
}

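If needed, you can transform the current response to follow the format of the older response. You can use the following sample code to convert the latest API response to the previous API response structure: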

from copy import deepcopy

VIDEO_LABEL_KEY = "Labels"
LABEL_KEY = "Label"
ALIASES_KEY = "Aliases"
INSTANCE_KEY = "Instances"
NAME_KEY = "Name"

# Latest API response sample for AggregateBy SEGMENTS
EXAMPLE_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800,
        },
    ]
}

# Output example after the transformation for AggregateBy SEGMENTS
EXPECTED_EXPANDED_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800,
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
    ]
}

# Latest API response sample for AggregateBy TIMESTAMPS
EXAMPLE_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095,
                        },
                        "Confidence": 97.530106,
                    },
                ],
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
        },
    ]
}

# Output example after the transformation for AggregateBy TIMESTAMPS
EXPECTED_EXPANDED_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095,
                        },
                        "Confidence": 97.530106,
                    },
                ],
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [{"Name": "Person Description"}],
            },
        },
    ]
}


def expand_aliases(inferenceOutputsWithAliases):
    """Append one copy of each label per alias, mimicking the older response.

    The input dictionary is modified in place and also returned.
    """
    if VIDEO_LABEL_KEY in inferenceOutputsWithAliases:
        expandInferenceOutputs = []
        for segmentLabelDict in inferenceOutputsWithAliases[VIDEO_LABEL_KEY]:
            primaryLabelDict = segmentLabelDict[LABEL_KEY]
            if ALIASES_KEY in primaryLabelDict:
                for alias in primaryLabelDict[ALIASES_KEY]:
                    # Copy the primary entry and rename it to the alias.
                    aliasLabelDict = deepcopy(segmentLabelDict)
                    aliasLabelDict[LABEL_KEY][NAME_KEY] = alias[NAME_KEY]
                    # Alias entries carry neither aliases nor instances.
                    del aliasLabelDict[LABEL_KEY][ALIASES_KEY]
                    if INSTANCE_KEY in aliasLabelDict[LABEL_KEY]:
                        del aliasLabelDict[LABEL_KEY][INSTANCE_KEY]
                    expandInferenceOutputs.append(aliasLabelDict)
        # Alias entries are appended after all primary entries.
        inferenceOutputsWithAliases[VIDEO_LABEL_KEY].extend(expandInferenceOutputs)
    return inferenceOutputsWithAliases


if __name__ == "__main__":
    segmentOutputWithExpandAliases = expand_aliases(EXAMPLE_SEGMENT_OUTPUT)
    assert segmentOutputWithExpandAliases == EXPECTED_EXPANDED_SEGMENT_OUTPUT

    timestampOutputWithExpandAliases = expand_aliases(EXAMPLE_TIMESTAMP_OUTPUT)
    assert timestampOutputWithExpandAliases == EXPECTED_EXPANDED_TIMESTAMP_OUTPUT