Verify and Adjust Labels
When the labels on a dataset need to be validated, Amazon SageMaker Ground Truth provides functionality to have workers verify that labels are correct or to adjust previous labels.
These types of jobs fall into two distinct categories:
-
Label verification — Workers indicate if the existing labels are correct, or rate their quality, and can add comments to explain their reasoning. Workers will not be able to modify or adjust labels.
-
Label adjustment — Workers adjust prior annotations to correct them.
The following Ground Truth built-in task types support adjustment and verification labeling jobs:
-
Bounding box
-
Semantic segmentation
-
3D point cloud object detection, 3D point cloud object tracking, and 3D point cloud semantic segmentation
-
All video frame object detection and video frame object tracking task types — bounding box, polyline, polygon and keypoint
You can start label verification and adjustment jobs using the SageMaker console or the API.
Create and Start a Label Verification Job (Console)
Bounding box and semantic segmentation label verification jobs can be created in the console. You must use the Ground Truth API to create a 3D point cloud or video frame verification job. To learn how, see the section 3D Point Cloud and Video Frame under Start a Label Verification or Adjustment Job (API).
To start a label verification job (console)
-
Open the SageMaker console at https://console.aws.amazon.com/sagemaker/ and choose Labeling jobs.
-
Start a new labeling job by chaining a prior job or by starting from scratch, specifying an input manifest that contains labeled data objects.
-
In the Task type pane, select Label verification.
-
Choose Next.
-
In the Existing-labels display options pane, the system shows the available label attribute names in your manifest. Choose the label attribute name for the labeling job that you want to verify.
-
Use the instructions areas of the tool designer to provide context about what the previous labelers were asked to do and what the current verifiers need to check.
You can add new labels that workers choose from to verify labels. For example, you can ask workers to verify the image quality, and provide the labels Clear and Blurry. Workers will also have the option to add a comment to explain their selection.
-
Choose See preview to check that the tool is displaying the prior labels correctly and presents the label verification task clearly.
-
Select Create. This will create and start your labeling job.
Start a Label Adjustment Job (Console)
Use the SageMaker console to start a label verification or adjustment job.
To start a label adjustment job (console)
-
Open the SageMaker console at https://console.aws.amazon.com/sagemaker/ and choose Labeling jobs.
-
Start a new labeling job by chaining a prior job or by starting from scratch, specifying an input manifest that contains labeled data objects.
-
Choose the same task type as the original labeling job.
-
After choosing the workers, expand Existing-labels display options. If it isn't expanded, choose the arrow next to the title to expand it.
-
Check the box next to I want to display existing labels from the dataset for this job.
-
For Label attribute name, choose the name from your manifest that corresponds to the labels that you want to display for adjustment. Ground Truth tries to detect and populate these values by analyzing the manifest, but you might need to set the correct value.
-
Use the instructions areas of the tool designer to provide context about what the previous labelers were asked to do and what the current workers need to check and adjust.
-
Choose See preview to check that the tool shows the prior labels correctly and presents the task clearly.
-
Select Create. This will create and start your labeling job.
Start a Label Verification or Adjustment Job (API)
Start a label verification or adjustment job by chaining a successfully completed job or by starting a new job from scratch using the CreateLabelingJob operation. The procedure is almost the same as setting up a new labeling job with CreateLabelingJob, with a few modifications. Use the following sections to learn what modifications are required to chain a labeling job to create an adjustment or verification labeling job.
When you create an adjustment or verification labeling job using the Ground Truth API, you must use a different LabelAttributeName than the original labeling job. The original labeling job is the job used to create the labels you want adjusted or verified.
The label category configuration file you identify for an adjustment or verification job in LabelCategoryConfigS3Uri of CreateLabelingJob must contain the same labels used in the original labeling job. You can add new labels. For 3D point cloud and video frame jobs, you can add new label category and frame attributes to the label category configuration file.
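The two rules above can be checked before a request is submitted. The following is a minimal sketch, not part of the Ground Truth API: the function name and the example label names are hypothetical, and the check assumes you already have the label lists from both label category configuration files in memory.

```python
def check_audit_job_settings(original_attr, new_attr, original_labels, new_labels):
    """Return a list of problems; an empty list means both rules are satisfied."""
    problems = []
    # Rule 1: the new job must write to a different label attribute name.
    if new_attr == original_attr:
        problems.append("LabelAttributeName must differ from the original job's")
    # Rule 2: every original label must still be present (new labels are allowed).
    missing = sorted(set(original_labels) - set(new_labels))
    if missing:
        problems.append("missing original labels: " + ", ".join(missing))
    return problems


# Hypothetical example values.
ok = check_audit_job_settings("birds", "birds-verified", ["bird"], ["bird", "plane"])
bad = check_audit_job_settings("birds", "birds", ["bird", "plane"], ["bird"])
print(ok)
print(bad)
```

Returning a list of problems instead of raising on the first one lets you report every issue in a single pass.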
Bounding Box and Semantic Segmentation
To create a bounding box or semantic segmentation label verification or adjustment job, use the following guidelines to specify API attributes for the CreateLabelingJob operation.
-
Use the LabelAttributeName parameter to specify the output label name that you want to use for verified or adjusted labels. You must use a different LabelAttributeName than the one used for the original labeling job.
-
If you are chaining the job, the labels from the previous labeling job to be adjusted or verified will be specified in the custom UI template. To learn how to create a custom template, see Create Custom Worker Task Template.
Identify the location of the UI template in the UiTemplateS3Uri parameter. SageMaker provides widgets that you can use in your custom template to display old labels. Use the initial-value attribute in one of the following crowd elements to extract the labels that need verification or adjustment and include them in your task template:
  -
  crowd-semantic-segmentation: Use this crowd element in your custom UI task template to specify semantic segmentation labels that need to be verified or adjusted.
  -
  crowd-bounding-box: Use this crowd element in your custom UI task template to specify bounding box labels that need to be verified or adjusted.
-
The LabelCategoryConfigS3Uri parameter must contain the same label categories as the previous labeling job.
-
Use the bounding box or semantic segmentation adjustment or verification lambda ARNs for PreHumanTaskLambdaArn and AnnotationConsolidationLambdaArn:
  -
  For bounding box, the adjustment labeling job lambda function ARNs end with AdjustmentBoundingBox and the verification lambda function ARNs end with VerificationBoundingBox.
  -
  For semantic segmentation, the adjustment labeling job lambda function ARNs end with AdjustmentSemanticSegmentation and the verification lambda function ARNs end with VerificationSemanticSegmentation.
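The guidelines above can be assembled into a CreateLabelingJob request as in the following sketch. It only builds and validates the request dictionary; it does not call AWS. Every ARN, S3 URI, job name, and task setting shown is a placeholder or assumption, including the `PRE-` and `ACS-` function name prefixes, and the helper name `build_request` is hypothetical. Look up the real lambda ARNs for your Region before using them.

```python
# Lambda ARN suffixes named in the guidelines, keyed by (task type, job kind).
LAMBDA_SUFFIXES = {
    ("bounding-box", "adjustment"): "AdjustmentBoundingBox",
    ("bounding-box", "verification"): "VerificationBoundingBox",
    ("semantic-segmentation", "adjustment"): "AdjustmentSemanticSegmentation",
    ("semantic-segmentation", "verification"): "VerificationSemanticSegmentation",
}


def build_request(task, kind, original_attr, new_attr, pre_arn, acs_arn, settings):
    """Build a CreateLabelingJob request dict after checking the guidelines."""
    suffix = LAMBDA_SUFFIXES[(task, kind)]
    if new_attr == original_attr:
        raise ValueError("LabelAttributeName must differ from the original job's")
    for arn in (pre_arn, acs_arn):
        if not arn.endswith(suffix):
            raise ValueError(f"{arn} does not end with {suffix}")
    return {
        "LabelingJobName": settings["job_name"],
        "LabelAttributeName": new_attr,
        "InputConfig": {
            "DataSource": {"S3DataSource": {"ManifestS3Uri": settings["manifest_uri"]}}
        },
        "OutputConfig": {"S3OutputPath": settings["output_uri"]},
        "RoleArn": settings["role_arn"],
        # Must contain the same label categories as the previous job.
        "LabelCategoryConfigS3Uri": settings["label_config_uri"],
        "HumanTaskConfig": {
            "WorkteamArn": settings["workteam_arn"],
            # Custom template that displays prior labels through initial-value.
            "UiConfig": {"UiTemplateS3Uri": settings["template_uri"]},
            "PreHumanTaskLambdaArn": pre_arn,
            "TaskTitle": "Adjust bounding boxes",
            "TaskDescription": "Correct the existing boxes where needed",
            "NumberOfHumanWorkersPerDataObject": 1,
            "TaskTimeLimitInSeconds": 300,
            "AnnotationConsolidationConfig": {
                "AnnotationConsolidationLambdaArn": acs_arn
            },
        },
    }


# All values below are placeholders, not real resources.
settings = {
    "job_name": "adjust-birds",
    "manifest_uri": "s3://amzn-s3-demo-bucket/birds/output.manifest",
    "output_uri": "s3://amzn-s3-demo-bucket/birds-adjusted/",
    "role_arn": "arn:aws:iam::111122223333:role/ExampleGroundTruthRole",
    "label_config_uri": "s3://amzn-s3-demo-bucket/birds/labels.json",
    "workteam_arn": "arn:aws:sagemaker:us-east-1:111122223333:workteam/private-crowd/example",
    "template_uri": "s3://amzn-s3-demo-bucket/birds/template.liquid.html",
}
request = build_request(
    "bounding-box", "adjustment", "birds", "birds-adjusted",
    "arn:aws:lambda:us-east-1:111122223333:function:PRE-AdjustmentBoundingBox",
    "arn:aws:lambda:us-east-1:111122223333:function:ACS-AdjustmentBoundingBox",
    settings,
)
print(request["LabelAttributeName"])
```

With valid values, the dictionary can be passed to the SageMaker client in boto3 as `create_labeling_job(**request)`.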
3D Point Cloud and Video Frame
-
Use the LabelAttributeName parameter to specify the output label name that you want to use for verified or adjusted labels. You must use a different LabelAttributeName than the one used for the original labeling job.
-
You must use the human task UI Amazon Resource Name (ARN) (HumanTaskUiArn) used for the original labeling job. To see supported ARNs, see HumanTaskUiArn.
-
In the label category configuration file, you must specify the label attribute name (LabelAttributeName) of the previous labeling job that you use to create the adjustment or verification labeling job in the auditLabelAttributeName parameter.
-
You specify whether your labeling job is a verification or adjustment labeling job using the editsAllowed parameter in your label category configuration file identified by the LabelCategoryConfigS3Uri parameter.
  -
  For verification labeling jobs, you must use the editsAllowed parameter to specify that no labels can be modified. editsAllowed must be set to "none" in each entry in labels. Optionally, you can specify whether or not label category attributes and frame attributes can be adjusted by workers.
  -
  Optionally, for adjustment labeling jobs, you can use the editsAllowed parameter to specify labels, label category attributes, and frame attributes that can or cannot be modified by workers. If you do not use this parameter, all labels, label category attributes, and frame attributes will be adjustable.
To learn more about the editsAllowed parameter and configuring your label category configuration file, see Label Category Configuration File Schema.
-
Use the 3D point cloud or video frame adjustment lambda ARNs for PreHumanTaskLambdaArn and AnnotationConsolidationLambdaArn for both adjustment and verification labeling jobs:
  -
  For 3D point clouds, the adjustment and verification labeling job lambda function ARNs end with Adjustment3DPointCloudSemanticSegmentation, Adjustment3DPointCloudObjectTracking, and Adjustment3DPointCloudObjectDetection for 3D point cloud semantic segmentation, object tracking, and object detection respectively.
  -
  For video frames, the adjustment and verification labeling job lambda function ARNs end with AdjustmentVideoObjectDetection and AdjustmentVideoObjectTracking for video frame object detection and object tracking respectively.
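As an illustration of the label category configuration rules above, the following sketch generates a configuration with auditLabelAttributeName set and, for verification jobs, editsAllowed set to "none" on every label. The helper name and label names are hypothetical, and the "document-version" value is an assumption; confirm the exact fields against Label Category Configuration File Schema before uploading the file.

```python
import json


def make_audit_label_config(labels, original_attr, kind):
    """Build a label category configuration for a verification or adjustment job."""
    if kind not in ("verification", "adjustment"):
        raise ValueError("kind must be 'verification' or 'adjustment'")
    entries = []
    for name in labels:
        entry = {"label": name}
        if kind == "verification":
            # Verification jobs: no label can be modified by workers.
            entry["editsAllowed"] = "none"
        # Adjustment jobs: omitting editsAllowed leaves every label adjustable.
        entries.append(entry)
    return {
        "document-version": "2020-03-01",  # assumed value, check the schema page
        "labels": entries,
        # Label attribute name of the job whose labels are being audited.
        "auditLabelAttributeName": original_attr,
    }


# Hypothetical labels and attribute name.
config = make_audit_label_config(["Car", "Pedestrian"], "vehicles-tracking", "verification")
print(json.dumps(config, indent=2))
```

The resulting JSON is what you would upload to the location given in LabelCategoryConfigS3Uri.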
Ground Truth stores the output data from a label verification or adjustment job in the S3 bucket that you specified in the S3OutputPath parameter of the CreateLabelingJob operation. For more information about the output data from a label verification or adjustment labeling job, see Label Verification and Adjustment Data in the Output Manifest.
Label Verification and Adjustment Data in the Output Manifest
Amazon SageMaker Ground Truth writes label verification data to the output manifest within the metadata for the label. It adds two properties to the metadata:
-
A type property, with a value of "groundtruth/label-verification".
-
A worker-feedback property, with an array of comment values. This property is added when the worker enters comments. If there are no comments, the field doesn't appear.
The following example output manifest shows how label verification data appears:
{ "source-ref":"S3 bucket location", "verify-bounding-box":"1", "verify-bounding-box-metadata": { "class-name": "bad", "confidence": 0.93, "type": "groundtruth/label-verification", "job-name": "verify-bounding-boxes", "human-annotated": "yes", "creation-date": "2018-10-18T22:18:13.527256", "worker-feedback": [ {"comment": "The bounding box on the bird is too wide on the right side."}, {"comment": "The bird on the upper right is not labeled."} ] } }
The worker output of adjustment tasks resembles the worker output of the original task, except that it contains the adjusted values and an adjustment-status property with the value of adjusted or unadjusted to indicate whether an adjustment was made.
For more examples of the output of different tasks, see Output Data.
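A short sketch of reading verification results from an output manifest follows. It assumes the manifest is in JSON Lines form (one JSON object per line) and that the metadata key is the label attribute name followed by -metadata, as in the example above; the function name is hypothetical.

```python
import json


def verification_results(manifest_lines, label_attr):
    """Collect the verdict and worker comments for each verified object."""
    results = []
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        meta = record.get(f"{label_attr}-metadata", {})
        if meta.get("type") != "groundtruth/label-verification":
            continue  # not a label verification entry
        # worker-feedback is absent when no worker left a comment.
        comments = [item["comment"] for item in meta.get("worker-feedback", [])]
        results.append({
            "source": record.get("source-ref"),
            "class_name": meta.get("class-name"),
            "comments": comments,
        })
    return results


# One line modeled on the example manifest above.
sample = [
    '{"source-ref": "S3 bucket location", "verify-bounding-box": "1", '
    '"verify-bounding-box-metadata": {"class-name": "bad", "confidence": 0.93, '
    '"type": "groundtruth/label-verification", '
    '"worker-feedback": [{"comment": "The bird on the upper right is not labeled."}]}}'
]
parsed = verification_results(sample, "verify-bounding-box")
print(parsed)
```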
Cautions and Considerations
To get expected behavior when creating a label verification or adjustment job, carefully verify your input data.
-
If you are using image data, verify that your manifest file contains hexadecimal RGB color information.
-
To save money on processing costs, filter your data to ensure you are not including unwanted objects in your labeling job input manifest.
-
Add required Amazon S3 permissions to ensure your input data is processed correctly.
When you create an adjustment or verification labeling job using the Ground Truth API, you must use a different LabelAttributeName than the original labeling job.
Color Information Requirements for Semantic Segmentation Jobs
To properly reproduce color information in verification or adjustment tasks, the tool requires hexadecimal RGB color information in the manifest (for example, #FFFFFF for white). When you set up a Semantic Segmentation verification or adjustment job, the tool examines the manifest to determine if this information is present. If it can't find it, Amazon SageMaker Ground Truth displays an error message and ends the job setup.
In prior iterations of the Semantic Segmentation tool, category color information wasn't output in hexadecimal RGB format to the output manifest. That feature was introduced to the output manifest at the same time the verification and adjustment workflows were introduced. Therefore, older output manifests aren't compatible with this new workflow.
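You can check a manifest for hexadecimal color information before setting up the job. The sketch below assumes that the colors are stored under an internal-color-map object in the label metadata, with one hex-color value per class; those key names are assumptions about the semantic segmentation output format and are not stated on this page, so verify them against your own manifest. The function and attribute names are hypothetical.

```python
import re

# Matches values such as #FFFFFF or #00ff7f.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def has_hex_colors(record, label_attr):
    """Return True if every class in the record's color map has a hex RGB color."""
    meta = record.get(f"{label_attr}-metadata", {})
    color_map = meta.get("internal-color-map", {})  # assumed key name
    if not color_map:
        return False
    return all(
        HEX_COLOR.fullmatch(str(entry.get("hex-color", "")))  # assumed key name
        for entry in color_map.values()
    )


# Hypothetical records: one with hex colors, one from an older manifest without them.
new_style = {
    "streets-ref-metadata": {
        "internal-color-map": {
            "0": {"class-name": "BACKGROUND", "hex-color": "#ffffff"},
            "1": {"class-name": "road", "hex-color": "#2ca02c"},
        }
    }
}
old_style = {"streets-ref-metadata": {"internal-color-map": {"0": {"class-name": "BACKGROUND"}}}}
print(has_hex_colors(new_style, "streets-ref"), has_hex_colors(old_style, "streets-ref"))
```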
Filter Your Data Before Starting the Job
Amazon SageMaker Ground Truth processes all objects in your input manifest. If you have a partially labeled dataset, you might want to create a custom manifest using an Amazon S3 Select query on your input manifest. Unlabeled objects fail individually without causing the job to fail, but they might still incur processing costs. Filtering out objects you don't want verified reduces your costs.
If you create a verification job using the console, you can use the filtering tools provided there. If you create jobs using the API, make filtering your data part of your workflow where needed.
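If you build jobs through the API, one way to filter is to process the manifest locally before uploading it. The following sketch is a local alternative to an Amazon S3 Select query, not an example of S3 Select itself. It keeps only the manifest lines that already contain the label attribute you want verified; the function name, attribute name, and S3 paths are hypothetical.

```python
import json


def keep_labeled(manifest_lines, label_attr):
    """Return only the manifest lines whose object carries the given label attribute."""
    kept = []
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue
        if label_attr in json.loads(line):
            kept.append(line)
    return kept


# Hypothetical partially labeled manifest: the second object has no label yet.
lines = [
    '{"source-ref": "s3://amzn-s3-demo-bucket/img1.jpg", "birds": {"annotations": []}}',
    '{"source-ref": "s3://amzn-s3-demo-bucket/img2.jpg"}',
]
filtered = keep_labeled(lines, "birds")
print(len(filtered))
```

Write the kept lines to a new manifest file, one object per line, and reference that file as the job input.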