Amazon SageMaker
Developer Guide

Label verification and adjustment

When the labels on a dataset need to be validated, Ground Truth lets you have workers verify that labels are correct or adjust previous labels.

These types of jobs fall into two distinct categories:

  • Label verification — workers indicate if the existing labels are correct, or rate quality, and can add comments to explain their reasoning.

  • Label adjustment — workers adjust prior annotations to correct them.

Start a label verification job in the console

  1. Start a new labeling job, either by chaining a prior job or by starting from scratch, and specify an input manifest that contains labeled data objects.

  2. Choose the Label verification task type and continue to the next screen.

  3. In the Display existing labels pane, the system will detect and populate the available label attribute names in your manifest. Select the label attribute name for the prior labeling job you want to verify.

  4. Use the instructions areas of the tool designer to provide context about what the previous labelers were asked to do and what the current verifiers need to check.

  5. Use the See preview option to check that the tool displays the prior labels correctly and presents the label verification task clearly.
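Before opening the console, it can help to see which label attribute names a manifest line is likely to offer in step 3. The sketch below is not the console's detection logic; it is a simple heuristic that assumes each label attribute has a sibling key ending in -metadata, as in the sample output shown later on this page. The function name and the sample bucket path are hypothetical.

```python
import json


def candidate_label_attributes(manifest_line):
    """Return keys that have a matching '<key>-metadata' sibling.

    Heuristic only: assumes the manifest follows the
    '<label-attribute>' / '<label-attribute>-metadata' pairing.
    """
    record = json.loads(manifest_line)
    return sorted(key for key in record if f"{key}-metadata" in record)


# Hypothetical manifest line modeled on the sample output on this page.
line = json.dumps({
    "source-ref": "s3://example-bucket/image.jpg",
    "verify-bounding-box": "1",
    "verify-bounding-box-metadata": {"type": "groundtruth/label-verification"},
})

print(candidate_label_attributes(line))
```

If the list is empty, the data object probably carries no labels that a verification job could display.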

Start an adjustment job in the console

  1. Start a new labeling job, either by chaining a prior job or by starting from scratch, and specify an input manifest that contains labeled data objects.

  2. Choose the correct task type for your data and continue to the next screen.

  3. After selecting the workers, there is an optional configuration section to Display existing labels. If it is not expanded, click the arrow next to the title to expand it.

  4. Check the box next to I want to display existing labels from the dataset for this job.

  5. Select the Label attribute name from your manifest that corresponds to the labels you want to display for adjustment. The system will try to detect and populate these values by analyzing the manifest, but you may need to set the correct value.

  6. Use the instructions areas of the tool designer to provide context about what the previous labelers were tasked with doing and what the current verifiers need to check and adjust.

  7. Use the See preview option to check that the tool displays the prior labels correctly and presents the task clearly.

Label verification and adjustment data in the output manifest

Label verification data is written to the output manifest within the metadata for the label. Two properties are added to the metadata:

  • A type property with a value of "groundtruth/label-verification".

  • A worker-feedback property with an array of comment values. This is only added when the worker enters comments. If there are no comments, the field will not appear.

    {
      "source-ref": "S3 bucket location",
      "verify-bounding-box": "1",
      "verify-bounding-box-metadata": {
        "class-name": "bad",
        "confidence": 0.93,
        "type": "groundtruth/label-verification",
        "job-name": "verify-bounding-boxes",
        "human-annotated": "yes",
        "creation-date": "2018-10-18T22:18:13.527256",
        "worker-feedback": [
          {"comment": "The bounding box on the bird is too wide on the right side."},
          {"comment": "The bird on the upper right is not labeled."}
        ]
      }
    }
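As a rough illustration of how this metadata might be consumed, the following sketch reads one output manifest line and collects the verification result and any worker comments. It assumes the field names shown in the sample above; the function name verification_summary and the shape of the returned dictionary are illustrative choices, not part of Ground Truth.

```python
import json


def verification_summary(manifest_line, label_attribute):
    """Summarize the label verification metadata in one manifest line.

    Returns None when the metadata is not of the verification type.
    """
    record = json.loads(manifest_line)
    metadata = record.get(f"{label_attribute}-metadata", {})
    if metadata.get("type") != "groundtruth/label-verification":
        return None
    # worker-feedback is absent when no worker entered a comment.
    comments = [item["comment"] for item in metadata.get("worker-feedback", [])]
    return {
        "verdict": metadata.get("class-name"),
        "confidence": metadata.get("confidence"),
        "comments": comments,
    }


# Manifest line modeled on the sample above, with one comment.
sample_line = json.dumps({
    "source-ref": "S3 bucket location",
    "verify-bounding-box": "1",
    "verify-bounding-box-metadata": {
        "class-name": "bad",
        "confidence": 0.93,
        "type": "groundtruth/label-verification",
        "worker-feedback": [
            {"comment": "The bird on the upper right is not labeled."}
        ],
    },
})

print(verification_summary(sample_line, "verify-bounding-box"))
```

Treating a missing worker-feedback field as an empty list keeps the calling code free of special cases for objects without comments.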

In adjustment tasks, the worker output resembles the worker output of the original task, except it will contain the adjusted values and an adjustment-status property with the value of adjusted or unadjusted to indicate whether an adjustment was made.

See the Output Data page for more examples of the output of different tasks.

Cautions and considerations

Color information requirements for Semantic Segmentation jobs

To properly reproduce color information in verification or adjustment tasks, the tool requires hexadecimal RGB color information in the manifest (e.g. #FFFFFF for white). During the set-up of a Semantic Segmentation verification or adjustment job, the tool will examine the manifest to determine if this information is present. If it cannot find it, an error message is shown and the job set-up cannot be completed.

In prior iterations of the Semantic Segmentation tool, category color information was not written to the output manifest in hexadecimal RGB format. That format was added to the output manifest at the same time the verification and adjustment workflows were introduced. Therefore, output manifests created before that change are not compatible with these workflows.
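To check an older manifest before setting up a job, a small validation pass along these lines may help. It is only a sketch: the color map layout and the hex-color key used here are assumptions made for illustration, so adapt them to the structure actually present in your manifest. Only the #RRGGBB format itself comes from the requirement described above.

```python
import re

# Six hexadecimal digits after '#', for example #FFFFFF for white.
HEX_RGB = re.compile(r"#[0-9A-Fa-f]{6}")


def is_hex_rgb(value):
    """Return True if value is a string in #RRGGBB form."""
    return isinstance(value, str) and HEX_RGB.fullmatch(value) is not None


def classes_missing_hex_color(color_map):
    """Return the class ids whose entry lacks a valid hex color.

    color_map is a hypothetical dict of class id -> entry, where each
    entry is expected to hold a 'hex-color' key (assumed name).
    """
    return [
        class_id
        for class_id, entry in color_map.items()
        if not is_hex_rgb(entry.get("hex-color"))
    ]


# Hypothetical color map: the second entry has no usable color.
example_map = {
    "0": {"class-name": "BACKGROUND", "hex-color": "#FFFFFF"},
    "1": {"class-name": "bird", "hex-color": "blue"},
}

print(classes_missing_hex_color(example_map))
```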

Filtering your data before starting the job

Amazon SageMaker Ground Truth processes every object in your input manifest. If your dataset is only partially labeled, you may want to create a custom manifest by running an Amazon S3 Select query on your input manifest. Unlabeled objects fail individually without causing the job to fail, but they may still incur processing costs. Filtering out objects that you don't want verified also reduces your costs.

There are some filtering tools provided in the console when creating a verification job. If you are creating jobs using the API, make filtering your data part of your workflow where needed.
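For API-driven workflows, one way to filter is to rewrite the manifest locally before uploading it. The sketch below keeps only the manifest lines that already carry a given label attribute. It works on manifest lines held in memory rather than on Amazon S3 directly, and the function name, bucket paths, and label attribute name are hypothetical.

```python
import json


def keep_labeled(manifest_lines, label_attribute):
    """Return only the manifest lines that contain label_attribute."""
    kept = []
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if label_attribute in json.loads(line):
            kept.append(line)
    return kept


# Hypothetical manifest: only the first object has a label.
manifest = [
    '{"source-ref": "s3://example-bucket/a.jpg", "bird-boxes": {"annotations": []}}',
    '{"source-ref": "s3://example-bucket/b.jpg"}',
]

filtered = keep_labeled(manifest, "bird-boxes")
print(len(filtered))
```

The filtered lines can then be written out, one JSON object per line, and used as the input manifest for the verification or adjustment job.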

Security considerations for images

Due to browser security models, some image markup tasks like keypoints, polygons, bounding boxes, and semantic segmentation will require a CORS specification to be added to the Amazon S3 bucket where you store the images. This is necessary to apply prior markup to the images.

Applying CORS to your bucket

  1. Open the Amazon S3 console at https://console.aws.amazon.com/s3/.

  2. Select the bucket in which you are storing your images.

  3. Select the Permissions tab, then CORS configuration.

  4. Add the following block of XML and save.

    <?xml version="1.0" encoding="UTF-8"?>
    <CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
      <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
      </CORSRule>
    </CORSConfiguration>
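The same rule can also be applied programmatically. The following sketch uses the put_bucket_cors operation of the boto3 S3 client with a configuration equivalent to the XML above. It assumes boto3 is installed and AWS credentials are configured; the helper names and the bucket name in the usage comment are placeholders.

```python
def build_cors_configuration():
    """Return a CORS configuration equivalent to the XML block above."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": ["*"],
                "AllowedMethods": ["GET"],
            }
        ]
    }


def apply_cors(bucket_name):
    """Apply the CORS configuration to the named bucket."""
    # Imported here so the configuration can be built without boto3.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_cors(
        Bucket=bucket_name,
        CORSConfiguration=build_cors_configuration(),
    )


# Usage (placeholder bucket name; requires AWS credentials):
# apply_cors("example-image-bucket")
```

Note that put_bucket_cors replaces any CORS configuration already set on the bucket, so merge existing rules into the list first if the bucket has them.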