Creating a manifest file - Rekognition

Creating a manifest file

You can create a test or training dataset by importing a SageMaker Ground Truth format manifest file. If your images are labeled in a different format, use the following information to convert the labels into a SageMaker Ground Truth format manifest file.

Manifest files are in JSON lines format, where each line is a complete JSON object representing the labeling information for one image. Amazon Rekognition Custom Labels supports SageMaker Ground Truth manifests whose JSON lines describe either image-level labels or object localization (bounding boxes).

Image-level and localization (bounding-box) JSON lines can be chained together in the same manifest file.

Note

The JSON line examples in this section are formatted for readability.
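
As a minimal sketch of the JSON lines layout described above, the following Python snippet serializes each record as one JSON object on its own line. Only the source-ref field name comes from this page. The bucket name, image paths, and the example-label field are hypothetical placeholders and are not the label schema that Amazon Rekognition Custom Labels requires; see Image-Level labels in manifest files and Object localization in manifest files for the actual label fields.

```python
import json


def to_json_lines(records):
    """Serialize each record as one compact JSON object per line."""
    # With the default settings json.dumps emits no raw newlines,
    # so every record stays on a single line of the manifest.
    return "\n".join(json.dumps(record) for record in records) + "\n"


# Hypothetical records: "example-label" is a placeholder field,
# not the label schema required by Amazon Rekognition Custom Labels.
records = [
    {"source-ref": "s3://amzn-s3-demo-bucket/images/img1.jpg", "example-label": 1},
    {"source-ref": "s3://amzn-s3-demo-bucket/images/img2.jpg", "example-label": 0},
]

manifest_text = to_json_lines(records)
```

Writing manifest_text to a file then gives one image per line, which is the shape the import expects regardless of how the examples are formatted for display.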

When you import a manifest file, Amazon Rekognition Custom Labels applies validation rules for limits, syntax, and semantics. For more information, see Validation rules for manifest files.

All images referenced by a manifest file must be located in the same Amazon S3 bucket. The manifest file itself can be located in a different Amazon S3 bucket than the one that stores the images. You specify the location of each image in the source-ref field of its JSON line.
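
The single-bucket requirement can be checked locally before you import a manifest. The following is a rough sketch, not part of any AWS tooling: it assumes each source-ref value is an s3:// URL, and the function name and sample bucket are hypothetical.

```python
import json
from urllib.parse import urlparse


def source_ref_buckets(manifest_text):
    """Return the set of bucket names used by the source-ref fields."""
    buckets = set()
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # For s3://bucket/key, urlparse places the bucket name in netloc.
        buckets.add(urlparse(record["source-ref"]).netloc)
    return buckets


# Hypothetical two-line manifest containing only the source-ref field.
sample = (
    '{"source-ref": "s3://amzn-s3-demo-bucket/images/img1.jpg"}\n'
    '{"source-ref": "s3://amzn-s3-demo-bucket/images/img2.jpg"}\n'
)

buckets = source_ref_buckets(sample)
single_bucket = len(buckets) == 1
```

If single_bucket is False, the manifest references images in more than one bucket and would need to be corrected before import.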

Amazon Rekognition needs permissions to access the Amazon S3 bucket where your images are stored. If you are using the console bucket set up for you by Amazon Rekognition Custom Labels, the required permissions are already set up. If you are not using the console bucket, see Accessing external Amazon S3 Buckets.

Creating a manifest file

The following procedure creates a project with a training and test dataset. The datasets are created from training and test manifest files that you create.

To create a dataset using a SageMaker Ground Truth format manifest file (console)
  1. In the console bucket, create a folder to hold your manifest files.

  2. In the console bucket, create a folder to hold your images.

  3. Upload your images to the folder you just created.

  4. Create a SageMaker Ground Truth format manifest file for your training dataset. For more information, see Image-Level labels in manifest files and Object localization in manifest files.

    Important

    The source-ref field value in each JSON line must map to an image that you uploaded.

  5. Create a SageMaker Ground Truth format manifest file for your test dataset.

  6. Upload your manifest files to the folder that you created in step 1.

  7. Note the location of the manifest file.

  8. Follow the instructions at Creating a dataset with a SageMaker Ground Truth manifest file (Console) to create a dataset with the uploaded manifest file. For step 8 of those instructions, in .manifest file location, enter the Amazon S3 URL for the location you noted in the previous step. If you are using the AWS SDK, follow the instructions at Creating a dataset with a SageMaker Ground Truth manifest file (SDK) instead.
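
Step 4 of the procedure requires every source-ref value to map to an image that you uploaded. One way to check this before creating the dataset is sketched below. It assumes that you already have the list of uploaded image URLs (for example, from your own upload script); the function name, bucket, and paths are hypothetical.

```python
import json


def missing_images(manifest_text, uploaded_urls):
    """Return source-ref values that are not among the uploaded image URLs."""
    uploaded = set(uploaded_urls)
    missing = []
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        source_ref = json.loads(line)["source-ref"]
        if source_ref not in uploaded:
            missing.append(source_ref)
    return missing


# Hypothetical inputs: two uploaded images and a manifest that references three.
uploaded_urls = [
    "s3://amzn-s3-demo-bucket/images/img1.jpg",
    "s3://amzn-s3-demo-bucket/images/img2.jpg",
]
manifest_text = (
    '{"source-ref": "s3://amzn-s3-demo-bucket/images/img1.jpg"}\n'
    '{"source-ref": "s3://amzn-s3-demo-bucket/images/img2.jpg"}\n'
    '{"source-ref": "s3://amzn-s3-demo-bucket/images/img3.jpg"}\n'
)

unmatched = missing_images(manifest_text, uploaded_urls)
```

An empty result means that every JSON line points at an uploaded image. Any entries in unmatched should be uploaded or removed from the manifest before you create the dataset.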