Creating a manifest file
You can create a test or training dataset by importing a SageMaker Ground Truth format manifest file. If your images are labeled in a format that isn't a SageMaker Ground Truth manifest file, use the following information to create a SageMaker Ground Truth format manifest file.
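Manifest files are in JSON Lines format: each line is a complete JSON object that describes a single image. The following SageMaker Ground Truth JSON line formats are used: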
-
Classification Job Output – Use to add image-level labels to an image. An image-level label defines the class of scene, concept, or object (if object location information isn't needed) that's on an image. An image can have more than one image-level label. For more information, see Image-Level labels in manifest files.
-
Bounding Box Job Output – Use to label the class and location of one or more objects on an image. For more information, see Object localization in manifest files.
Image-level and localization (bounding-box) JSON lines can be chained together in the same manifest file.
The JSON line examples in this section are formatted for readability.
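As a rough illustration of the two formats, the following Python sketch builds one image-level JSON line and one bounding-box JSON line and chains them in a single manifest. The field names follow the SageMaker Ground Truth classification and bounding box job output formats as they are commonly documented; treat them as assumptions and confirm them against Image-Level labels in manifest files and Object localization in manifest files. The bucket name, image paths, attribute names, and helper function names are hypothetical.

```python
import json


def image_level_line(image_uri, class_name, attribute, creation_date):
    """Build one JSON line that assigns a single image-level label to an image."""
    entry = {
        "source-ref": image_uri,
        attribute: 1,
        attribute + "-metadata": {
            "confidence": 1,
            "job-name": "labeling-job/" + attribute,
            "class-name": class_name,
            "human-annotated": "yes",
            "creation-date": creation_date,
            "type": "groundtruth/image-classification",
        },
    }
    return json.dumps(entry)


def bounding_box_line(image_uri, image_size, boxes, class_map, creation_date,
                      attribute="bounding-box"):
    """Build one JSON line that labels the class and location of objects."""
    width, height, depth = image_size
    entry = {
        "source-ref": image_uri,
        attribute: {
            "image_size": [{"width": width, "height": height, "depth": depth}],
            "annotations": boxes,
        },
        attribute + "-metadata": {
            # One confidence entry per annotation, in the same order.
            "objects": [{"confidence": 1} for _ in boxes],
            "class-map": {str(class_id): name for class_id, name in class_map.items()},
            "type": "groundtruth/object-detection",
            "human-annotated": "yes",
            "creation-date": creation_date,
            "job-name": "labeling-job/" + attribute,
        },
    }
    return json.dumps(entry)


# Hypothetical bucket and image names, used only for illustration.
manifest_lines = [
    image_level_line(
        "s3://amzn-s3-demo-bucket/images/sunrise.jpg",
        "Sunrise", "scene", "2020-03-06T17:46:39.176",
    ),
    bounding_box_line(
        "s3://amzn-s3-demo-bucket/images/desk.jpg",
        (640, 480, 3),
        [{"class_id": 0, "left": 399, "top": 251, "width": 155, "height": 101}],
        {0: "Echo"},
        "2020-03-06T17:46:39.176",
    ),
]

# In the manifest file itself, each JSON object occupies exactly one line.
manifest_text = "\n".join(manifest_lines) + "\n"
```

Writing `manifest_text` to a file produces a manifest in which each image is described by exactly one line, unlike the examples that are formatted for readability.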
When you import a manifest file, Amazon Rekognition Custom Labels applies validation rules for limits, syntax, and semantics. For more information, see Validation rules for manifest files.
The images referenced by a manifest file must be located in the same Amazon S3 bucket. The manifest file can be located in a different Amazon S3 bucket than the Amazon S3 bucket that stores the images. You specify the location of an image in the source-ref field of a JSON line.
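One way to check the single-bucket requirement before importing is to collect the bucket names from every source-ref value. The following is a minimal sketch that assumes source-ref values are s3:// URIs; the function name and the sample bucket names are hypothetical.

```python
import json
from urllib.parse import urlparse


def image_buckets(manifest_text):
    """Return the set of Amazon S3 bucket names referenced by source-ref fields."""
    buckets = set()
    for line_number, line in enumerate(manifest_text.splitlines(), start=1):
        if not line.strip():
            continue  # Skip blank lines.
        entry = json.loads(line)
        uri = urlparse(entry["source-ref"])
        if uri.scheme != "s3" or not uri.netloc:
            raise ValueError(f"line {line_number}: source-ref is not an s3:// URI")
        buckets.add(uri.netloc)
    return buckets


# Hypothetical manifest content with two images in the same bucket.
sample = (
    '{"source-ref": "s3://amzn-s3-demo-bucket/images/a.jpg"}\n'
    '{"source-ref": "s3://amzn-s3-demo-bucket/images/b.jpg"}\n'
)
buckets = image_buckets(sample)
if len(buckets) > 1:
    print("Images are spread across several buckets:", sorted(buckets))
```

If the returned set holds more than one bucket name, the manifest references images in different buckets and does not meet the requirement described above.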
Amazon Rekognition needs permissions to access the Amazon S3 bucket where your images are stored. If you are using the console bucket set up for you by Amazon Rekognition Custom Labels, the required permissions are already set up. If you are not using the console bucket, see Accessing external Amazon S3 Buckets.
Creating a dataset with a manifest file
The following procedure creates a project with a training and test dataset. The datasets are created from training and test manifest files that you create.
To create a dataset using a SageMaker Ground Truth format manifest file (console)
-
In the console bucket, create a folder to hold your manifest files.
-
In the console bucket, create a folder to hold your images.
-
Upload your images to the folder you just created.
-
Create a SageMaker Ground Truth format manifest file for your training dataset. For more information, see Image-Level labels in manifest files and Object localization in manifest files.
Important: The source-ref field value in each JSON line must map to an image that you uploaded.
-
Create a SageMaker Ground Truth format manifest file for your test dataset.
-
Upload your manifest files to the manifest folder that you created in step 1.
-
Sign in to the AWS Management Console and open the Amazon Rekognition console at https://console.aws.amazon.com/rekognition/.
-
In the left pane, choose Use Custom Labels. The Amazon Rekognition Custom Labels landing page is shown.
-
On the Amazon Rekognition Custom Labels landing page, choose Get started.
-
In the left pane, choose Projects.
-
Choose Create Project.
-
In Project name, enter a name for your project.
-
Choose Create project to create your project.
-
Choose Create dataset. The Create dataset page is shown.
-
In Starting configuration, choose Start with a training dataset and a test dataset.
-
In the Training dataset details section, choose Import images labeled by SageMaker Ground Truth.
-
In .manifest file location, enter the Amazon S3 location of the training manifest file that you uploaded in step 6.
-
In the Test dataset details section, choose Import images labeled by SageMaker Ground Truth.
-
In .manifest file location, enter the Amazon S3 location of the test manifest file that you uploaded in step 6.
-
Choose Create Datasets.
-
Train the model. For more information, see Training an Amazon Rekognition Custom Labels model.
Creating a dataset with a manifest file (SDK)
To create a dataset with a manifest file, use the CreateDataset API.
To create a dataset with a manifest file (SDK)
-
Use the following example code to create the dataset.
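A minimal Python sketch of the CreateDataset call, assuming the AWS SDK for Python (boto3) is installed and credentials are configured. The helper function names, the project ARN, the bucket name, and the manifest key are hypothetical placeholders; the request shape (ProjectArn, DatasetType, and DatasetSource with a GroundTruthManifest S3 object) follows the CreateDataset API.

```python
def build_create_dataset_request(project_arn, dataset_type, bucket, manifest_key):
    """Build the CreateDataset parameters for a manifest file stored in Amazon S3."""
    if dataset_type not in ("TRAIN", "TEST"):
        raise ValueError("dataset_type must be 'TRAIN' or 'TEST'")
    return {
        "ProjectArn": project_arn,
        "DatasetType": dataset_type,
        "DatasetSource": {
            "GroundTruthManifest": {
                "S3Object": {"Bucket": bucket, "Name": manifest_key}
            }
        },
    }


def create_dataset(project_arn, dataset_type, bucket, manifest_key, client=None):
    """Call CreateDataset and return the ARN of the new dataset."""
    if client is None:
        # Imported here so the request builder can be used without boto3.
        import boto3
        client = boto3.client("rekognition")
    request = build_create_dataset_request(
        project_arn, dataset_type, bucket, manifest_key
    )
    response = client.create_dataset(**request)
    return response["DatasetArn"]


# Hypothetical values; replace them with your own project ARN and S3 location.
example_request = build_create_dataset_request(
    "arn:aws:rekognition:us-east-1:111122223333:project/my-project/1234567890123",
    "TRAIN",
    "amzn-s3-demo-bucket",
    "manifests/train.manifest",
)
```

Call `create_dataset` once with `"TRAIN"` and once with `"TEST"` to create both datasets for a project. Dataset creation takes time to complete, so the dataset status should be checked, for example with DescribeDataset, before training the model.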