Input data

The input data are the data objects that you send to your workforce to be labeled. There are two ways to send data objects to Ground Truth for labeling:

Send a list of data objects that require labeling using an input manifest file.
Send individual data objects in real time to a perpetually running, streaming labeling job.

If you have a dataset that needs to be labeled one time, and you do not require an ongoing labeling job, create a standard labeling job using an input manifest file.

If you want to regularly send new data objects to your labeling job after it has started, create a streaming labeling job. When you create a streaming labeling job, you can optionally use an input manifest file to specify a group of data that you want labeled immediately when the job starts. You can continuously send new data objects to a streaming labeling job as long as it is active.

Note

Streaming labeling jobs are only supported through the SageMaker API. You cannot create a streaming labeling job using the SageMaker AI console.

The following task types have special input data requirements and options:

For 3D point cloud labeling job input data requirements, see 3D Point Cloud Input Data.
For video frame labeling job input data requirements, see Video Frame Input Data.

Topics

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Use input and output data

Input manifest files