Amazon Simple Storage Service
Developer Guide (API Version 2006-03-01)

The Basics: Amazon S3 Batch Operations Jobs

To create a job, you give Amazon S3 batch operations a list of objects and select the action to perform on those objects. Amazon S3 batch operations support the following operations:

The objects that you want a job to act on are listed in a manifest object. A job performs the specified operation on each object that is included in its manifest. You can use an Amazon S3 Inventory report as a manifest, which makes it easy to create large lists of objects located in a bucket. You can also specify a manifest in a simple CSV format that enables you to perform batch operations on a customized list of objects contained within a single bucket.

After you create a job, Amazon S3 processes the list of objects in the manifest and executes the specified operation against each object. While a job is executing, you can monitor its progress programmatically or through the Amazon S3 console. You can also configure a job to generate a completion report when it finishes. The completion report describes the results of each task that was executed by the job. For more information about monitoring jobs, see Managing Batch Operations Jobs.
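Programmatic monitoring can be sketched as a polling loop. The example below is a minimal sketch, not a definitive implementation: it assumes the job record has the shape returned by the S3 Control DescribeJob operation (a `Job` entry with a `Status` and a `ProgressSummary`). The function name `wait_for_job` and the injected `describe` callable are hypothetical and introduced only for illustration.

```python
import time
from typing import Callable, Dict

# Assumed terminal job states; after one of these the status no longer changes.
TERMINAL_STATES = {"Complete", "Failed", "Cancelled"}


def wait_for_job(describe: Callable[[], Dict],
                 poll_seconds: float = 30.0,
                 max_polls: int = 120) -> Dict:
    """Call `describe` repeatedly until the job reaches a terminal state.

    `describe` is any zero-argument callable that returns a DescribeJob-style
    response. Returns the final job record, or raises TimeoutError.
    """
    for _ in range(max_polls):
        job = describe()["Job"]
        progress = job.get("ProgressSummary", {})
        print(
            f"{job['Status']}: "
            f"{progress.get('NumberOfTasksSucceeded', 0)} succeeded, "
            f"{progress.get('NumberOfTasksFailed', 0)} failed, "
            f"{progress.get('TotalNumberOfTasks', 0)} total"
        )
        if job["Status"] in TERMINAL_STATES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal state")
```

With boto3, `describe` might be `lambda: client.describe_job(AccountId=account_id, JobId=job_id)`, where `client` is an `s3control` client and `account_id` and `job_id` are your own values. Passing the call in as a parameter keeps the loop testable without AWS credentials.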

Specifying a Manifest

A manifest is an Amazon S3 object that lists the object keys that you want Amazon S3 to act on. To specify a manifest for a job, you provide the manifest object's key and ETag and, optionally, its version ID. You can specify a manifest in a create job request using one of the following two formats.

  • Amazon S3 inventory report — must be a CSV-formatted Amazon S3 inventory report. You must specify the manifest.json file that is associated with the inventory report. For more information about inventory reports, see Amazon S3 Inventory. If the inventory report includes version IDs, Amazon S3 batch operations operate on the specific object versions.

  • CSV file — Each row in the file must include the bucket name, the object key, and, optionally, the object version ID. Specify version IDs either for every object in the manifest or for none of them. For more information about the CSV manifest format, see JobManifestSpec in the Amazon Simple Storage Service API Reference.

    The following is an example:

    Examplebucket,objectkey1,PZ9ibn9D5lP6p298B7S9_ceqx1n5EJ0p
    Examplebucket,objectkey2,PZ9ibn9D5lP6p298B7S9_ceqx1n5EJ0p
    Examplebucket,objectkey3,jbo9_jhdPEyB4RrmOxWS0kU0EoNrU_oI
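The CSV format and the manifest reference described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions. The helper names `build_csv_manifest` and `manifest_descriptor` are hypothetical. The descriptor layout assumes the `Manifest` structure accepted by the S3 Control CreateJob operation (see JobManifestSpec). The sketch does not handle escaping of keys that contain commas or other special characters, so check the API reference before using such keys.

```python
import csv
import io
from typing import Dict, Iterable, Optional, Tuple

Row = Tuple[str, str, Optional[str]]  # (bucket, key, version ID or None)


def build_csv_manifest(rows: Iterable[Row]) -> str:
    """Return manifest text with one 'bucket,key[,version]' row per object."""
    rows = list(rows)
    versioned = [r for r in rows if r[2]]
    # Version IDs must be given for every row or for none of them.
    if versioned and len(versioned) != len(rows):
        raise ValueError("specify a version ID for every object or for none")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for bucket, key, version_id in rows:
        record = [bucket, key]
        if version_id:
            record.append(version_id)
        writer.writerow(record)
    return out.getvalue()


def manifest_descriptor(bucket: str, manifest_key: str, etag: str,
                        has_version_ids: bool,
                        manifest_version_id: Optional[str] = None) -> Dict:
    """Describe an uploaded CSV manifest by its key, ETag, and optional version ID."""
    fields = ["Bucket", "Key"] + (["VersionId"] if has_version_ids else [])
    location = {
        "ObjectArn": f"arn:aws:s3:::{bucket}/{manifest_key}",
        "ETag": etag,
    }
    if manifest_version_id:
        location["ObjectVersionId"] = manifest_version_id
    return {
        # Assumed format identifier for CSV manifests.
        "Spec": {"Format": "S3BatchOperations_CSV_20180820", "Fields": fields},
        "Location": location,
    }
```

You would upload the returned manifest text as an object, read back its ETag, and pass the descriptor in the create job request. Rejecting mixed rows up front mirrors the all-or-none rule for version IDs.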

Important

If the objects in your manifest are in a versioned bucket, you should specify the version IDs for the objects. When you create a job, Amazon S3 batch operations parses the entire manifest before running the job. However, it doesn't take a "snapshot" of the state of the bucket.

Because manifests can contain billions of objects, jobs might take a long time to run. If you overwrite an object with a new version while a job is running, and you didn't specify a version ID for that object, Amazon S3 performs the operation on the latest version of the object, and not the version that existed when you created the job. The only way to avoid this behavior is to specify version IDs for the objects that are listed in the manifest.
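The difference between pinned and unpinned manifest entries can be illustrated with a toy in-memory model. This sketch does not call Amazon S3. The `VersionedBucket` class is hypothetical and only mimics the one behavior described above: a request without a version ID resolves to whatever version is latest at the moment the operation runs.

```python
from typing import Dict, List, Optional


class VersionedBucket:
    """Toy stand-in for a versioned bucket: each key maps to its version history."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def put(self, key: str, version_id: str) -> None:
        # Every write appends a new version; older versions are retained.
        self._versions.setdefault(key, []).append(version_id)

    def resolve(self, key: str, version_id: Optional[str] = None) -> str:
        """Return the version an operation would act on when it executes."""
        versions = self._versions[key]
        if version_id is None:
            return versions[-1]  # no version ID: latest version at execution time
        if version_id not in versions:
            raise KeyError(version_id)
        return version_id


bucket = VersionedBucket()
bucket.put("objectkey1", "v1")

# Two manifest entries created while v1 is the latest version.
unpinned = ("objectkey1", None)
pinned = ("objectkey1", "v1")

# The object is overwritten while the job is still running.
bucket.put("objectkey1", "v2")

print(bucket.resolve(*unpinned))  # v2: acts on the newer version
print(bucket.resolve(*pinned))    # v1: acts on the version listed in the manifest
```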
