Replicating existing objects with S3 Batch Replication - Amazon Simple Storage Service

Replicating existing objects with S3 Batch Replication

S3 Batch Replication provides a way to replicate objects that existed before a replication configuration was in place, objects that have previously been replicated, and objects that have failed replication. It does this by using a Batch Operations job. This differs from live replication, which continuously and automatically replicates new objects across Amazon S3 buckets. To get started with Batch Replication, you can:

  • Initiate Batch Replication for a new replication rule or destination – You can create a one-time Batch Replication job through the AWS Management Console when you create the first rule in a new replication configuration or when you add a new destination to an existing configuration.

  • Initiate Batch Replication for an existing replication configuration – You can create a new Batch Replication job using S3 Batch Operations through the AWS SDKs, AWS Command Line Interface (AWS CLI), or the Amazon S3 console.

When the Batch Replication job finishes, you receive a completion report. For more information about how to use the report to examine the job, see Tracking job status and completion reports.

S3 Batch Replication considerations

  • Your source bucket must have an existing replication configuration. To enable replication, see Setting up replication and Walkthroughs: Examples for configuring replication.

  • If you have S3 Lifecycle configured for your bucket, we recommend disabling your Lifecycle rules while the Batch Replication job is active. Doing so helps ensure parity between the source and destination buckets. Otherwise, the buckets could diverge, and the destination bucket would not be an exact replica of the source bucket. Consider the following scenario:

    • Your source bucket has multiple versions of an object and a delete marker.

    • Your source and destination buckets have a Lifecycle policy to remove expired delete markers.

    In this scenario, Batch Replication might replicate the delete marker to the destination bucket before it replicates the object versions. The delete marker could then be treated as expired and removed from the destination bucket before the object versions are copied.

  • The AWS Identity and Access Management (IAM) role that you specify to run the Batch Operations job must have permissions to perform the underlying Batch Replication operation. For more information about creating IAM roles, see Configuring IAM policies for Batch Replication.

  • Batch Replication requires a manifest, which Amazon S3 can generate for you. A generated manifest must be stored in the same AWS Region as the source bucket. If you choose not to generate the manifest, you can supply an Amazon S3 Inventory report or a CSV file that lists the objects that you want to replicate.

  • Batch Replication does not support re-replicating objects that were deleted from the destination bucket by specifying the version ID of the object. To re-replicate these objects, you can copy the source objects in place with a Batch Copy job. Copying those objects in place creates new versions of the objects in the source bucket and automatically initiates replication to the destination. Deleting and recreating the destination bucket does not initiate replication.

    For more information about Batch Copy, see Examples that use Batch Operations to copy objects.

Specifying a manifest for a Batch Replication job

A manifest is an Amazon S3 object that contains the object keys that you want Amazon S3 to act upon. To create a Batch Replication job, you must either supply a user-generated manifest or have Amazon S3 generate a manifest based on your replication configuration.

If you supply a user-generated manifest, it must be in the form of an Amazon S3 Inventory report or a CSV file. If the objects in your manifest are in a versioned bucket, you must specify the version IDs for the objects. Only the object version with the version ID specified in the manifest is replicated. To learn more about specifying a manifest, see Specifying a manifest.
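As an illustration of a user-generated CSV manifest for a versioned bucket, the following minimal Python sketch writes one `bucket,key,versionId` row per object version. The function name `build_csv_manifest`, the bucket name, the keys, and the version IDs are all hypothetical. The sketch also assumes that object keys in a CSV manifest are URL-encoded; confirm the exact format in Specifying a manifest before relying on it.

```python
import csv
import io
from urllib.parse import quote


def build_csv_manifest(bucket, objects):
    """Return CSV manifest text with one 'bucket,key,versionId' row per object.

    `objects` is an iterable of (key, version_id) pairs. Every entry must
    carry a version ID, because only the listed version is replicated.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for key, version_id in objects:
        if not version_id:
            raise ValueError(f"missing version ID for key {key!r}")
        # Assumption: keys are URL-encoded in CSV manifests ("/" is kept as-is).
        writer.writerow([bucket, quote(key, safe="/"), version_id])
    return buf.getvalue()


# Hypothetical bucket, keys, and version IDs, for illustration only.
manifest_text = build_csv_manifest(
    "amzn-s3-demo-source-bucket",
    [
        ("photos/2023/cat 1.jpg", "exampleVersionId1"),
        ("logs/app.log", "exampleVersionId2"),
    ],
)
print(manifest_text)
```

The resulting text would then be uploaded as an Amazon S3 object and referenced as the job's manifest.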

If you choose to have Amazon S3 generate a manifest file on your behalf, the objects listed use the same source bucket, prefix, and tags as your replication configuration. With a generated manifest, Amazon S3 replicates all eligible versions of your objects.

Note

If you choose to have Amazon S3 generate the manifest, the manifest must be stored in the same AWS Region as the source bucket.
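A job that uses a generated manifest might be set up as in the following sketch. It assumes the request shape of the `create_job` operation on the boto3 `s3control` client (`Operation={"S3ReplicateObject": {}}` together with a `ManifestGenerator`). The helper names, account ID, ARNs, prefixes, and Region values are hypothetical, and the Region check is a local guard that the caller feeds with known Region names; it is not an API feature.

```python
def build_manifest_generator(account_id, source_bucket_arn, manifest_bucket_arn,
                             manifest_prefix, source_region, manifest_region):
    """Build the ManifestGenerator argument for a Batch Replication job."""
    # Local guard: a generated manifest must be stored in the same AWS Region
    # as the source bucket.
    if source_region != manifest_region:
        raise ValueError("manifest bucket must be in the source bucket's Region")
    return {
        "S3JobManifestGenerator": {
            "ExpectedBucketOwner": account_id,
            "SourceBucket": source_bucket_arn,
            "EnableManifestOutput": True,
            "ManifestOutputLocation": {
                "ExpectedManifestBucketOwner": account_id,
                "Bucket": manifest_bucket_arn,
                "ManifestPrefix": manifest_prefix,
                "ManifestFormat": "S3InventoryReport_CSV_20211130",
            },
            # Restrict the manifest to objects that the replication
            # configuration considers eligible.
            "Filter": {"EligibleForReplication": True},
        }
    }


def create_batch_replication_job(account_id, role_arn, manifest_generator,
                                 report_bucket_arn):
    """Submit the job. Requires boto3 and AWS credentials; not run on import."""
    import boto3  # imported lazily so the builder above works without boto3

    s3control = boto3.client("s3control")
    response = s3control.create_job(
        AccountId=account_id,
        ConfirmationRequired=False,
        Operation={"S3ReplicateObject": {}},
        Priority=1,
        RoleArn=role_arn,
        ManifestGenerator=manifest_generator,
        Report={
            "Enabled": True,
            "Bucket": report_bucket_arn,
            "Format": "Report_CSV_20180820",
            "Prefix": "batch-replication-reports",  # hypothetical prefix
            "ReportScope": "AllTasks",
        },
    )
    return response["JobId"]


# Hypothetical account ID and bucket ARNs, for illustration only.
generator = build_manifest_generator(
    "111122223333",
    "arn:aws:s3:::amzn-s3-demo-source-bucket",
    "arn:aws:s3:::amzn-s3-demo-manifest-bucket",
    "batch-replication-manifests",
    source_region="us-east-1",
    manifest_region="us-east-1",
)
```

Keeping the request construction separate from the API call makes it possible to inspect or validate the manifest settings before a job is submitted.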

Filters for a Batch Replication job

When you create your Batch Replication job, you can optionally specify additional filters, such as object creation date and replication status, to reduce the scope of the job.

You can filter objects to replicate based on the ObjectReplicationStatuses value, by providing one or more of the following values:

  • "NONE" – Indicates that Amazon S3 has never attempted to replicate the object before.

  • "FAILED" – Indicates that Amazon S3 has previously attempted to replicate the object, but the attempt failed.

  • "COMPLETED" – Indicates that Amazon S3 has successfully replicated the object before.

  • "REPLICA" – Indicates that this is a replica object that Amazon S3 replicated from another source.

For more information about replication statuses, see Getting replication status information.

If you do not filter based on replication status, Batch Operations attempts to replicate every eligible object. Depending on your goal, you might set ObjectReplicationStatuses to one or more of the following values:

  • If you want to replicate only existing objects that have never been replicated, only include "NONE".

  • If you want to retry replicating only objects that previously failed to replicate, only include "FAILED".

  • If you want to both replicate existing objects and retry replicating objects that previously failed to replicate, include both "NONE" and "FAILED".

  • If you want to back-fill a destination bucket with objects that have been replicated to another destination, include "COMPLETED".

  • If you want to replicate replica objects (objects that were replicated to this bucket from another source), include "REPLICA".
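The goals above can be expressed as a small lookup table that produces the filter for a generated manifest. This is a sketch only: the goal names and the `build_filter` helper are hypothetical, and the `CreatedAfter` and `CreatedBefore` keys assume the filter shape used by the boto3 `s3control` `create_job` operation.

```python
from datetime import datetime, timezone

VALID_STATUSES = {"NONE", "FAILED", "COMPLETED", "REPLICA"}

# Hypothetical goal names mapped to the statuses described in the list above.
GOAL_TO_STATUSES = {
    "existing_only": ["NONE"],
    "retry_failed": ["FAILED"],
    "existing_and_failed": ["NONE", "FAILED"],
    "backfill_new_destination": ["COMPLETED"],
    "replicate_replicas": ["REPLICA"],
}


def build_filter(statuses=None, created_after=None, created_before=None):
    """Build a job filter; omitting `statuses` leaves every eligible object in scope."""
    job_filter = {"EligibleForReplication": True}
    if statuses:
        unknown = set(statuses) - VALID_STATUSES
        if unknown:
            raise ValueError(f"unknown replication statuses: {sorted(unknown)}")
        job_filter["ObjectReplicationStatuses"] = list(statuses)
    if created_after is not None:
        job_filter["CreatedAfter"] = created_after
    if created_before is not None:
        job_filter["CreatedBefore"] = created_before
    return job_filter


# Example: replicate never-replicated and previously failed objects
# created before an arbitrary, illustrative cutoff date.
example_filter = build_filter(
    GOAL_TO_STATUSES["existing_and_failed"],
    created_before=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(example_filter)
```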

Batch Replication completion report

When you create a Batch Replication job, you can request a CSV completion report. This report shows objects, replication success or failure codes, outputs, and descriptions. For more information about job tracking and completion reports, see Completion reports.

For a list of replication failure codes and descriptions, see Amazon S3 replication failure reasons.
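Once a completion report has been downloaded, it can be summarized with a few lines of Python. The sketch below is illustrative: the column names and their order in `REPORT_COLUMNS` are an assumption about the report layout (a headerless CSV) and should be checked against an actual report, and the sample rows, failure code, and messages are made up for the example.

```python
import csv
import io
from collections import Counter

# Assumed column order for a headerless report; verify against a real report.
REPORT_COLUMNS = [
    "Bucket", "Key", "VersionId", "TaskStatus",
    "HTTPStatusCode", "ErrorCode", "ResultMessage",
]


def summarize_report(csv_text, columns=REPORT_COLUMNS):
    """Return (counts per task status, list of rows that did not succeed)."""
    rows = list(csv.DictReader(io.StringIO(csv_text), fieldnames=columns))
    counts = Counter(row["TaskStatus"] for row in rows)
    failures = [
        (row["Key"], row["VersionId"], row["ErrorCode"], row["ResultMessage"])
        for row in rows
        if row["TaskStatus"] != "succeeded"
    ]
    return counts, failures


# Made-up sample rows, for illustration only.
sample_report = (
    "amzn-s3-demo-source-bucket,photos/a.jpg,exampleVersionId1,succeeded,200,,Successful\n"
    "amzn-s3-demo-source-bucket,photos/b.jpg,exampleVersionId2,failed,400,"
    "ExampleFailureCode,Example failure description\n"
)

counts, failures = summarize_report(sample_report)
print(dict(counts))
for key, version_id, code, message in failures:
    print(f"{key} ({version_id}): {code} - {message}")
```

The keys and version IDs of failed rows could then feed a follow-up job, for example one filtered on the "FAILED" replication status.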