Tutorial: Transferring data from Amazon S3 to Amazon S3 across AWS accounts - AWS DataSync

With AWS DataSync, you can move data between Amazon S3 buckets that belong to different AWS accounts.

Important

Copying data across AWS accounts using the methods in this tutorial works only with Amazon S3. Additionally, this tutorial can help you transfer data between S3 buckets that are also in different AWS Regions (unless you're working with one or more opt-in Regions).

Overview

You might need to transfer data between different AWS accounts, especially if separate teams manage your organization's resources. Here's what a cross-account transfer using DataSync can look like:

  • Source account: The AWS account for managing the S3 bucket that you need to transfer data from.

  • Destination account: The AWS account for managing the S3 bucket that you need to transfer data to.

Transfers across accounts

The following diagram illustrates a scenario where you transfer data from an S3 bucket to another S3 bucket that's in a different AWS account.


[Diagram: An example DataSync scenario of data moving from an S3 bucket in one AWS account (your source account) to an S3 bucket in a different AWS account (your destination account).]

Transfers across accounts and Regions

The following diagram illustrates a scenario where you transfer data from an S3 bucket to another S3 bucket that's in a different AWS account and Region.


[Diagram: An example DataSync scenario of data moving from an S3 bucket in one AWS account (your source account) and Region to an S3 bucket in a different AWS account (your destination account) and Region.]

Required permissions

Before you begin, make sure that your source and destination AWS accounts have the right permissions to complete a cross-account transfer between S3 buckets.

Required permissions for your source account

For your source AWS account, there are two sets of permissions to consider for this kind of cross-account transfer. One set of permissions is for the user who works with DataSync to create and start the transfer task (for example, your storage administrator). The other set of permissions allows the DataSync service to transfer objects to the S3 bucket in your destination account on your behalf.

User permissions

At minimum, you need the following permissions in your source account to use DataSync while going through this tutorial:

  • datasync:CancelTaskExecution

  • datasync:CreateLocationS3

  • datasync:CreateTask

  • datasync:DescribeLocation*

  • datasync:DescribeTask

  • datasync:DescribeTaskExecution

  • datasync:ListLocations

  • datasync:ListTasks

  • datasync:ListTaskExecutions

  • datasync:StartTaskExecution

  • iam:AttachRolePolicy

  • iam:CreateRole

  • iam:CreatePolicy

  • iam:ListRoles

  • iam:PassRole

  • s3:GetBucketLocation

  • s3:ListAllMyBuckets

  • s3:ListBucket

Tip

For user permissions, consider using AWSDataSyncFullAccess, an AWS managed policy that provides full access to DataSync and minimal access to its dependencies. This managed policy also provides transfer task logging by default.
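
As a rough sketch, the minimum permissions listed above could be collected into an IAM policy document like the one the following Python snippet builds. The function name, the single-statement layout, and the `Resource` value of `*` are assumptions made for brevity, not something this tutorial prescribes. In practice you would likely scope `iam:PassRole` and the S3 actions more narrowly.

```python
import json

# Minimum user actions that this tutorial lists for the source account.
MIN_USER_ACTIONS = [
    "datasync:CancelTaskExecution",
    "datasync:CreateLocationS3",
    "datasync:CreateTask",
    "datasync:DescribeLocation*",
    "datasync:DescribeTask",
    "datasync:DescribeTaskExecution",
    "datasync:ListLocations",
    "datasync:ListTasks",
    "datasync:ListTaskExecutions",
    "datasync:StartTaskExecution",
    "iam:AttachRolePolicy",
    "iam:CreateRole",
    "iam:CreatePolicy",
    "iam:ListRoles",
    "iam:PassRole",
    "s3:GetBucketLocation",
    "s3:ListAllMyBuckets",
    "s3:ListBucket",
]


def build_user_policy(actions=MIN_USER_ACTIONS):
    """Return an identity-based policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: "*" keeps the sketch short; narrow it for real use.
                "Resource": "*",
            }
        ],
    }


user_policy = build_user_policy()
print(json.dumps(user_policy, indent=2))
```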

DataSync permissions

DataSync needs permission to write data to the S3 bucket in your destination account on your behalf. In your source account, you'll create an AWS Identity and Access Management (IAM) role that can do this. You'll then specify this role when creating your DataSync destination location.

Required permissions for your destination account

For your destination AWS account, you need permission to disable your S3 bucket's access control lists (ACLs) and update the bucket's policy. For more information on these specific permissions, see the Amazon S3 User Guide.

Step 1: In your source account, create a DataSync source location

In your source account, create a DataSync location for the S3 bucket that you're transferring data from.

If you're creating the location by using the DataSync console, you can let DataSync automatically create and assume the IAM role needed to access your source S3 bucket.

Step 2: In your source account, create an IAM role for DataSync

In your source account, you need an IAM role that gives DataSync permission to write to the S3 bucket in your destination account on your behalf.

Normally, when you create a transfer location for an S3 bucket in the DataSync console, DataSync can automatically create and assume a role that has the right permissions to write to that bucket. Since you're transferring across accounts, however, you must create the role manually.

Create the IAM role

Create an IAM role with DataSync as the trusted entity.

To create the IAM role
  1. Log in to the AWS Management Console with your source account.

  2. Open the IAM console at https://console.aws.amazon.com/iam/.

  3. In the left navigation pane, under Access management, choose Roles, and then choose Create role.

  4. On the Select trusted entity page, for Trusted entity type, choose AWS service.

  5. For Use case, choose DataSync in the dropdown list and select DataSync. Choose Next.

  6. On the Add permissions page, choose Next.

  7. Give your role a name and choose Create role.

For more information, see Creating a role for an AWS service (console) in the IAM User Guide.
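
Choosing DataSync as the trusted entity means that the role's trust policy names the DataSync service principal. The following Python sketch builds such a trust policy for readers who create the role outside the console. The function name is hypothetical, and the policy that the console generates for you might include additional elements, such as conditions.

```python
import json


def build_datasync_trust_policy():
    """Return a trust policy that lets the DataSync service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# JSON string form, which is how a trust policy is supplied when creating
# a role programmatically.
trust_policy_json = json.dumps(build_datasync_trust_policy(), indent=2)
print(trust_policy_json)
```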

Attach a custom policy to the IAM role

The IAM role that you just created needs a policy that allows DataSync to write to the S3 bucket in your destination account.

To attach a custom policy to your IAM role
  1. On the Roles page of the IAM console, search for the role that you just created and choose its name.

  2. On the role's details page, choose the Permissions tab. Choose Add permissions, and then choose Create inline policy.

  3. Choose the JSON tab and do the following:

    1. Paste the following JSON into the policy editor:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Action": [
              "s3:GetBucketLocation",
              "s3:ListBucket",
              "s3:ListBucketMultipartUploads"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::destination-bucket"
          },
          {
            "Action": [
              "s3:AbortMultipartUpload",
              "s3:DeleteObject",
              "s3:GetObject",
              "s3:ListMultipartUploadParts",
              "s3:PutObject",
              "s3:GetObjectTagging",
              "s3:PutObjectTagging"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::destination-bucket/*"
          }
        ]
      }

    2. Replace each instance of destination-bucket with the name of the S3 bucket in your destination account.

  4. Choose Next. Give your policy a name and choose Create policy.
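
If you would rather generate the policy than edit the placeholder by hand, the following Python sketch fills in the bucket name for you. The function name is hypothetical. The actions are the ones in the policy from this procedure, split between the bucket ARN and the object ARN pattern.

```python
import json

# Actions that apply to the bucket itself.
BUCKET_ACTIONS = [
    "s3:GetBucketLocation",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
]

# Actions that apply to objects in the bucket.
OBJECT_ACTIONS = [
    "s3:AbortMultipartUpload",
    "s3:DeleteObject",
    "s3:GetObject",
    "s3:ListMultipartUploadParts",
    "s3:PutObject",
    "s3:GetObjectTagging",
    "s3:PutObjectTagging",
]


def build_role_policy(bucket_name):
    """Return the DataSync role policy for the given destination bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Action": BUCKET_ACTIONS, "Effect": "Allow", "Resource": bucket_arn},
            {"Action": OBJECT_ACTIONS, "Effect": "Allow", "Resource": f"{bucket_arn}/*"},
        ],
    }


# "destination-bucket" is the tutorial's placeholder; use your bucket's name.
role_policy = build_role_policy("destination-bucket")
print(json.dumps(role_policy, indent=2))
```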

Step 3: In your destination account, disable ACLs for your S3 bucket

It's important that all the data that you transfer to the S3 bucket belongs to your destination account. To ensure that this account owns the data, disable the bucket's access control lists (ACLs).

To disable ACLs for an S3 bucket
  1. In the AWS Management Console, switch over to your destination account.

  2. Open the Amazon S3 console at https://console.aws.amazon.com/s3/.

  3. In the left navigation pane, choose Buckets.

  4. In the Buckets list, choose the S3 bucket that you're transferring data to.

  5. On the bucket's detail page, choose the Permissions tab.

  6. Under Object Ownership, choose Edit.

  7. If it isn't already selected, choose the ACLs disabled (recommended) option.

  8. Choose Save changes.

For more information, see Controlling ownership of objects and disabling ACLs for your bucket in the Amazon S3 User Guide.
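
Outside the console, the ACLs disabled option corresponds to the `BucketOwnerEnforced` setting for Object Ownership. The following Python sketch builds the request parameters for the Amazon S3 `PutBucketOwnershipControls` operation. The function name is hypothetical, and the snippet only constructs the parameters. Sending them (for example, with an SDK client authenticated to your destination account) is left out and assumed.

```python
def build_ownership_controls_request(bucket_name):
    """Return parameters for S3 PutBucketOwnershipControls that disable ACLs."""
    return {
        "Bucket": bucket_name,
        "OwnershipControls": {
            # BucketOwnerEnforced disables ACLs, so the bucket owner
            # owns every object written to the bucket.
            "Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]
        },
    }


# "destination-bucket" is the tutorial's placeholder; use your bucket's name.
ownership_request = build_ownership_controls_request("destination-bucket")
print(ownership_request)

# Assumption: with boto3 installed and destination-account credentials,
# the request could be sent like this:
#   boto3.client("s3").put_bucket_ownership_controls(**ownership_request)
```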

Step 4: In your destination account, update your S3 bucket policy

In your destination account, modify the destination S3 bucket policy to include the DataSync IAM role that you created in your source account.

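The updated bucket policy (provided in the following instructions) includes two principals:

  • The first principal is the DataSync IAM role that you created in your source account (source-datasync-role in the policy). It's allowed the same bucket and object actions that you granted in the role's own policy.

  • The second principal is the IAM role that includes the required user permissions to use DataSync (source-user-role in the policy). It's allowed s3:ListBucket on the destination bucket.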

To update the destination S3 bucket policy
  1. While still logged in to the S3 console with your destination account, choose the S3 bucket that you're transferring data to.

  2. On the bucket's detail page, choose the Permissions tab.

  3. Under Bucket policy, choose Edit and do the following to modify your S3 bucket policy:

    1. Update what's in the editor to include the following policy statements:

      {
        "Version": "2008-10-17",
        "Statement": [
          {
            "Sid": "DataSyncCreateS3LocationAndTaskAccess",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::source-account:role/source-datasync-role"
            },
            "Action": [
              "s3:GetBucketLocation",
              "s3:ListBucket",
              "s3:ListBucketMultipartUploads",
              "s3:AbortMultipartUpload",
              "s3:DeleteObject",
              "s3:GetObject",
              "s3:ListMultipartUploadParts",
              "s3:PutObject",
              "s3:GetObjectTagging",
              "s3:PutObjectTagging"
            ],
            "Resource": [
              "arn:aws:s3:::destination-bucket",
              "arn:aws:s3:::destination-bucket/*"
            ]
          },
          {
            "Sid": "DataSyncCreateS3Location",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::source-account:role/source-user-role"
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::destination-bucket"
          }
        ]
      }

    2. Replace each instance of source-account with the AWS account ID for your source account.

    3. Replace source-datasync-role with the IAM role that you created for DataSync in your source account.

    4. Replace each instance of destination-bucket with the name of the S3 bucket in your destination account.

    5. Replace source-user-role with the IAM role that includes the required user permissions to use DataSync.

  4. Choose Save changes.
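
Because this procedure involves four separate placeholder substitutions, it can be less error-prone to generate the bucket policy. The following Python sketch does that. The function name and the example account ID are hypothetical, and the statements mirror the policy shown in this step.

```python
import json

DATASYNC_ROLE_ACTIONS = [
    "s3:GetBucketLocation",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
    "s3:AbortMultipartUpload",
    "s3:DeleteObject",
    "s3:GetObject",
    "s3:ListMultipartUploadParts",
    "s3:PutObject",
    "s3:GetObjectTagging",
    "s3:PutObjectTagging",
]


def build_bucket_policy(source_account_id, datasync_role, user_role, bucket_name):
    """Return the destination bucket policy with both source-account principals."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    role_prefix = f"arn:aws:iam::{source_account_id}:role/"
    return {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Sid": "DataSyncCreateS3LocationAndTaskAccess",
                "Effect": "Allow",
                "Principal": {"AWS": role_prefix + datasync_role},
                "Action": DATASYNC_ROLE_ACTIONS,
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
            {
                "Sid": "DataSyncCreateS3Location",
                "Effect": "Allow",
                "Principal": {"AWS": role_prefix + user_role},
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            },
        ],
    }


# 123456789012 is a made-up account ID; the other values are the
# tutorial's placeholders.
bucket_policy = build_bucket_policy(
    "123456789012", "source-datasync-role", "source-user-role", "destination-bucket"
)
print(json.dumps(bucket_policy, indent=2))
```

If the bucket already has a policy, merge these two statements into its existing `Statement` list rather than overwriting it.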

Step 5: In your source account, create a DataSync destination location

In your source account, you need to create a DataSync location for the S3 bucket in your destination account.

The DataSync console won't let you create locations for storage resources in another AWS account. However, you can do this by using AWS CloudShell, a browser-based, pre-authenticated shell that you launch directly from the console. CloudShell allows you to run the AWS CLI commands for completing this tutorial without downloading or installing command line tools.

Note

If you want to complete the following steps by using a command line tool other than CloudShell, make sure your AWS CLI profile uses the same source-user-role that you specified in the destination S3 bucket policy. For more information, see the AWS Command Line Interface User Guide.

To create a DataSync destination location by using CloudShell
  1. In the AWS Management Console, switch back to your source account.

  2. Open the AWS DataSync console at https://console.aws.amazon.com/datasync/.

  3. Do one of the following to launch CloudShell:

    • Choose the CloudShell icon on the console navigation bar. It's located to the right of the search box.

    • Use the search box on the console navigation bar to search for CloudShell and then choose the CloudShell option.

  4. Copy the following command:

    aws datasync create-location-s3 \
      --s3-bucket-arn arn:aws:s3:::destination-bucket \
      --s3-config '{
        "BucketAccessRoleArn":"arn:aws:iam::source-user-account:role/source-datasync-role"
      }'

  5. Replace destination-bucket with the name of the S3 bucket in your destination account.

  6. Replace source-user-account with the AWS account ID for your source account.

  7. Replace source-datasync-role with the DataSync IAM role that you created in your source account.

  8. If your destination bucket is in a different Region than your source bucket, add the --region option to the command to specify the Region where the destination bucket resides. For example, --region us-east-2.

  9. Run the command in CloudShell.

    If the command returns a DataSync location ARN similar to this, you successfully created the location:

    { "LocationArn": "arn:aws:datasync:us-east-2:123456789012:location/loc-abcdef01234567890" }
  10. In the left navigation pane, expand Data transfer, then choose Locations.

  11. If you created the location in a different Region, choose that Region in the navigation pane.

From your source account, you can now see the location that you just created for the S3 bucket in your destination account.
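
The placeholder replacements and the optional --region flag from this procedure can also be assembled programmatically. The following Python sketch builds the argument list for the same `aws datasync create-location-s3` command without running it. The function name and the example account ID are hypothetical.

```python
import json
import shlex


def build_create_location_command(bucket_name, source_account_id, role_name, region=None):
    """Return the AWS CLI argument list for creating the destination location."""
    s3_config = {
        "BucketAccessRoleArn": f"arn:aws:iam::{source_account_id}:role/{role_name}"
    }
    args = [
        "aws", "datasync", "create-location-s3",
        "--s3-bucket-arn", f"arn:aws:s3:::{bucket_name}",
        "--s3-config", json.dumps(s3_config),
    ]
    # Only needed when the destination bucket is in a different Region
    # than the source bucket.
    if region:
        args += ["--region", region]
    return args


# 123456789012 is a made-up account ID; the other values are the
# tutorial's placeholders.
command = build_create_location_command(
    "destination-bucket", "123456789012", "source-datasync-role", region="us-east-2"
)
print(shlex.join(command))
```

The printed line is shell-quoted, so you can paste it into CloudShell, or you can pass the list to `subprocess.run` in an environment where the AWS CLI is configured with the source-user-role.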

Step 6: In your source account, create and start your DataSync transfer task

Before you move your data, let's recap what you've done so far:

  • In your source account, you created an IAM role that allows DataSync to write data to the S3 bucket in your destination account.

  • In your destination account, you configured your S3 bucket so that DataSync can access the bucket and write data to it.

  • In your source account, you created the DataSync source and destination locations for your transfer.

To create and start the DataSync transfer task
  1. While still using the DataSync console in your source account, expand Data transfer in the left navigation pane, then choose Tasks and Create task.

  2. If the bucket in your destination account is in a different Region than the bucket in your source account, choose the destination bucket's Region in the top navigation pane.

    Important

    To avoid a network connection error, you must create your DataSync task in the same Region as the destination location.

  3. On the Configure source location page, do the following:

    1. Select Choose an existing location.

    2. (For transfers across Regions) In the Region dropdown, choose the Region where the source bucket resides.

    3. For Existing locations, choose the source location for the S3 bucket that you're transferring data from, then choose Next.

  4. On the Configure destination location page, do the following:

    1. Select Choose an existing location.

    2. For Existing locations, choose the destination location for the S3 bucket that you're transferring data to, then choose Next.

  5. On the Configure settings page, give the task a name. As needed, configure additional settings, such as specifying an Amazon CloudWatch log group. Choose Next.

  6. On the Review page, review your settings and choose Create task.

  7. On the task's details page, choose Start, and then choose one of the following:

    • To run the task without modification, choose Start with defaults.

    • To modify the task before running it, choose Start with overriding options.

When your task finishes, check the S3 bucket in your destination account. You should see the data that moved from your source account bucket.
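
If you script this step instead of using the console, the Region rule above still applies: the task should be created in the destination location's Region. The following Python sketch reads that Region from the destination location ARN and builds the arguments for `aws datasync create-task`, without running the command. The function names, the task name, and the source location ARN are hypothetical.

```python
import shlex


def region_from_arn(arn):
    """Return the Region field of an ARN (arn:partition:service:region:account:resource)."""
    parts = arn.split(":")
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    return parts[3]


def build_create_task_command(source_location_arn, destination_location_arn, name):
    """Return AWS CLI arguments that create the task in the destination's Region."""
    return [
        "aws", "datasync", "create-task",
        "--source-location-arn", source_location_arn,
        "--destination-location-arn", destination_location_arn,
        "--name", name,
        # Create the task in the same Region as the destination location.
        "--region", region_from_arn(destination_location_arn),
    ]


# Hypothetical source location ARN; the destination ARN is the example
# output shown earlier in this tutorial.
task_command = build_create_task_command(
    "arn:aws:datasync:us-east-1:123456789012:location/loc-11111111111111111",
    "arn:aws:datasync:us-east-2:123456789012:location/loc-abcdef01234567890",
    "cross-account-s3-transfer",
)
print(shlex.join(task_command))
```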

Troubleshooting

Refer to the following information if you run into issues trying to complete your cross-account transfer.

Permissions errors

When setting up a cross-account transfer with Amazon S3, you might see permissions errors. For example, here's a common permissions error when trying to create an S3 destination location:

An error occurred (InvalidRequestException) when calling the CreateLocationS3 operation: DataSync location access test failed: could not perform s3:HeadBucket on bucket DOC-EXAMPLE-DESTINATION-BUCKET. Access denied. Ensure bucket access role has s3:ListBucket permission.

This error means that the user permissions in your source AWS account (the permissions for the user who creates and starts DataSync tasks) don't include s3:ListBucket. Add s3:ListBucket to your user permissions, and then try to create the destination location again.

Connection errors

When transferring between S3 buckets in different AWS accounts and Regions, you might get a network connection error when starting your DataSync task. To resolve this, create a task in the same Region as your destination location and try running that task.
