Tutorial: Transferring data from on-premises storage to Amazon S3 across AWS accounts
When using AWS DataSync with on-premises storage, you typically copy data to an AWS storage service that belongs to the same AWS account as your DataSync agent. There are situations, however, where you might need to transfer data to an Amazon S3 bucket that's associated with a different account.
Important
Copying data across AWS accounts by using the methods in this tutorial works only when Amazon S3 is one of the DataSync transfer locations.
Overview
It's not uncommon to need to transfer data between different AWS accounts, especially if you have separate teams managing your organization's resources. Here's what a cross-account transfer using DataSync can look like:
- Source account: The AWS account for managing network resources. This is the account that you'll activate your DataSync agent with.
- Destination account: The AWS account for managing the S3 bucket that you need to transfer data to.
The following diagram illustrates this kind of scenario.
Required permissions
Before you begin, make sure that your source and destination AWS accounts have the right permissions to complete a cross-account transfer to an S3 bucket.
Required permissions for your source account
For your source AWS account, there are two sets of permissions to consider for this kind of cross-account transfer. One set of permissions is for the user who works with DataSync to create and start the transfer task (for example, your storage administrator). The other set of permissions allows the DataSync service to transfer objects to the S3 bucket in your destination account on your behalf.
Required permissions for your destination account
For your destination AWS account, you need permission to disable your S3 bucket's access control lists (ACLs) and update the bucket's policy. For more information on these specific permissions, see the Amazon S3 User Guide.
Step 1: In your source account, create a DataSync agent
To get started, you must create a DataSync agent that can read from your on-premises storage system and communicate with AWS. This process includes deploying an agent in your on-premises storage environment and activating the agent in your source AWS account.
Note
The steps in this tutorial apply to any type of agent and service endpoint that you use.
To create a DataSync agent
- Deploy a DataSync agent in your on-premises storage environment.
- Choose a service endpoint that the agent will use to communicate with AWS.
- Activate your agent in your source account.
Step 2: In your source account, create a DataSync source location for your on-premises storage
In your source account, create a DataSync source location for the on-premises storage system that you're transferring data from. This location should use the agent that you just activated in your source account.
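As a rough illustration, the source location could also be created programmatically. The sketch below assumes that the on-premises system is an NFS file server (the tutorial does not say which storage type you use) and that `datasync_client` is a boto3 DataSync client for the source account. The function name and all argument values are hypothetical.

```python
def create_nfs_source_location(datasync_client, server_hostname, subdirectory, agent_arn):
    """Create an NFS source location that uses the activated agent.

    Assumption: the on-premises storage is an NFS server. Other storage
    types use different create-location calls.
    """
    response = datasync_client.create_location_nfs(
        ServerHostname=server_hostname,
        Subdirectory=subdirectory,
        # The agent that was activated in the source account in Step 1.
        OnPremConfig={"AgentArns": [agent_arn]},
    )
    return response["LocationArn"]
```

With boto3 installed and source-account credentials configured, `datasync_client` would typically be `boto3.client("datasync")`.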
Step 3: In your source account, create an IAM role for DataSync
In your source account, you need an IAM role that gives DataSync permission to write to the S3 bucket in your destination account on your behalf.
Normally, when you create a transfer location for an S3 bucket in the DataSync console, DataSync can automatically create and assume a role that has the right permissions to write to that bucket. Since you're transferring across accounts, however, you must create the role manually.
Create the IAM role
Create an IAM role with DataSync as the trusted entity.
To create the IAM role
- Log in to the AWS Management Console with your source account. Open the IAM console at https://console.aws.amazon.com/iam/.
- In the left navigation pane, under Access management, choose Roles, and then choose Create role.
- On the Select trusted entity page, for Trusted entity type, choose AWS service.
- For Use case, choose DataSync in the dropdown list and select DataSync. Choose Next.
- On the Add permissions page, choose Next.
- Give your role a name and choose Create role.
For more information, see Creating a role for an AWS service (console) in the IAM User Guide.
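The console steps above amount to creating a role whose trust policy names the DataSync service as the trusted entity. A minimal sketch of the same idea in Python follows, assuming `iam_client` is a boto3 IAM client for the source account. The function names are illustrative, and the trust policy shown is an assumption about what the console generates, not a copy of it.

```python
import json


def datasync_trust_policy():
    """Trust policy that lets the DataSync service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_datasync_role(iam_client, role_name):
    """Create the role in the source account and return its ARN."""
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(datasync_trust_policy()),
    )
    return response["Role"]["Arn"]
```

The returned ARN is the value that later steps refer to as source-datasync-role.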
Attach a custom policy to the IAM role
The IAM role that you just created needs a policy that allows DataSync to write to the S3 bucket in your destination account.
To attach a custom policy to the IAM role
- On the Roles page of the IAM console, search for the role that you just created and choose its name.
- On the role's details page, choose the Permissions tab. Choose Add permissions, then Create inline policy.
- Choose the JSON tab and do the following:
  - Paste the following JSON into the policy editor:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": [
            "s3:GetBucketLocation",
            "s3:ListBucket",
            "s3:ListBucketMultipartUploads"
          ],
          "Effect": "Allow",
          "Resource": "arn:aws:s3:::destination-bucket"
        },
        {
          "Action": [
            "s3:AbortMultipartUpload",
            "s3:DeleteObject",
            "s3:GetObject",
            "s3:ListMultipartUploadParts",
            "s3:PutObject",
            "s3:GetObjectTagging",
            "s3:PutObjectTagging"
          ],
          "Effect": "Allow",
          "Resource": "arn:aws:s3:::destination-bucket/*"
        }
      ]
    }

  - Replace each instance of destination-bucket with the name of the S3 bucket in your destination account.
- Choose Next. Give your policy a name and choose Create policy.
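If you prefer to generate the inline policy instead of editing the placeholder by hand, something like the following sketch could work. It builds the same two statements from a bucket name and attaches them to the role. The function names are hypothetical, and `iam_client` is assumed to be a boto3 IAM client for the source account.

```python
import json

# Actions that apply to the bucket itself.
BUCKET_ACTIONS = [
    "s3:GetBucketLocation",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
]

# Actions that apply to objects inside the bucket.
OBJECT_ACTIONS = [
    "s3:AbortMultipartUpload",
    "s3:DeleteObject",
    "s3:GetObject",
    "s3:ListMultipartUploadParts",
    "s3:PutObject",
    "s3:GetObjectTagging",
    "s3:PutObjectTagging",
]


def destination_write_policy(bucket_name):
    """Build the inline policy for the given destination bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Action": BUCKET_ACTIONS, "Effect": "Allow", "Resource": bucket_arn},
            {"Action": OBJECT_ACTIONS, "Effect": "Allow", "Resource": f"{bucket_arn}/*"},
        ],
    }


def attach_inline_policy(iam_client, role_name, policy_name, bucket_name):
    """Attach the policy to the role as an inline policy."""
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(destination_write_policy(bucket_name)),
    )
```

Generating the document from one bucket name avoids the easy mistake of updating the bucket ARN in one statement but not the other.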
Step 4: In your destination account, disable ACLs for your S3 bucket
It's important that all the data that you copy to the S3 bucket belongs to your destination account. To ensure that this account owns the data, disable the bucket's access control lists (ACLs).
To disable ACLs for an S3 bucket
- In the AWS Management Console, switch over to your destination account. Open the Amazon S3 console at https://console.aws.amazon.com/s3/.
- In the left navigation pane, choose Buckets.
- In the Buckets list, choose the S3 bucket that you're transferring data to.
- On the bucket's detail page, choose the Permissions tab.
- Under Object Ownership, choose Edit.
- If it isn't already selected, choose the ACLs disabled (recommended) option.
- Choose Save changes.
For more information, see Controlling ownership of objects and disabling ACLs for your bucket in the Amazon S3 User Guide.
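In API terms, disabling ACLs corresponds to setting the bucket's Object Ownership to the bucket owner enforced setting. A minimal sketch, assuming `s3_client` is a boto3 S3 client that uses destination-account credentials (the function name is illustrative):

```python
def disable_bucket_acls(s3_client, bucket_name):
    """Set Object Ownership so that ACLs are disabled and the bucket
    owner owns every object written to the bucket."""
    s3_client.put_bucket_ownership_controls(
        Bucket=bucket_name,
        OwnershipControls={"Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]},
    )
```

The caller needs the destination-account permissions described under Required permissions for your destination account.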
Step 5: In your destination account, update your S3 bucket policy
In your destination account, modify the destination S3 bucket policy to include the DataSync IAM role that you created in your source account.
The updated bucket policy (provided to you in the following instructions) includes two principals:
- The first principal specifies the DataSync IAM role that you created in your source account. This role allows DataSync to write to the S3 bucket in your destination account.
- The second principal specifies the IAM role with the required user permissions for working with DataSync in your source account. You need this principal to create the DataSync destination location.
To update the destination S3 bucket policy
- While still logged in to the S3 console with your destination account, choose the S3 bucket that you're copying data to.
- On the bucket's detail page, choose the Permissions tab.
- Under Bucket policy, choose Edit and do the following to modify your S3 bucket policy:
  - Update what's in the editor to include the following policy statements:

    {
      "Version": "2008-10-17",
      "Statement": [
        {
          "Sid": "DataSyncCreateS3LocationAndTaskAccess",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::source-account:role/source-datasync-role"
          },
          "Action": [
            "s3:GetBucketLocation",
            "s3:ListBucket",
            "s3:ListBucketMultipartUploads",
            "s3:AbortMultipartUpload",
            "s3:DeleteObject",
            "s3:GetObject",
            "s3:ListMultipartUploadParts",
            "s3:PutObject",
            "s3:GetObjectTagging",
            "s3:PutObjectTagging"
          ],
          "Resource": [
            "arn:aws:s3:::destination-bucket",
            "arn:aws:s3:::destination-bucket/*"
          ]
        },
        {
          "Sid": "DataSyncCreateS3Location",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::source-account:role/source-user-role"
          },
          "Action": "s3:ListBucket",
          "Resource": "arn:aws:s3:::destination-bucket"
        }
      ]
    }

  - Replace each instance of source-account with the AWS account ID for your source account.
  - Replace source-datasync-role with the IAM role that you created for DataSync in your source account.
  - Replace each instance of destination-bucket with the name of the S3 bucket in your destination account.
  - Replace source-user-role with the IAM role that includes the required user permissions to use DataSync.
- Choose Save changes.
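Because the bucket policy has four placeholders that must stay consistent, it can help to generate it from parameters. The sketch below mirrors the two statements from the procedure above. The function names are hypothetical, and `s3_client` is assumed to be a boto3 S3 client that uses destination-account credentials.

```python
import json


def destination_bucket_policy(source_account_id, datasync_role_name,
                              user_role_name, bucket_name):
    """Build the bucket policy with both source-account principals."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    role_prefix = f"arn:aws:iam::{source_account_id}:role/"
    return {
        "Version": "2008-10-17",
        "Statement": [
            {
                # The DataSync role: full read/write access for the transfer.
                "Sid": "DataSyncCreateS3LocationAndTaskAccess",
                "Effect": "Allow",
                "Principal": {"AWS": role_prefix + datasync_role_name},
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:ListBucketMultipartUploads",
                    "s3:AbortMultipartUpload",
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:ListMultipartUploadParts",
                    "s3:PutObject",
                    "s3:GetObjectTagging",
                    "s3:PutObjectTagging",
                ],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
            {
                # The user role: only needs to list the bucket to create the location.
                "Sid": "DataSyncCreateS3Location",
                "Effect": "Allow",
                "Principal": {"AWS": role_prefix + user_role_name},
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            },
        ],
    }


def apply_bucket_policy(s3_client, bucket_name, policy):
    """Write the policy to the bucket."""
    s3_client.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))
```

Note that `put_bucket_policy` replaces the bucket's entire policy, so if the bucket already has statements you want to keep, merge them into the document before applying it.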
Step 6: In your source account, create a DataSync destination location for your S3 bucket
In your source account, you need to create a DataSync location for the S3 bucket in your destination account.
The DataSync console won't let you create locations for storage resources in another AWS account. However, you can do this by using AWS CloudShell, a browser-based, pre-authenticated shell that you launch directly from the console. CloudShell allows you to run the AWS CLI commands for completing this tutorial without downloading or installing command line tools.
Note
If you want to complete the following steps by using a command line tool other than CloudShell, make sure your AWS CLI profile uses the same source-user-role that you specified in the destination S3 bucket policy. For more information, see the AWS Command Line Interface User Guide.
To create a DataSync destination location by using CloudShell
- In the AWS Management Console, switch back to your source account. Open the AWS DataSync console at https://console.aws.amazon.com/datasync/.
- Do one of the following to launch CloudShell:
  - Choose the CloudShell icon on the console navigation bar. It's located to the right of the search box.
  - Use the search box on the console navigation bar to search for CloudShell and then choose the CloudShell option.
- Copy the following command:

    aws datasync create-location-s3 \
      --s3-bucket-arn arn:aws:s3:::destination-bucket \
      --s3-config '{"BucketAccessRoleArn":"arn:aws:iam::source-user-account:role/source-datasync-role"}'

- Replace destination-bucket with the name of the S3 bucket in your destination account.
- Replace source-user-account with the AWS account ID for your source account.
- Replace source-datasync-role with the DataSync IAM role that you created in your source account.
- Run the command in CloudShell.
  If the command returns a DataSync location ARN similar to this, you successfully created the location:

    { "LocationArn": "arn:aws:datasync:us-east-2:123456789012:location/loc-abcdef01234567890" }

- In the left navigation pane, expand Data transfer, then choose Locations.
  From your source account, you can see the location of the S3 bucket in the destination account that you just created.
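The CLI command in this step maps onto a single DataSync API call. As a hedged sketch, the same location could be created from Python, assuming `datasync_client` is a boto3 DataSync client whose credentials belong to the source-user-role named in the bucket policy. The function name is illustrative.

```python
def create_cross_account_s3_location(datasync_client, bucket_name,
                                     source_account_id, datasync_role_name):
    """Create a DataSync location for a bucket in the destination account.

    The bucket lives in the destination account, but the access role
    lives in the source account.
    """
    response = datasync_client.create_location_s3(
        S3BucketArn=f"arn:aws:s3:::{bucket_name}",
        S3Config={
            "BucketAccessRoleArn":
                f"arn:aws:iam::{source_account_id}:role/{datasync_role_name}"
        },
    )
    return response["LocationArn"]
```

The returned ARN is the destination location that you select when creating the transfer task.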
Step 7: In your source account, create and start your DataSync transfer task
Before you move your data, let's recap what you've done so far:
- In your source account, you deployed and activated your DataSync agent. The agent can read from your on-premises storage system and communicate with AWS.
- In your source account, you created an IAM role that allows DataSync to write data to the S3 bucket in your destination account.
- In your destination account, you configured your S3 bucket so that DataSync can access the bucket and write data to it.
- In your source account, you created the DataSync source and destination locations for your transfer.
To create and start the DataSync transfer task
- While still using the DataSync console in your source account, expand Data transfer in the left navigation pane, then choose Tasks and Create task.
- On the Configure source location page, choose Choose an existing location. Choose the source location that you're copying data from (your on-premises storage), then Next.
- On the Configure destination location page, choose Choose an existing location. Choose the destination location that you're copying data to (the S3 bucket in your destination account), then Next.
- On the Configure settings page, give the task a name. As needed, configure additional settings, such as specifying an Amazon CloudWatch log group. Choose Next.
- On the Review page, review your settings and choose Create task.
- On the task's details page, choose Start, and then choose one of the following:
  - To run the task without modification, choose Start with defaults.
  - To modify the task before running it, choose Start with overriding options.
- When your task finishes, check the S3 bucket in your destination account. You should see the data that was transferred from your on-premises storage.
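The console steps above can also be approximated with two DataSync API calls: one to create the task from the two locations and one to start an execution with the task's default options. A minimal sketch, assuming `datasync_client` is a boto3 DataSync client for the source account (the function name is hypothetical):

```python
def create_and_start_task(datasync_client, source_location_arn,
                          destination_location_arn, task_name):
    """Create a transfer task and start it with its default options.

    Returns the ARN of the task execution that was started.
    """
    task = datasync_client.create_task(
        SourceLocationArn=source_location_arn,
        DestinationLocationArn=destination_location_arn,
        Name=task_name,
    )
    execution = datasync_client.start_task_execution(TaskArn=task["TaskArn"])
    return execution["TaskExecutionArn"]
```

This corresponds to choosing Start with defaults. Overriding options for a single run would need additional arguments that this sketch leaves out.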
Troubleshooting
Refer to the following information if you run into issues trying to complete your cross-account transfer.
Permissions errors

When setting up a cross-account transfer with Amazon S3, you might see permissions errors. For example, here's a common permissions error when trying to create an S3 destination location:

    An error occurred (InvalidRequestException) when calling the CreateLocationS3 operation: DataSync location access test failed: could not perform s3:HeadBucket on bucket DOC-EXAMPLE-DESTINATION-BUCKET. Access denied. Ensure bucket access role has s3:ListBucket permission.

This error means that your source AWS account user permissions are missing the s3:ListBucket permission. These permissions are for the user who creates and starts DataSync tasks. Add s3:ListBucket to your user permissions and try again to create the destination location.
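When chasing this kind of error, a quick local check of a policy document can rule out simple omissions. The helper below is a deliberately simplified sketch: it looks only for an exact action and resource match in Allow statements and does not evaluate wildcards, conditions, principals, or Deny statements, so it is not a substitute for real IAM policy evaluation. The function name is hypothetical.

```python
def policy_allows(policy, action, resource):
    """Return True if some Allow statement lists the exact action and resource."""
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        # Policy documents allow either a single string or a list.
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        if action in actions and resource in resources:
            return True
    return False
```

For example, calling `policy_allows(policy, "s3:ListBucket", "arn:aws:s3:::destination-bucket")` on a parsed policy document shows whether that permission is spelled out for the bucket ARN.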