AWS DataSync
User Guide

Step 2: Create Locations

Each DataSync task consists of a pair of locations between which data is transferred. The source location defines the storage system or service that you want to read data from, and the destination location defines the storage system or service that you want to write data to.

DataSync supports the following source and destination location combinations.

Source (From)                 Destination (To)

NFS or SMB file system        Amazon EFS file system
NFS or SMB file system        Amazon S3
Amazon EFS                    NFS or SMB file system
Amazon S3                     NFS file system or Amazon EFS

Note

When copying data between two Amazon EFS file systems, we recommend using an NFS (source) to Amazon EFS (destination) transfer.

Create an NFS Location

The following procedure shows you how to create an NFS location that is on-premises or in the cloud. This location defines a file system on a Network File System (NFS) server that can be read from or written to.

To create an NFS location

  • Use the following command to create an NFS source location.

    $ aws datasync create-location-nfs --server-hostname server-address --on-prem-config AgentArns=agent-arns --subdirectory nfs-export-path

Note

The AWS Region that you specify is the one where your target S3 bucket or EFS file system is located.

The path that you provide for the --subdirectory parameter should be a path that's exported by the NFS server, or a subdirectory of an exported path. Other NFS clients in your network should be able to mount this path. To see all the paths exported by your NFS server, run "showmount -e nfs-server-address" from an NFS client that has access to your server. You can specify any directory that appears in the results, or any subdirectory of such a directory.
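
For example, running showmount from such a client might produce output similar to the following (the export path and allowed clients shown here are placeholders).

$ showmount -e nfs-server-address
Export list for nfs-server-address:
/path_for_sync_to_read_from *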

To transfer all the data in the folder that you specified, DataSync must have permission to read all of that data. To grant this permission, either configure the NFS export with no_root_squash, or ensure that the permissions for all of the files that you want DataSync to transfer allow read access for all users. Doing either enables the agent to read the files. For the agent to access directories, you must additionally enable execute access for all users on those directories.
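
As an illustration only, an entry in /etc/exports on a Linux NFS server that uses the no_root_squash option might look like the following (the exported path and client range are placeholders).

/path_for_sync_to_read_from 192.168.1.0/24(ro,no_root_squash)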

Ensure that the NFS export is accessible without Kerberos authentication.

DataSync automatically selects the NFS version that it uses to read from an NFS location. To force DataSync to use a specific NFS version, use the optional Version parameter in the NfsMountOptions API.
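
For example, if your version of the AWS CLI supports the --mount-options parameter for create-location-nfs, you can pin the version to NFS v4.0 with a command similar to the following sketch (the server address, agent ARNs, and export path are placeholders).

$ aws datasync create-location-nfs --server-hostname server-address --on-prem-config AgentArns=agent-arns --subdirectory nfs-export-path --mount-options Version="NFS4_0"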

This command returns the Amazon Resource Name (ARN) of the NFS location, similar to the ARN shown following.

{ "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0f01451b140b2af49" }

To make sure that the directory can be mounted, you can connect to any machine that has the same network configuration as your agent, and run the following command.

mount -t nfs -o nfsvers=<nfs-server-version> <nfs-server-address>:<nfs-export-path> <test-folder>

The following is an example of the command.

mount -t nfs -o nfsvers=3 172.123.12.456:/path_for_sync_to_read_from /temp_folder_to_test_mount_on_local_machine

Create an SMB Location

The following procedure shows you how to create an SMB location that is on-premises. This location defines a file system on a Server Message Block (SMB) server that can be read from or written to.

To create an SMB location

  • Use the following command to create an SMB source location.

    $ aws datasync create-location-smb --server-hostname smb-server-address --user user-name --domain domain-of-the-smb-server --password user-password --agent-arns agent-arns --subdirectory smb-export-path

    Note

    The AWS Region that you specify is the one where your target S3 bucket or EFS file system is located.

    The path that you provide for the --subdirectory parameter should be a path that's exported by the SMB server, or a subdirectory of an exported path. Other SMB clients in your network should be able to mount this path.

    DataSync automatically selects the SMB version that it uses to read from an SMB location. To force DataSync to use a specific SMB version, use the optional Version parameter in the SmbMountOptions API. An example follows this procedure.

    This command returns the Amazon Resource Name (ARN) of the SMB location, similar to the ARN shown following.

    { "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0f01451b140b2af49" }

Create an Amazon EFS Location

The following procedure shows you how to create an EFS location. This is the endpoint for an Amazon EFS file system.

To create an Amazon EFS location

  1. If you don't have an Amazon EFS file system, create one. For information about how to create an EFS file system, see Getting Started with Amazon Elastic File System in the Amazon Elastic File System User Guide.

  2. Identify a subnet that has at least one mount target for that file system. You can see all the mount targets and the subnets associated with an EFS file system by using the describe-mount-targets command.

    $ aws --region aws-region efs describe-mount-targets --file-system-id file-system-id

    Note

    The AWS Region that you specify is the one where your target S3 bucket or EFS file system is located.

    This command returns information about the mount targets, similar to the following.

    { "MountTargets": [ { "OwnerId": "111222333444", "MountTargetId": "fsmt-22334a10", "FileSystemId": "fs-123456ab", "SubnetId": "subnet-f12a0e34", "LifeCycleState": "available", "IpAddress": "11.222.0.123", "NetworkInterfaceId": "eni-1234a044" } ] }
  3. Specify an Amazon EC2 security group that can be used to access the mount target. You can run the following command to find out the security group of the mount target.

    $ aws --region aws-region efs describe-mount-target-security-groups --mount-target-id mount-target-id

    The security group that you provide must be able to communicate with the security group on the mount target in the specified subnet.

    The exact relationship between security group M (of the mount target) and security group S (which you provide for DataSync to use at this stage) is as follows (an example command for adding the inbound rule follows this list):

    • Security group M (which you associate with the mount target) must allow inbound access for the TCP protocol on the NFS port (2049) from Security group S.

      You can enable inbound connections either by IP address (CIDR range) or security group.

    • Security group S (which you provide for DataSync to use to access Amazon EFS) should have a rule that enables outbound connections to the NFS port on one of the file system’s mount targets.

      You can enable outbound connections either by IP address (CIDR range) or security group.

      For information about security groups and mount targets, see Security Groups for Amazon EC2 Instances and Mount Targets.
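
    For example, assuming security group M has the ID sg-mount-target-id and security group S has the ID sg-datasync-id (both hypothetical placeholders), you could add the required inbound rule to security group M with a command similar to the following.

    $ aws ec2 authorize-security-group-ingress --group-id sg-mount-target-id --protocol tcp --port 2049 --source-group sg-datasync-id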

  4. Create the EFS location. To create the EFS location, you need the ARNs of your Amazon EC2 subnet, Amazon EC2 security group, and EFS file system. Because the DataSync API accepts fully qualified ARNs, you can construct these ARNs yourself, as sketched following. For information about how to construct ARNs for different services, see Amazon Resource Names (ARNs) and AWS Service Namespaces in the Amazon Web Services General Reference.
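
    As an illustration only, the following sketch assembles the ARNs from the Region, account ID, and resource IDs shown in the previous steps (all values are placeholders).

    $ REGION='us-east-1'
    $ ACCOUNT_ID='111222333444'
    $ EFS_ARN="arn:aws:elasticfilesystem:${REGION}:${ACCOUNT_ID}:file-system/fs-123456ab"
    $ SUBNET_ARN="arn:aws:ec2:${REGION}:${ACCOUNT_ID}:subnet/subnet-f12a0e34"
    $ SG_ARN="arn:aws:ec2:${REGION}:${ACCOUNT_ID}:security-group/sg-security-group-id"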

    Use the following command to create an EFS location.

    $ aws datasync create-location-efs --subdirectory /path/to/your/subdirectory --efs-filesystem-arn 'arn:aws:elasticfilesystem:region:account-id:file-system/filesystem-id' --ec2-config SecurityGroupArns='arn:aws:ec2:region:account-id:security-group/security-group-id',SubnetArn='arn:aws:ec2:region:account-id:subnet/subnet-id'

Note

The AWS Region that you specify is the one where your target S3 bucket or EFS file system is located.

The command returns a location ARN similar to the one shown following.

{ "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-07db7abfc326c50fb" }

Create an Amazon S3 Location

For DataSync to access a destination Amazon S3 bucket, it needs an AWS Identity and Access Management (IAM) role that has the required permissions. Use the procedure following to create the role, required policies, and S3 location.

To create an S3 location

  1. Create a trust policy that allows DataSync to assume the role required to access the S3 bucket.

    The following is an example of the trust policy.

    { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": "datasync.amazonaws.com" }, "Action": "sts:AssumeRole" } ] }
  2. Create a temporary file for the policy as shown in the following example.

    $ ROLE_FILE=$(mktemp -t sync.iam.role.XXXXXX.json)
    $ IAM_ROLE_NAME='YourBucketAccessRole'
    $ cat <<EOF > ${ROLE_FILE}
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "datasync.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }
    EOF
  3. Create an IAM role and attach the policy to it.

    The following command creates an IAM role and attaches the policy to it.

    $ aws iam create-role --role-name ${IAM_ROLE_NAME} --assume-role-policy-document file://${ROLE_FILE}
    {
        "Role": {
            "Path": "/",
            "RoleName": "YourBucketAccessRole",
            "RoleId": "role-id",
            "Arn": "arn:aws:iam::account-id:role/YourBucketAccessRole",
            "CreateDate": "2018-07-27T02:49:23.117Z",
            "AssumeRolePolicyDocument": {
                "Version": "2012-10-17",
                "Statement": [
                    {
                        "Effect": "Allow",
                        "Principal": {
                            "Service": "datasync.amazonaws.com"
                        },
                        "Action": "sts:AssumeRole"
                    }
                ]
            }
        }
    }
  4. Allow the IAM role that you created to write to the bucket.

    Attach to the role a policy that has sufficient permissions to access the bucket (for example, AmazonS3FullAccess). You can also create a more restrictive policy. If you do, the minimal permissions that DataSync needs to read from and write to an S3 location are shown in the following example. An example of attaching this policy to the role follows at the end of this step.

    { "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:GetBucketLocation", "s3:ListBucket", "s3:ListBucketMultipartUploads", "s3:HeadBucket" ], "Effect": "Allow", "Resource": "arn:aws:s3:::YourBucket" }, { "Action": [ "s3:AbortMultipartUpload", "s3:DeleteObject", "s3:GetObject", "s3:ListMultipartUploadParts", "s3:PutObject" ], "Effect": "Allow", "Resource": "arn:aws:s3:::YourBucket/*" } ] }

    To attach the AmazonS3FullAccess managed policy to your IAM role, run the following command.

    $ aws iam attach-role-policy --role-name role-name --policy-arn 'arn:aws:iam::aws:policy/AmazonS3FullAccess'
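
    If you instead want to use the more restrictive policy shown earlier, one approach is to save that policy to a file (datasync-s3-access.json here is a placeholder name) and add it to the role as an inline policy, similar to the following.

    $ aws iam put-role-policy --role-name ${IAM_ROLE_NAME} --policy-name 'DataSyncS3BucketAccess' --policy-document file://datasync-s3-access.json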
  5. Create the S3 location.

    Use the following command to create your Amazon S3 location.

    $ aws datasync create-location-s3 --s3-bucket-arn 'arn:aws:s3:::bucket' --s3-storage-class 'your-S3-storage-class' --s3-config 'BucketAccessRoleArn=arn:aws:iam::account-id:role/role-name' --subdirectory /your-folder

    The command returns a location ARN similar to the one shown following.

    { "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0b3017fc4ba4a2d8d" }

You can see information about the S3 location that you just created by using the describe-location-s3 command.
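
For example, the following call returns a description of the location. The output shown is a sketch of the response's general shape, with placeholder values.

$ aws datasync describe-location-s3 --location-arn 'arn:aws:datasync:us-east-1:111222333444:location/loc-0b3017fc4ba4a2d8d'
{
    "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0b3017fc4ba4a2d8d",
    "LocationUri": "s3://bucket/your-folder/",
    "S3Config": {
        "BucketAccessRoleArn": "arn:aws:iam::account-id:role/role-name"
    },
    "CreationTime": 1532660733.39
}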

The location type is encoded in the LocationUri of every location description, regardless of the location type. In the preceding example, the s3:// prefix in LocationUri shows that this is an Amazon S3 location.

Note

If versioning is enabled for S3, and you configure DataSync to copy file metadata, DataSync creates a new object every time that the corresponding file’s metadata is updated.