AWS DataSync
User Guide

Deploy an AWS DataSync Agent

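You can deploy an AWS DataSync agent in either of the following ways:

  • As a virtual machine (VM) on a VMware ESXi hypervisor. See Deploy Your DataSync Agent on VMware.

  • As an Amazon EC2 instance. See Deploy Your Agent as an EC2 Instance to Read Files from In-Cloud.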

Your agent can connect to public internet endpoints or private VPC endpoints. The activation process associates your agent with your AWS account.

Deploy Your DataSync Agent on VMware

You can download and deploy an AWS DataSync agent in your VMware environment and then activate it. You can also use an existing agent instead of deploying a new one. You can use a previously created agent if it can access your on-premises storage and if it's activated in the same AWS Region.

To deploy an agent on VMware

  1. If you don't have an agent, on the Create agent page in the console, choose Download image in the Deploy agent section. Doing this downloads the agent as an .ova image file, which you then deploy in your VMware ESXi hypervisor. The agent runs as a VM. If you want to deploy the agent as an Amazon EC2 instance instead, see Deploy Your Agent as an EC2 Instance to Read Files from In-Cloud.

    AWS DataSync currently supports the VMware ESXi hypervisor. For information about hardware requirements for the VM, see Virtual Machine Requirements. For information about how to deploy an .ova file in a VMware host, see the documentation for your hypervisor.

    If you have previously activated an agent in this AWS Region and want to use that agent, choose that agent and choose Create agent. The Configure a Source Location page appears.

  2. Power on your hypervisor, log in to your VM, and get the IP address of the agent. You need this IP address to activate the agent.

    Note

    The VM's default credentials are the login admin and the password password.

    You can change the password on the local console. You don't need to log in to the VM for DataSync functionality. Login is mainly required for troubleshooting, such as running a connectivity test or opening a support channel with AWS. It's also required for network-specific settings, such as setting up a static IP address.

After you have deployed an agent, you choose a service endpoint.

Deploy Your Agent as an EC2 Instance to Read Files from In-Cloud

You can use your agent to transfer data between two locations in AWS, including cross-region and cross-account transfers. Doing this enables you to perform the following tasks:

  • Transfer data from one EFS file system to another – migrate data from one AWS account to another, or periodically copy recently added files to a second EFS file system.

  • Migrate from self-managed NFS to EFS – migrate to benefit from a more scalable, fully managed, elastic, and highly available file storage that has an NFS interface.

  • Transfer data from Amazon S3 to in-cloud NFS, and from in-cloud NFS to Amazon S3 – use this approach for cases such as high-performance computing (HPC) processing.

To get started, choose the Amazon Machine Image (AMI) for your agent for the AWS Region where your EFS or self-managed NFS file system resides:

  • To copy between EFS file systems, or from a self-managed NFS to EFS, create the EC2 agent in the source AWS Region.

  • To copy from S3, create the agent in the destination AWS Region.
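The two Region rules above can be written as a small lookup. The following is a minimal sketch in Python and is not part of DataSync. The function name agent_region and the source-type labels "efs", "nfs", and "s3" are hypothetical names chosen for illustration.

```python
def agent_region(source_type: str, source_region: str, destination_region: str) -> str:
    """Pick the AWS Region in which to create the EC2 agent.

    Encodes the two rules stated above. The source_type labels are
    illustrative and are not DataSync identifiers.
    """
    kind = source_type.lower()
    if kind in ("efs", "nfs"):
        # EFS-to-EFS or self-managed NFS-to-EFS: create the agent in the source Region.
        return source_region
    if kind == "s3":
        # Copying from S3: create the agent in the destination Region.
        return destination_region
    raise ValueError(f"unsupported source type: {source_type!r}")


print(agent_region("efs", "us-east-1", "eu-west-1"))  # prints us-east-1
print(agent_region("s3", "us-east-1", "eu-west-1"))   # prints eu-west-1
```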

Important

We don't recommend using a DataSync agent that is deployed as an EC2 instance to read data from an on-premises NFS server. This approach doesn't deliver maximum throughput.

You can use the following procedures to transfer files from an in-cloud NFS file system to Amazon S3. In this case, the in-cloud NFS file system is an Amazon EFS file system.

To choose the agent AMI for your AWS Region

  • You can use the following CLI command to programmatically get the latest AMI ID for DataSync.

    aws ssm get-parameter --name /aws/service/datasync/ami --region $region

    Example command and output

    aws ssm get-parameter --name /aws/service/datasync/ami --region us-east-1

    {
        "Parameter": {
            "Name": "/aws/service/datasync/ami",
            "Type": "String",
            "Value": "ami-01234db92d824a123",
            "Version": 6,
            "LastModifiedDate": 1569946277.996,
            "ARN": "arn:aws:ssm:us-east-1::parameter/aws/service/datasync/ami"
        }
    }

    You can also identify the AMI ID for your AWS Region in the following table. You use this AMI ID to deploy your DataSync agent. For the recommended instance types, see Amazon EC2 Instance Requirements.

    If you activated an agent in this AWS Region and want to use that agent, choose the agent and choose Create agent. The Configure a Source Location page appears.

    In the following table, you can find the available DataSync AMIs by AWS Region.

    AWS Region      AMI Name                 AMI ID                 URL
    ap-northeast-1  aws-datasync-1569507227  ami-083d930199b517fc8  Launch instance
    ap-northeast-2  aws-datasync-1569507227  ami-03d858a112a65b4b0  Launch instance
    ap-southeast-1  aws-datasync-1569507227  ami-0bc229d430d9cd6b6  Launch instance
    ap-southeast-2  aws-datasync-1569507227  ami-0786ddae86abf0362  Launch instance
    ca-central-1    aws-datasync-1569507227  ami-0a17712db83f3f852  Launch instance
    eu-central-1    aws-datasync-1569507227  ami-0b433b5eddaddf1bb  Launch instance
    eu-west-1       aws-datasync-1569507227  ami-031e8db602e4ed16f  Launch instance
    eu-west-2       aws-datasync-1569507227  ami-0036f42661dd3512d  Launch instance
    eu-west-3       aws-datasync-1569507227  ami-00a50aa3d89a1d6c1  Launch instance
    me-south-1      aws-datasync-1569507227  ami-0c563edbfd36aef7f  Launch instance
    us-east-1       aws-datasync-1569507227  ami-08060db92d824f291  Launch instance
    us-east-2       aws-datasync-1569507227  ami-0b350e66c3b082eac  Launch instance
    us-west-1       aws-datasync-1569507227  ami-05d76395fd50e3d80  Launch instance
    us-west-2       aws-datasync-1569507227  ami-01a8854868b5df8da  Launch instance
    us-gov-west-1   aws-datasync-1569507261  ami-08ca9f69           Launch instance
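When you script the lookup, you usually need only the AMI ID rather than the whole response. The following is a minimal sketch in Python, assuming that the JSON printed by the get-parameter command has already been captured as a string. The function name extract_ami_id is hypothetical, and the sample value is the placeholder AMI ID from the example output above, not a real image.

```python
import json


def extract_ami_id(cli_output: str) -> str:
    """Return Parameter.Value from the JSON printed by `aws ssm get-parameter`."""
    value = json.loads(cli_output)["Parameter"]["Value"]
    if not value.startswith("ami-"):
        # Guard against pointing the helper at the wrong parameter.
        raise ValueError(f"unexpected AMI ID: {value!r}")
    return value


# Sample response copied from the example output in this section.
sample_output = """
{
    "Parameter": {
        "Name": "/aws/service/datasync/ami",
        "Type": "String",
        "Value": "ami-01234db92d824a123",
        "Version": 6,
        "LastModifiedDate": 1569946277.996,
        "ARN": "arn:aws:ssm:us-east-1::parameter/aws/service/datasync/ami"
    }
}
"""

print(extract_ami_id(sample_output))  # prints ami-01234db92d824a123
```

Parsing the response this way keeps the script independent of the AMI table, which can fall out of date as new agent images are published.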

To deploy your DataSync agent as an EC2 instance

  1. From the AWS account where the source EFS resides, launch the agent using your AMI from the Amazon EC2 launch wizard. Use the following URL to launch the AMI.

    https://console.aws.amazon.com/ec2/v2/home?region=source-efs-or-nfs-region#LaunchInstanceWizard:ami=ami-id

    In the URL, replace source-efs-or-nfs-region and ami-id with your own source AWS Region and AMI ID. The Choose an Instance Type page appears on the Amazon EC2 console. For a list of AMI IDs by AWS Region, see Deploy Your Agent as an EC2 Instance to Read Files from In-Cloud.

  2. Choose one of the recommended instance types for your use case, and choose Next: Configure Instance Details. For the recommended instance types, see Amazon EC2 Instance Requirements.

  3. On the Configure Instance Details page, do the following:

    1. For Network, choose the virtual private cloud (VPC) where your source EFS or NFS file system is located.

    2. Choose a value for Auto-assign Public IP. For your instance to be accessible from the public internet, set Auto-assign Public IP to Enable. Otherwise, set Auto-assign Public IP to Disable. If a public IP address isn't assigned, activate the agent in your VPC using its private IP address.

      When you transfer files from an in-cloud NFS file system, we recommend that you choose the Placement Group where your NFS server resides to increase performance.

  4. Choose Next: Add Storage. The agent doesn't require additional storage, so you can skip this step and choose Next: Add Tags.

  5. (Optional) On the Add Tags page, you can add tags to your EC2 instance. When you're finished on the page, choose Next: Configure Security Group.

  6. On the Configure Security Group page, do the following:

    1. Make sure that the selected security group allows inbound access to HTTP port 80 from the web browser that you plan to use to activate the agent.

    2. Make sure that the security group of the source EFS or NFS system allows inbound traffic from the agent. In addition, make sure that the agent allows outbound traffic to the source EFS or NFS system. The traffic goes through the standard NFS port, 2049.

    For the complete set of network requirements for DataSync, see Network Requirements.

    If you deploy your agent using a VPC endpoint, you need to allow additional ports. For information, see Considerations When Creating an Agent in a VPC.

  7. Choose Review and Launch to review your configuration, then choose Launch to launch your instance. Remember to use a key pair that's accessible to you. A confirmation page appears and indicates that your instance is launching.

  8. Choose View Instances to close the confirmation page and return to the EC2 instances screen. When you launch an instance, its initial state is pending. After the instance starts, its state changes to running. At this point, it is assigned a public Domain Name System (DNS) name and IP address, which you can find on the Description tab.

  9. If you set Auto-assign Public IP to Enable, choose your instance and note the public IP address on the Description tab. You use this IP address later to connect to your DataSync agent.

    If you set Auto-assign Public IP to Disable, launch or use an existing instance in your VPC to activate the agent. In this case, you use the private IP address of the DataSync agent to activate the agent from this instance in the VPC.
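The launch URL in step 1 of this procedure is a template with two placeholders. The following is a minimal sketch in Python that fills them in, assuming the URL pattern shown in that step. The function name launch_wizard_url is hypothetical, and the example values come from the us-east-1 row of the AMI table.

```python
def launch_wizard_url(region: str, ami_id: str) -> str:
    """Fill the EC2 launch wizard URL template from step 1.

    region replaces source-efs-or-nfs-region and ami_id replaces ami-id.
    """
    if not ami_id.startswith("ami-"):
        raise ValueError(f"unexpected AMI ID: {ami_id!r}")
    return (
        "https://console.aws.amazon.com/ec2/v2/home"
        f"?region={region}#LaunchInstanceWizard:ami={ami_id}"
    )


print(launch_wizard_url("us-east-1", "ami-08060db92d824f291"))
# prints https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#LaunchInstanceWizard:ami=ami-08060db92d824f291
```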

After you have deployed an agent, you choose a service endpoint.