Configuring transfers with S3 compatible storage on Snowball Edge - AWS DataSync

With AWS DataSync, you can transfer objects between Amazon S3 compatible storage on an AWS Snowball Edge device or cluster and a supported AWS storage service, such as an Amazon S3 bucket or an Amazon EFS file system.

Prerequisites

Before you get started, make sure that you've done the following:

  • Created an AWS storage resource in the AWS Region that you plan to transfer data to or from. For example, this could be an S3 bucket or Amazon EFS file system in US East (N. Virginia).

  • Established a wide-area network (WAN) connection for traffic into and out of your on-premises storage environment. For example, you can establish this kind of connection with AWS Direct Connect.

    When you create your DataSync agent, you'll configure this WAN connection so that DataSync can transfer data between your Amazon S3 compatible storage that's on-premises and your storage resource in AWS.

  • Downloaded and installed the Snowball Edge client.

Accessing your Amazon S3 compatible storage

To access your Amazon S3 compatible storage bucket, DataSync needs the following:

  • User credentials on your Snowball Edge device or cluster that can access the bucket that you're transferring data to or from.

  • An HTTPS certificate that allows DataSync to verify the authenticity of the connection between the DataSync agent and the s3api endpoint on your device or cluster.

Getting the user credentials to access your S3 bucket

DataSync needs the access key and secret key for a user who can access the bucket that you're working with on your Snowball Edge device or cluster.

To get the user credentials to access your bucket
  1. Open a terminal and run the Snowball Edge client.

    For more information about running the Snowball Edge client, see Using the Snowball Edge client in the AWS Snowball Edge Developer Guide.

  2. To get the access keys associated with your device or cluster, run the following snowballEdge command:

    snowballEdge list-access-keys

  3. In the output, locate the access key for the bucket that DataSync will work with (for example, AKIAIOSFODNN7EXAMPLE).

  4. To get the secret access key, run the following snowballEdge command. Replace access-key-for-datasync with the access key that you located in the prior step.

    snowballEdge get-secret-access-key --access-key-id access-key-for-datasync

    The output includes the access key's corresponding secret key (for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY).

  5. Save the access key and secret key in a secure location where you can find them later.

    You will need these keys when you're configuring the DataSync source location for your transfer.
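If you script this procedure, the key pair can be extracted from the client's output instead of being copied by hand. The following Python sketch assumes that snowballEdge get-secret-access-key prints an INI-style profile block containing aws_access_key_id and aws_secret_access_key entries. That output shape and the helper name parse_secret_key_output are assumptions for illustration, so compare them with what your client version actually prints.

```python
import configparser


def parse_secret_key_output(text: str) -> dict:
    """Extract the key pair from INI-style client output.

    Assumed (not confirmed by this guide) output shape:
        [snowballEdge]
        aws_access_key_id = ...
        aws_secret_access_key = ...
    """
    # Disable interpolation so a '%' in a secret key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    for section in parser.sections():
        values = parser[section]
        if "aws_access_key_id" in values and "aws_secret_access_key" in values:
            return {
                "access_key": values["aws_access_key_id"],
                "secret_key": values["aws_secret_access_key"],
            }
    raise ValueError("no access key pair found in the client output")


# Sample text built from the example keys shown in the steps above.
sample = """[snowballEdge]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""
keys = parse_secret_key_output(sample)
print(keys["access_key"])
```

Avoid writing the parsed secret key to logs or shell history. Pass it directly to whatever creates your DataSync location.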

Getting a certificate for the s3api endpoint connection

You need an HTTPS certificate that can verify the authenticity of the connection between your DataSync agent and an s3api endpoint on your Snowball Edge device or cluster.

To get a certificate for the s3api endpoint connection
  1. In the Snowball Edge client, run the following snowballEdge command:

    snowballEdge get-certificate

  2. Save the output to a base64-encoded .pem file.

    You will specify this file when you're configuring the DataSync source location for your transfer.
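Before you upload the file, it can help to confirm that what you saved is a well-formed PEM certificate rather than, say, truncated terminal output. The following Python sketch only checks the PEM framing and that the body decodes as base64. It does not verify that the certificate is valid or trusted. The helper name pem_certificate_count is hypothetical.

```python
import base64
import re

# Matches one PEM certificate block and captures its base64 body.
PEM_PATTERN = re.compile(
    r"-----BEGIN CERTIFICATE-----\s+"
    r"(?P<body>[A-Za-z0-9+/=\s]+?)"
    r"\s*-----END CERTIFICATE-----"
)


def pem_certificate_count(text: str) -> int:
    """Count PEM certificate blocks whose body decodes as base64."""
    count = 0
    for match in PEM_PATTERN.finditer(text):
        # Join the wrapped base64 lines into a single string.
        body = "".join(match.group("body").split())
        try:
            base64.b64decode(body, validate=True)
        except ValueError:
            # Bad padding or non-base64 characters: skip this block.
            continue
        count += 1
    return count


# Usage with a saved file (the file name is a placeholder):
#   with open("snowball-s3api.pem", encoding="ascii") as handle:
#       print(pem_certificate_count(handle.read()))
```

A result of 0 suggests that the output was not saved correctly. A result greater than 1 means that the file holds a certificate chain.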

Creating a DataSync agent in your on-premises storage environment

During a transfer, DataSync uses an agent to read from or write to the Amazon S3 compatible storage on your Snowball Edge device or cluster.

This agent must be deployed in your on-premises storage environment where it can connect to your device or cluster through your network. For example, you can run the agent on a VMware ESXi hypervisor that has local network access to your cluster.

To create a DataSync agent in your on-premises storage environment
  1. Make sure that the DataSync agent can run on your hypervisor and that you allocate the agent enough virtual machine (VM) resources.

  2. Deploy the agent in your on-premises environment.

    For instructions, see the agent deployment topic in the DataSync documentation for the type of hypervisor that you're deploying the agent on.

  3. Configure your network to allow the following traffic between the agent and your Amazon S3 compatible storage:

    From: DataSync agent

    To: A virtual network interface (VNI) for an s3api endpoint on your device or cluster. If you have a cluster, it can be any s3api endpoint VNI.

    Protocol and port: TCP 443 (HTTPS)

    If you need to find a VNI on your device or cluster, see Describing your virtual network interfaces on Snowball Edge.

  4. Choose a service endpoint that the agent will use to communicate with AWS.

  5. Activate your agent.
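To confirm the network rule from step 3, you can test whether TCP port 443 on the s3api endpoint VNI accepts connections. The following Python sketch is meant to run on a host in the same network segment as the agent, because the agent itself is an appliance. It checks TCP reachability only, not TLS or S3 behavior. The helper name can_reach and the address in the usage comment are placeholders.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (192.0.2.10 is a documentation-only placeholder for your VNI address):
#   print(can_reach("192.0.2.10"))
```

A False result usually points to a firewall rule, a routing problem, or a wrong VNI address, so check those before you activate the agent.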

Configuring the source location for your transfer

After you create your agent, you can configure the source location for your DataSync transfer.

Note

The following instructions assume that you're transferring from Amazon S3 compatible storage, but you can also use this location for a transfer destination.

To configure the source location for your transfer by using the DataSync console
  1. Open the AWS DataSync console at https://console.aws.amazon.com/datasync/.

  2. In the left navigation pane, expand Data transfer. Choose Tasks, and then choose Create task.

  3. On the Configure source location page, choose Create a new location.

  4. For Location type, choose Object storage.

  5. For Agents, choose the DataSync agent that you created in your on-premises storage environment.

  6. For Server, enter the VNI for the s3api endpoint that's used by your Amazon S3 compatible storage.

    If you have a Snowball Edge cluster instead of a single device, you can specify any of the cluster's s3api endpoint VNIs.

  7. For Bucket name, enter the name of the Amazon S3 compatible storage bucket that you're transferring objects from.

  8. For Folder, enter an object prefix.

    DataSync only transfers objects with this prefix.

  9. To configure the DataSync connection to the Snowball Edge device or cluster, expand Additional settings and do the following:

    1. For Server protocol, choose HTTPS.

    2. For Server port, enter 443.

    3. For Certificate, choose the certificate file for the s3api endpoint connection.

  10. Select Requires credentials, and enter the Access key and Secret key to access the Amazon S3 compatible storage bucket on your Snowball Edge device or cluster.

  11. Choose Next.
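The console fields above correspond to parameters of the DataSync CreateLocationObjectStorage API operation. As a rough sketch of that mapping, the following Python helper builds the keyword arguments that the boto3 DataSync client's create_location_object_storage method accepts. The helper name and every value in the example call (addresses, ARNs, bucket, prefix) are placeholders, not values from this guide.

```python
def object_storage_location_params(
    vni_address: str,
    bucket_name: str,
    prefix: str,
    agent_arn: str,
    access_key: str,
    secret_key: str,
    certificate_pem: bytes,
) -> dict:
    """Build keyword arguments for create_location_object_storage."""
    return {
        "ServerHostname": vni_address,         # Server: the s3api endpoint VNI
        "ServerProtocol": "HTTPS",             # Server protocol
        "ServerPort": 443,                     # Server port
        "BucketName": bucket_name,             # Bucket name
        "Subdirectory": prefix,                # Folder (object prefix)
        "AgentArns": [agent_arn],              # Agents
        "AccessKey": access_key,               # Requires credentials
        "SecretKey": secret_key,
        "ServerCertificate": certificate_pem,  # Certificate: .pem file contents
    }


params = object_storage_location_params(
    vni_address="192.0.2.10",
    bucket_name="example-snow-bucket",
    prefix="/exports/",
    agent_arn="arn:aws:datasync:us-east-1:111122223333:agent/agent-0123456789abcdef0",
    access_key="AKIAIOSFODNN7EXAMPLE",
    secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    certificate_pem=b"-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n",
)
print(sorted(params))
```

With boto3 installed and AWS credentials configured, the dictionary could be passed as boto3.client("datasync").create_location_object_storage(**params). The response includes a LocationArn that you use when you create the task.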

Configuring the destination location for your transfer

Your transfer's destination location must be in the same AWS Region and AWS account where you created your agent.

Before you begin: Make sure you've configured the source location for your transfer.

To configure the destination location for your transfer by using the DataSync console
  1. On the Configure destination location page, choose Create a new location or Choose an existing location for the AWS storage resource where you're transferring objects to.

    If you're creating a new location, see the DataSync topic about creating a location for the AWS storage service that you're transferring to.

  2. When you're done configuring the destination location, choose Next.

Configuring your transfer settings

With DataSync, you can specify a transfer schedule, customize how your data integrity is verified, and specify whether you want to transfer only a subset of objects, among other options.

Before you begin: Make sure you've configured the destination location for your transfer.

To configure your transfer settings by using the DataSync console
  1. On the Configure settings page, change the transfer settings or use the defaults.

    For more information about these settings, see Working with AWS DataSync transfer tasks.

  2. Choose Next.

  3. Review your transfer details, and then choose Create task.

Starting your transfer

After you create your transfer task, you're ready to start moving data. For instructions on starting a task by using the DataSync console or AWS CLI, see Starting your task.
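If you start the task from a script instead, one possible pattern is to start an execution and poll until it finishes. The following Python sketch assumes a client object that exposes start_task_execution and describe_task_execution, as the boto3 DataSync client does. The helper name run_task, the polling interval, and the choice of SUCCESS and ERROR as the statuses that end the loop are assumptions of this sketch.

```python
import time

# Statuses that this sketch treats as the end of an execution.
TERMINAL_STATUSES = {"SUCCESS", "ERROR"}


def run_task(client, task_arn: str, poll_seconds: float = 30, sleep=time.sleep):
    """Start a task execution and poll until it reaches a terminal status.

    Returns a (task_execution_arn, final_status) tuple.
    """
    response = client.start_task_execution(TaskArn=task_arn)
    execution_arn = response["TaskExecutionArn"]
    while True:
        details = client.describe_task_execution(TaskExecutionArn=execution_arn)
        status = details["Status"]
        if status in TERMINAL_STATUSES:
            return execution_arn, status
        # Still queued, launching, transferring, or verifying: wait and retry.
        sleep(poll_seconds)


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   arn, status = run_task(boto3.client("datasync"), "arn:aws:datasync:...:task/task-...")
```

The sleep parameter is injectable so that the loop can be exercised against a stub client without waiting. In production use, you would likely add an overall timeout as well.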