Menu
AWS Snowball
User Guide

This guide is for the Snowball (50 TB or 80 TB of storage space). If you are looking for documentation for the Snowball Edge, see the AWS Snowball Edge Developer Guide.

Options for the snowball cp Command

Following, you can find information about snowball cp command options and also syntax guidelines for using this command. You use this command to transfer data from your workstation to a Snowball.

Command Option Description
-b, --batch

String.

Significantly improves the transfer performance for small files by batching them into larger .snowballarchives files. You can change the following defaults to specify when a file is included in a batch:

  • By default, files that are 200 KB or smaller are included in batches. This setting can be changed by specifying the --batchFileSizeInKBLimit option with a new maximum file size, in kilobytes. Maximum file sizes range from 100 KB to 1 MB. Files that are larger than the specified maximum file size are transferred to the Snowball as individual files and not included in any batches.

  • By default, batches hold up to 10,000 files. This limit can be changed by setting the --batchNumOfFiles option. The number of files in a batch can range from 5,000 to 100,000 files.

During import into Amazon S3, batches are extracted and the original files are imported into Amazon S3. Only .snowballarchives files that were created during the copy command with this option are extracted automatically during import.

Important

This option doesn't work with the --resume option. If a copy operation with the --batch option fails, or is stopped before completion, we recommend that you find the .snowballarchives file or files created and delete them manually from your Snowball. When they are deleted, try the batch copy again. Doing so helps prevent wasted space on the Snowball.

--checksum

On and set to false by default.

Calculates a checksum for any source and destination files with the same name, and then compares the checksums. This command option is used when a copy operation is resumed. Using this option adds computational overhead during your copy operation.

Note

When this option isn't used, a faster comparison of just file names and dates occurs when you resume as copy operation.

-f, --force

On and set to false by default. This command option has two uses:

  • When used with a copy command, -f overwrites any existing content on the destination that matches the path and name of the content being transferred.

  • When used after a copy command is run, -f overrides the --resume command option. Instead, your copy operation is performed from the beginning again, overwriting any existing content on the destination with the same path and name.

Note

The preceding use cases are not mutually exclusive. We recommend that you use -f with care to prevent delays in data transfer.

-h, --help

On and set to false by default.

Displays the usage information for the snowball cp command in the terminal.

-r, --recursive

On and set to false by default.

Recursively traverses directories during the snowball cp command's operation.

--resume

On and set to true by default.

When this option is set, if the copy command stops unexpectedly you can resume from where you left off. To do so, run the snowball cp command again with the same options and paths. Content that was already copied over successfully is skipped, and only the content not yet copied to the Snowball is copied.

Important

This option does not work with the --batch option. If a copy operation with the --batch option fails, or is stopped before completion, we recommend that you find the .snowballarchives file or files created and delete them manually from your Snowball. When they are deleted, try the batch copy again. Doing so helps prevent wasted space on the Snowball.

-s, --stopOnError

On and set to false by default.

Stops the snowball cp command's operation if it encounters an error.

In addition to the previously defined Snowball client copy command options, there are some options specific to transferring data from an HDFS cluster. The following table describes those options. For more information on transferring from an HDFS cluster, see Importing Data from HDFS.

Important

The --batch option for the Snowball client's copy command is not supported for HDFS data transfers. If you must transfer a large number of small files from an HDFS cluster, we recommend that you find a method of collecting them into larger archive files, and then transferring those. However, these archives are what is imported into Amazon S3. If you want the files in their original state, take them out of the archives after importing the archives.

HDFS-Specific Command Option Description
--hdfsconfig

Used with the hdfs:// import schema, this option sets the path to a custom XML configuration file on the server running your HDFS cluster. This option must be repeated if you have multiple configuration files. For example, the following specifies two configuration files.

--hdfsconfig src/core/Namenode-site.xml --hdfsconfig /hdfs/corp/conf/hdfs-site.xml

-k

On and set to false by default.

Used with the hdfs:// import schema and the -p option, this option sets the path to the keytab file used to authenticate the Snowball client's connection to the HDFS cluster before then copying data to a Snowball.

Note

You must have both the principal and the keytab registered with the Kerberos authentication server used to authenticate the HDFS cluster. If you recently ran the kinit command on your terminal, then you don't need to specify this option.

-n

On and set to false by default.

Used with the hdfs:// import schema, this option copies data from a nonsecure HDFS cluster.

-p

On and set to false by default.

Used with the hdfs:// import schema and the -k option, this option sets the principal used to authenticate the Snowball client's connection to the HDFS cluster before then copying data to a Snowball.

Note

You must have both the principal and the keytab registered with the Kerberos authentication server used to authenticate the HDFS cluster. If you recently ran the kinit command on your terminal, then you don't need to specify this option.