AWS Snowball
Developer Guide

This guide is for the Snowball Edge. If you are looking for documentation for the Snowball, see the AWS Snowball User Guide.

Administrating a Cluster

Following, you can find information about administrative tasks to operate a healthy cluster of Snowball Edge devices. The primary administrative tasks are covered in the following topics.

Most administrative tasks require that you use the Snowball client and its commands.

Reading and Writing Data to a Cluster

After you've unlocked a cluster, you're ready to read and write data to it. You can use the Amazon S3 Adapter for Snowball to read and write data to a cluster. For more information, see Using the Amazon S3 Adapter.

To write data to a cluster, you must have a read/write quorum, which means no more than one unavailable node. To read data from a cluster, you must have a read quorum, which means no more than two unavailable nodes. For more information on quorums, see Snowball Edge Cluster Quorums.
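The quorum rules above can be expressed as a small check. The following is a minimal sketch in Python, not part of the Snowball client: the names `QuorumStatus` and `quorum_status` are hypothetical, and the only facts it encodes are the two thresholds stated in the preceding paragraph.

```python
from dataclasses import dataclass

# Thresholds taken from the paragraph above: writes tolerate at most one
# unavailable node, and reads tolerate at most two.
MAX_UNAVAILABLE_FOR_WRITE = 1
MAX_UNAVAILABLE_FOR_READ = 2


@dataclass(frozen=True)
class QuorumStatus:
    """Hypothetical result type: what the cluster can currently do."""
    can_read: bool
    can_write: bool


def quorum_status(unavailable_nodes: int) -> QuorumStatus:
    """Map a count of unavailable nodes to read and write capability.

    Hypothetical helper for illustration only. The count would come from
    your own inspection of the cluster status.
    """
    if unavailable_nodes < 0:
        raise ValueError("unavailable_nodes must be non-negative")
    return QuorumStatus(
        can_read=unavailable_nodes <= MAX_UNAVAILABLE_FOR_READ,
        can_write=unavailable_nodes <= MAX_UNAVAILABLE_FOR_WRITE,
    )
```

For example, `quorum_status(2)` reports that the cluster can still be read but not written to, which is the point at which restoring or replacing nodes becomes urgent.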

Reconnecting an Unavailable Cluster Node

A node can become temporarily unavailable due to an issue (like power or network loss) without damaging the data on the node. When this happens, it affects the status of your cluster. A node's network reachability and lock status are reported in the Snowball client by the snowballEdge describe-cluster command.

We recommend that you physically position your cluster so you have access to the front, back, and top of all nodes. This way, you can access the power and network cables on the back, the shipping label on the top to get your node ID, and the LCD screen on the front of the device for the IP address and other administrative information.

When you detect that a node is unavailable, we recommend that you try one of the following procedures, depending on the scenario that caused the unavailability.

To reconnect an unavailable node

  1. Ensure that the node has power.

  2. Ensure that the node is connected to the same internal network that the rest of the cluster is on.

  3. If the node had to be powered on, wait for it to finish powering up.

  4. Run the snowballEdge unlock-cluster command, or the snowballEdge associate-device command. For an example, see Unlocking AWS Snowball Edge Devices.

To reconnect an unavailable node that lost network but didn't lose power

  1. Ensure that the node is connected to the same internal network that the rest of the cluster is on.

  2. Run the snowballEdge describe-device command to see when the previously unavailable node is added back to the cluster. For an example, see Getting Device Status.

After you perform the preceding procedures, your nodes should be working normally and you should have a read/write quorum. If that's not the case, then one or more of your nodes might have a more serious issue and might need to be removed from the cluster.
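The two reconnection procedures, and the fallback to removal, can be summarized as a decision sketch. The following Python is an illustration only: `NextAction`, `next_action`, and all of the boolean inputs are hypothetical names, and you would set the inputs yourself from what you observe on the device and in the Snowball client. The sketch doesn't call the Snowball client.

```python
from enum import Enum


class NextAction(Enum):
    """Hypothetical labels for the steps described in the procedures above."""
    NONE_NEEDED = "node is available; nothing to do"
    RESTORE_POWER = "ensure the node has power and wait for it to power up"
    RESTORE_NETWORK = "connect the node to the cluster's internal network"
    UNLOCK_OR_ASSOCIATE = "run snowballEdge unlock-cluster or associate-device"
    CHECK_DEVICE_STATUS = "run snowballEdge describe-device and wait for rejoin"
    CONSIDER_REMOVAL = "node might be unhealthy; consider removing it"


def next_action(
    *,
    available: bool,
    has_power: bool,
    on_cluster_network: bool,
    lost_power: bool,
    already_retried: bool,
) -> NextAction:
    """Pick the next step for one node, following the order of the procedures."""
    if available:
        return NextAction.NONE_NEEDED
    if not has_power:
        return NextAction.RESTORE_POWER
    if not on_cluster_network:
        return NextAction.RESTORE_NETWORK
    if already_retried:
        # Power and network are fine and reconnecting didn't help.
        return NextAction.CONSIDER_REMOVAL
    if lost_power:
        # A node that was powered off needs to be unlocked or associated again.
        return NextAction.UNLOCK_OR_ASSOCIATE
    # The node lost only its network connection, so check for it to rejoin.
    return NextAction.CHECK_DEVICE_STATUS
```

Keeping the checks in the same order as the written steps (power, then network, then the client command) makes the sketch easy to compare against the procedures.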

Removing an Unhealthy Node from a Cluster

Rarely, a node in your cluster might become unhealthy. If the node is unavailable, we recommend going through the procedures listed in Reconnecting an Unavailable Cluster Node first.

If doing so doesn't resolve the issue, then the node might be unhealthy. An unhealthy node can occur if the node has taken damage from an external source, if there was an unusual electrical event, or if some other unlikely event occurs. If this happens, you need to remove the node from the cluster before you can add a new node as a replacement.

When you detect that a node is unhealthy and needs to be removed, we recommend that you do so with the following procedure.

To remove an unhealthy node

  1. Ensure that the node is unhealthy and not just unavailable. For more information, see Reconnecting an Unavailable Cluster Node.

  2. Disconnect the unhealthy node from the network and power it off.

  3. Run the snowballEdge disassociate-device Snowball client command. For more information, see Removing a Node from a Cluster.

  4. Order a replacement node using the console, the AWS CLI, or one of the AWS SDKs.

  5. Return the unhealthy node to AWS. When we have the node, we perform a complete erasure of the device. This erasure follows the National Institute of Standards and Technology (NIST) 800-88 standards.

After you successfully remove a node, your data is still available on the cluster as long as you still have a read quorum. To have a read quorum, a cluster must have no more than two unavailable nodes. Therefore, we recommend that you order a replacement node as soon as you remove an unhealthy node from the cluster.

Adding or Replacing a Node in a Cluster

You can add a new node after you have removed an unhealthy node from a cluster. You can also add a new node to increase local storage.

To add a new node, you first need to order a replacement. You can order a replacement node from the console, the AWS CLI, or one of the AWS SDKs. If you're ordering a replacement node from the console, you can order replacements for any job that hasn't been canceled or completed.

To order a replacement node from the console

  1. Sign in to the AWS Snowball Management Console.

  2. From the Job dashboard, find and choose a job for a node that belongs to the cluster that you created.

  3. For Actions, choose Replace node.

    Doing this opens the final step of the job creation wizard, with all settings identical to how the cluster was originally created.

  4. Choose Create job.

Your replacement Snowball Edge is now on its way to you. When it arrives, use the following procedure to add it to your cluster.

To add a replacement node

  1. Position the new node so that you have access to the front, back, and top of all nodes in the cluster.

  2. Ensure that the node has power.

  3. Ensure that the node is connected to the same internal network that the rest of the cluster is on.

  4. If the node had to be powered on, wait for it to finish powering up.

  5. Run the snowballEdge associate-device command. For an example, see Adding a Node to a Cluster.