Amazon S3 Glacier - AWS Prescriptive Guidance

Amazon S3 Glacier

Amazon S3 Glacier is a low-cost, cloud-archive storage service that provides secure and durable storage for data archiving and online backup. To keep costs low, S3 Glacier provides three storage classes with retrieval times that range from a few milliseconds to hours. S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive provide additional retrieval options based on how quickly you need to restore the data. With S3 Glacier, you can reliably store large or small amounts of data at significant savings compared to on-premises solutions. S3 Glacier is well suited for storage of backup data with long or indefinite retention requirements and for long-term data archiving. S3 Glacier provides the following storage classes:

  • S3 Glacier Instant Retrieval for archiving data that might be needed once per quarter and needs to be restored quickly (milliseconds)

  • S3 Glacier Flexible Retrieval for archiving data that might infrequently need to be restored, once or twice per year, within a few hours

  • S3 Glacier Deep Archive for archiving long-term backup cycle data that might infrequently need to be restored within 12 hours

The following table summarizes the archive retrieval options.

Storage class                  | Expedited      | Standard        | Bulk
S3 Glacier Instant Retrieval   | Not applicable | Not applicable  | Not applicable
S3 Glacier Flexible Retrieval  | 1–5 minutes    | 3–5 hours       | 5–12 hours
S3 Glacier Deep Archive        | Not available  | Within 12 hours | Within 48 hours

Using Amazon S3, you can set the storage class for each object in your S3 bucket when you create the object. After the object is created, you can change the storage class by copying the object to a new object with a different storage class. Alternatively, you can enable a lifecycle configuration that automatically changes the storage class of objects based on the rules that you specify.
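As a minimal sketch of the lifecycle option, the following Python function builds a lifecycle configuration that transitions objects under a prefix to S3 Glacier Flexible Retrieval and later to S3 Glacier Deep Archive. The function name, rule ID, prefix, and day counts are hypothetical values chosen for illustration. The dictionary follows the rule structure that the Amazon S3 API accepts for bucket lifecycle configurations, where `GLACIER` denotes S3 Glacier Flexible Retrieval and `DEEP_ARCHIVE` denotes S3 Glacier Deep Archive.

```python
def glacier_lifecycle_configuration(rule_id, prefix, flexible_after_days, deep_archive_after_days):
    """Build a lifecycle configuration for objects under `prefix`.

    Objects move to S3 Glacier Flexible Retrieval first, then to
    S3 Glacier Deep Archive. All argument values are illustrative.
    """
    if deep_archive_after_days <= flexible_after_days:
        raise ValueError("Deep Archive transition must come after the Flexible Retrieval transition")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    # GLACIER is the API value for S3 Glacier Flexible Retrieval.
                    {"Days": flexible_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


# Hypothetical rule: archive backups after 30 days, deep-archive after 180 days.
config = glacier_lifecycle_configuration("archive-backups", "backups/", 30, 180)
```

If you use the AWS SDK for Python (Boto3), you could pass this dictionary as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` method. To set the storage class on a single object at creation time instead, the `put_object` method accepts a `StorageClass` argument such as `GLACIER_IR` for S3 Glacier Instant Retrieval.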

To automate your backup and restore processes, you can access Amazon S3 Glacier and S3 Glacier Deep Archive via the AWS Management Console, AWS CLI, and AWS SDKs. For more information, see Amazon S3 Glacier.

Using Amazon S3 Lifecycle object transition to Amazon S3 Glacier compared with managing Amazon S3 Glacier archives

Amazon S3 provides convenient transition of S3 objects into Amazon S3 Glacier storage classes, so that you can manage the lifecycle and costs for your backups. However, depending on the size of the objects and whether you must restore a collection of objects for different components in your architecture, you might want to manage this process yourself.

If you have a large number of small objects that must be restored collectively, consider the cost implications of the following options:

  • Using a lifecycle policy to automatically transition objects individually to Amazon S3 Glacier

  • Zipping objects into a single file and storing that file in Amazon S3 Glacier

Amazon S3 Glacier has minimum capacity charges for each object depending on the storage class you use. For example, S3 Glacier Instant Retrieval has a minimum capacity charge of 128 KB for each object. See the performance chart for the most up-to-date information.

For each object that you archive to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, Amazon S3 uses 8 KB of storage for the object name and other metadata. Amazon S3 stores this metadata so that you can get a real-time list of your archived objects by using the Amazon S3 API. You are charged S3 Standard rates for this additional storage.

Amazon S3 also adds 32 KB of storage for index and related metadata for each object that is archived to the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage class. This extra data is necessary to identify and restore your object. You are charged S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive rates for this additional storage.

By zipping your objects into a single file, you can reduce the additional storage that Amazon S3 Glacier uses and avoid minimum capacity charges for many small objects.
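To make the trade-off concrete, the following sketch compares the per-object overhead of archiving many small objects individually with archiving one bundled file. It uses only the figures stated above (8 KB and 32 KB of metadata per object for S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive, and a 128 KB minimum for S3 Glacier Instant Retrieval). The workload of 100,000 objects of 10 KB each is a hypothetical example, and the sketch assumes that the bundle is as large as the sum of its members (no compression).

```python
STANDARD_METADATA_KB = 8        # per archived object, billed at S3 Standard rates
ARCHIVE_INDEX_KB = 32           # per archived object, billed at the archive class rate
INSTANT_RETRIEVAL_MIN_KB = 128  # minimum billable size per object in Instant Retrieval


def archive_overhead_kb(object_count):
    """Extra storage added for objects in Flexible Retrieval or Deep Archive."""
    return {
        "standard_kb": object_count * STANDARD_METADATA_KB,
        "archive_kb": object_count * ARCHIVE_INDEX_KB,
    }


def instant_retrieval_billable_kb(object_sizes_kb):
    """Billable storage in Instant Retrieval after applying the per-object minimum."""
    return sum(max(size, INSTANT_RETRIEVAL_MIN_KB) for size in object_sizes_kb)


# Hypothetical workload: 100,000 objects of 10 KB each.
sizes_kb = [10] * 100_000

individual_overhead = archive_overhead_kb(len(sizes_kb))  # one archive entry per object
bundled_overhead = archive_overhead_kb(1)                 # one archive entry for the bundle

individual_billable = instant_retrieval_billable_kb(sizes_kb)
bundled_billable = instant_retrieval_billable_kb([sum(sizes_kb)])

print("Overhead, individual objects (KB):", individual_overhead)
print("Overhead, single bundle (KB):", bundled_overhead)
print("Instant Retrieval billable, individual (KB):", individual_billable)
print("Instant Retrieval billable, bundle (KB):", bundled_billable)
```

The sketch reports storage volume only. To estimate cost, multiply each figure by the current rate for the corresponding storage class in your AWS Region.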

Another important consideration is that lifecycle policies are applied to objects individually. This can impact the integrity of your backup if a collection of objects must be restored collectively from a specific point in time. There is no guarantee that all objects transition at the same time even with the same expiration and lifecycle transition time set across objects. There might be a delay between when the lifecycle rule is satisfied and when the action for the rule is complete. For more information, see the AWS Knowledge Center.

Finally, compare the restoration effort of using objects archived by lifecycle policies with that of managing a separate archive that you create. You must initiate a restore for each object in Amazon S3 Glacier separately, so restoring many objects collectively requires you to write a script or use a tool. You can use S3 Batch Operations to help reduce the number of individual requests, or you can use the Amazon S3 console.
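A script for restoring many objects might look like the following sketch. It assumes a client object that exposes a `restore_object` method with the same keyword arguments as the S3 client in the AWS SDK for Python (Boto3). The function name, the `RecordingClient` stand-in, the bucket name, and the object keys are hypothetical and exist only so that the example runs without AWS credentials.

```python
def initiate_restores(s3_client, bucket, keys, days=7, tier="Bulk"):
    """Request a temporary restore for each key.

    Returns a tuple (started, failed), where `failed` holds (key, error message)
    pairs so that one failing object does not stop the remaining requests.
    """
    request = {"Days": days, "GlacierJobParameters": {"Tier": tier}}
    started, failed = [], []
    for key in keys:
        try:
            s3_client.restore_object(Bucket=bucket, Key=key, RestoreRequest=request)
            started.append(key)
        except Exception as error:  # broad on purpose: record the error and continue
            failed.append((key, str(error)))
    return started, failed


class RecordingClient:
    """Stand-in for an S3 client: records requests instead of sending them."""

    def __init__(self):
        self.requests = []

    def restore_object(self, **kwargs):
        self.requests.append(kwargs)


# Dry run with the stand-in client and hypothetical object keys.
dry_run_client = RecordingClient()
started, failed = initiate_restores(
    dry_run_client,
    "amzn-s3-demo-bucket",
    ["backups/db.dump", "backups/app.tar"],
)
print("Restore requested for:", started)
print("Failed:", failed)
```

To run this against a real bucket, you could pass `boto3.client("s3")` in place of the stand-in client. The `Bulk` tier is used as the default here because the table above shows it as available for both S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive.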