Amazon S3 Glacier and S3 Glacier Deep Archive - AWS Prescriptive Guidance

Amazon S3 Glacier and S3 Glacier Deep Archive

Amazon S3 Glacier is a low-cost, cloud-archive storage service that provides secure and durable storage for data archiving and online backup. To keep costs low, Amazon S3 Glacier provides three options for access to archives, from a few minutes to several hours. S3 Glacier Deep Archive provides two access options, which range from 12 to 48 hours. With Amazon S3 Glacier, you can reliably store large or small amounts of data at significant savings compared to on-premises solutions. Amazon S3 Glacier is well suited for storage of backup data with long or indefinite retention requirements and for long-term data archiving. Two archive storage classes are available:

  • Amazon S3 Glacier for archiving data that might infrequently need to be restored within a few hours

  • S3 Glacier Deep Archive for archiving long-term backup cycle data that might infrequently need to be restored within 12 hours

The following table summarizes the archive retrieval options.

Storage class              Expedited        Standard           Bulk
Amazon S3 Glacier          1–5 minutes      3–5 hours          5–12 hours
S3 Glacier Deep Archive    Not available    Within 12 hours    Within 48 hours

Using Amazon S3, you can set the storage class for each object in your S3 bucket when you create the object. After the object is created, you can change its storage class by copying the object to a new object with a different storage class. Alternatively, you can enable a lifecycle configuration that automatically changes the storage class of objects based on the rules you specify.
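The three approaches described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it assumes the boto3 SDK is installed and credentials are configured, and the bucket name, object keys, rule ID, prefix, and 30-day threshold are all hypothetical placeholders.

```python
def glacier_lifecycle_rule(rule_id, prefix, days, storage_class="GLACIER"):
    """Build one lifecycle rule that transitions objects under `prefix`
    to `storage_class` once they are `days` old."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


def apply_examples(bucket="example-backup-bucket"):  # hypothetical bucket name
    import boto3  # imported lazily so the helper above works without the SDK

    s3 = boto3.client("s3")

    # 1. Set the storage class when the object is created.
    s3.put_object(
        Bucket=bucket,
        Key="backups/db.dump",          # hypothetical key
        Body=b"example payload",
        StorageClass="GLACIER",
    )

    # 2. Change the storage class later by copying to a new object.
    s3.copy_object(
        Bucket=bucket,
        Key="archive/app.log",          # hypothetical destination key
        CopySource={"Bucket": bucket, "Key": "logs/app.log"},
        StorageClass="DEEP_ARCHIVE",
    )

    # 3. Let a lifecycle configuration transition objects automatically.
    #    This call replaces any lifecycle configuration already on the bucket.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={
            "Rules": [glacier_lifecycle_rule("archive-backups", "backups/", 30)]
        },
    )
```

Keeping the rule builder separate from the API calls makes it possible to review or unit test the rule before it is applied to a bucket.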

To automate your backup and restore processes, you can access Amazon S3 Glacier and S3 Glacier Deep Archive via the AWS Management Console, AWS CLI, and AWS SDKs. For more information, see Amazon S3 Glacier.

Using Amazon S3 Lifecycle object transition to Amazon S3 Glacier vs. managing Amazon S3 Glacier archives

Amazon S3 provides convenient transition of S3 objects into Amazon S3 Glacier storage classes, so that you can manage the lifecycle and costs for your backups. However, depending on the size of the objects and whether you must restore a collection of objects for different components in your architecture, you might want to manage this process yourself.

If you have a large number of small objects that must be restored collectively, consider the cost implications of the following options:

  • Using a lifecycle policy to automatically transition objects individually to Amazon S3 Glacier

  • Zipping objects into a single file and storing it in Amazon S3 Glacier

For each object you archive to Amazon S3 Glacier or S3 Glacier Deep Archive, Amazon S3 uses 8 KB of storage for the object name and other metadata. Amazon S3 stores this metadata so that you can get a real-time list of your archived objects by using the Amazon S3 API. You are charged S3 Standard rates for this additional storage.

For each object that is archived to Amazon S3 Glacier or S3 Glacier Deep Archive, Amazon S3 also adds 32 KB of storage for index and related metadata. This extra data is necessary to identify and restore your object. You are charged Amazon S3 Glacier or S3 Glacier Deep Archive rates for this additional storage.
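To make the two per-object overheads concrete, the following sketch tallies them for a given number of archived objects. It assumes 1 KB = 1,024 bytes and uses a hypothetical count of one million small objects; it computes storage volume only, not price.

```python
KB = 1024                  # assumption: 1 KB = 1,024 bytes
STANDARD_OVERHEAD_KB = 8   # object name and metadata, billed at S3 Standard rates
ARCHIVE_OVERHEAD_KB = 32   # index and related metadata, billed at archive rates


def metadata_overhead_bytes(object_count):
    """Return (standard_bytes, archive_bytes) of overhead for `object_count` objects."""
    return (
        object_count * STANDARD_OVERHEAD_KB * KB,
        object_count * ARCHIVE_OVERHEAD_KB * KB,
    )


# Hypothetical workload: one million small objects archived individually,
# compared with the same data stored as a single archived object.
individual = metadata_overhead_bytes(1_000_000)
single = metadata_overhead_bytes(1)

print(individual)  # (8192000000, 32768000000)
print(single)      # (8192, 32768)
```

Under these assumptions, archiving the objects individually adds about 7.6 GiB billed at S3 Standard rates and about 30.5 GiB billed at archive rates, whereas a single archived object adds 40 KB in total.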

By zipping your objects into a single file, you can reduce the additional storage used by Amazon S3 Glacier.
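One way to bundle objects before archiving is sketched below with Python's standard zipfile module. The helper names are hypothetical, and the sketch assumes the data is small enough to hold in memory; larger datasets would more likely be written to a file on disk and uploaded from there.

```python
import io
import zipfile


def bundle_objects(objects):
    """Pack a {name: bytes} mapping into one compressed ZIP archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in sorted(objects.items()):
            archive.writestr(name, data)
    return buffer.getvalue()


def list_bundle(bundle):
    """Return the member names stored in a bundle created by bundle_objects."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
        return archive.namelist()


def read_member(bundle, name):
    """Return the bytes of one member of the bundle."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
        return archive.read(name)
```

The resulting bytes can then be uploaded as a single object with an archive storage class, so the per-object metadata overhead is incurred once instead of once per file.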

Another important consideration is that lifecycle policies are applied to objects individually. This can affect the integrity of your backup if a collection of objects must be restored together from a specific point in time. There is no guarantee that all objects transition at the same time, even when the same expiration and lifecycle transition times are set across the objects. There might be a delay between when the lifecycle rule is satisfied and when the action for the rule is complete. For more information, see the AWS Knowledge Center.
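Because transitions can complete at different times, it may help to check which objects in a collection have not yet reached the target storage class. The following sketch assumes boto3 is available and relies on the StorageClass field that the list_objects_v2 API returns for each object; the function names are hypothetical.

```python
def pending_transition(listing, target_class="GLACIER"):
    """Return the keys in `listing` whose storage class is not yet `target_class`.

    `listing` is an iterable of dicts shaped like the `Contents` entries
    returned by list_objects_v2 (each with "Key" and "StorageClass").
    """
    return [
        entry["Key"]
        for entry in listing
        if entry.get("StorageClass", "STANDARD") != target_class
    ]


def list_prefix(bucket, prefix):
    """Yield every object entry under `prefix`, following pagination."""
    import boto3  # imported lazily so pending_transition works without the SDK

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        yield from page.get("Contents", [])
```

For example, pending_transition(list_prefix("example-backup-bucket", "backups/")) would return the keys under that hypothetical prefix that are still waiting to transition.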

Finally, compare the restoration effort of restoring objects that were archived by lifecycle policies with that of restoring a single archive that you create and manage yourself. You must initiate a restore for each object in Amazon S3 Glacier separately. Restoring many objects collectively therefore requires you to write a script or use a tool that initiates a restore for each one. You can use S3 Batch Operations to help reduce the number of individual requests.
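A script of the kind described above might look like the following sketch. It assumes boto3 is available, takes its tier restrictions from the retrieval options table earlier on this page, and uses hypothetical function names and a hypothetical default of 7 days for how long the restored copy stays available.

```python
# Retrieval tiers per storage class, from the retrieval options table.
ALLOWED_TIERS = {
    "GLACIER": {"Expedited", "Standard", "Bulk"},
    "DEEP_ARCHIVE": {"Standard", "Bulk"},
}


def restore_request(storage_class, tier="Bulk", days=7):
    """Build the RestoreRequest payload, rejecting tiers the class does not offer."""
    if tier not in ALLOWED_TIERS[storage_class]:
        raise ValueError(f"{tier} retrieval is not available for {storage_class}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def restore_all(bucket, keys, storage_class, tier="Bulk", days=7):
    """Initiate one restore per key; each object needs its own request."""
    import boto3  # imported lazily so restore_request works without the SDK

    s3 = boto3.client("s3")
    request = restore_request(storage_class, tier, days)
    for key in keys:
        s3.restore_object(Bucket=bucket, Key=key, RestoreRequest=request)
```

Validating the tier before sending any request avoids discovering partway through a large restore that the chosen tier is not offered for the storage class.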