Amazon S3 Glacier
Amazon S3 Glacier is a low-cost cloud archive storage service that provides secure and durable storage for data archiving and online backup. To keep costs low, S3 Glacier provides three storage classes with retrieval times that range from a few milliseconds to hours. S3 Glacier Instant Retrieval offers millisecond access, and S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive provide additional options based on how quickly you need to restore the data. With S3 Glacier, you can reliably store large or small amounts of data at significant savings compared to on-premises solutions. S3 Glacier is well suited for storing backup data with long or indefinite retention requirements and for long-term data archiving. S3 Glacier provides the following storage classes:
- S3 Glacier Instant Retrieval for archiving data that might be needed once per quarter and needs to be restored quickly (milliseconds)
- S3 Glacier Flexible Retrieval for archiving data that might infrequently need to be restored, once or twice per year, within a few hours
- S3 Glacier Deep Archive for archiving long-term backup cycle data that might infrequently need to be restored within 12 hours
The following table summarizes the archive retrieval options.
Storage class | Expedited | Standard | Bulk |
---|---|---|---|
S3 Glacier Instant Retrieval | Not applicable | Not applicable | Not applicable |
S3 Glacier Flexible Retrieval | 1–5 minutes | 3–5 hours | 5–12 hours |
S3 Glacier Deep Archive | Not available | Within 12 hours | Within 48 hours |
Using Amazon S3, you can set the storage class for each object in your S3 bucket when you create it. After the object is created, you can change the storage class by copying the object to a new object with a different storage class. Or you can enable a lifecycle configuration that will automatically change the storage class of the objects based on the rules you specify.
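As a minimal sketch of the last two options, the following Python builds a lifecycle configuration in the shape accepted by the S3 `PutBucketLifecycleConfiguration` API and changes an existing object's storage class by copying it. The helper names, the `tier` keys, and the rule ID format are hypothetical and chosen only for illustration. The `s3_client` argument is assumed to be a boto3 S3 client created elsewhere, and copying an object onto its own key is an assumption of this sketch rather than the only way to perform the copy.

```python
# Storage class identifiers used by the Amazon S3 API.
GLACIER_CLASSES = {
    "instant": "GLACIER_IR",    # S3 Glacier Instant Retrieval
    "flexible": "GLACIER",      # S3 Glacier Flexible Retrieval
    "deep": "DEEP_ARCHIVE",     # S3 Glacier Deep Archive
}


def build_lifecycle_config(prefix, days, tier):
    """Return a lifecycle configuration that transitions objects under
    `prefix` to the chosen Glacier storage class after `days` days."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": f"archive-{tier}-after-{days}d",  # hypothetical naming scheme
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": GLACIER_CLASSES[tier]}
                ],
            }
        ]
    }


def change_storage_class(s3_client, bucket, key, tier):
    """Change an existing object's storage class by copying it.

    Assumption: the copy targets the same bucket and key as the source.
    """
    return s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        StorageClass=GLACIER_CLASSES[tier],
    )
```

With a boto3 client, the configuration could then be applied with `s3_client.put_bucket_lifecycle_configuration(Bucket="example-bucket", LifecycleConfiguration=build_lifecycle_config("backups/", 90, "flexible"))`, where the bucket name and prefix are placeholders. Note that this call replaces any lifecycle configuration already on the bucket.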
To automate your backup and restore processes, you can access Amazon S3 Glacier and S3 Glacier Deep Archive via the AWS Management Console, AWS CLI, and AWS SDKs. For more information, see Amazon S3 Glacier.
Using Amazon S3 Lifecycle object transition to Amazon S3 Glacier compared with managing Amazon S3 Glacier archives
Amazon S3 provides convenient transition of S3 objects into Amazon S3 Glacier storage classes, so that you can manage the lifecycle and costs for your backups. However, depending on the size of the objects and whether you must restore a collection of objects for different components in your architecture, you might want to manage this process yourself.
If you have a large number of small objects that must be restored collectively, consider the cost implications of the following options:
- Using a lifecycle policy to automatically transition objects individually to Amazon S3 Glacier
- Zipping objects into a single file and storing them in Amazon S3 Glacier
Amazon S3 Glacier has minimum capacity charges for each object depending on the storage class you use. For example, S3 Glacier Instant Retrieval has a minimum capacity charge of 128 KB for each object. For details, see the performance chart.
For each object that you archive to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, Amazon S3 uses 8 KB of storage for the object name and other metadata. Amazon S3 stores this metadata so that you can get a real-time list of your archived objects by using the Amazon S3 API. You are charged S3 Standard rates for this additional storage.
Amazon S3 also adds 32 KB of storage for index and related metadata for each object that is archived to the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes. This extra data is necessary to identify and restore your object. You are charged S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive rates for this additional storage.
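To put the per-object overheads into numbers, the following sketch multiplies them out for a given object count. It uses only the figures stated above (8 KB, 32 KB, and the 128 KB minimum for S3 Glacier Instant Retrieval) and assumes 1 KB = 1,024 bytes. The function names are hypothetical, and no prices are included because rates vary by Region and over time.

```python
KB = 1024

STANDARD_METADATA_BYTES = 8 * KB   # per object, billed at S3 Standard rates
ARCHIVE_INDEX_BYTES = 32 * KB      # per object, billed at the archive class's rates


def archive_overhead(object_count):
    """Return (standard_bytes, archive_bytes) of per-object overhead for
    objects in S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive."""
    return (object_count * STANDARD_METADATA_BYTES,
            object_count * ARCHIVE_INDEX_BYTES)


def billable_size(object_size, minimum=128 * KB):
    """Billable bytes for one object in a class with a minimum capacity
    charge (128 KB for S3 Glacier Instant Retrieval)."""
    return max(object_size, minimum)
```

For example, `archive_overhead(1_000_000)` returns `(8192000000, 32768000000)`, or roughly 7.6 GiB billed at S3 Standard rates and 30.5 GiB billed at archive rates. The same data stored as one object incurs `archive_overhead(1)`, which is `(8192, 32768)`.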
By zipping your objects into a single file, you can reduce the additional storage used by Amazon S3 Glacier and avoid minimum capacity charges for many small objects.
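A minimal sketch of the bundling step, using only Python's standard `zipfile` module, might look like the following. The function name and the flat naming of archive members are assumptions made for illustration. Files that share a base name would collide under this scheme, so a real backup job would likely preserve relative paths instead.

```python
import zipfile
from pathlib import Path


def bundle_files(paths, archive_path):
    """Write the given files into one ZIP archive and return its member names."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in map(Path, paths):
            # Store each file under its base name only (assumption: names are unique).
            zf.write(path, arcname=path.name)
    # Reopen the archive to confirm what was actually written.
    with zipfile.ZipFile(archive_path) as zf:
        return zf.namelist()
```

The resulting file can then be uploaded as a single object with an archive storage class, for example with boto3's `upload_file` and `ExtraArgs={"StorageClass": "DEEP_ARCHIVE"}`.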
Another important consideration is that lifecycle policies are applied to objects individually. This can impact the integrity of your backup if a collection of objects must be restored collectively from a specific point in time. There is no guarantee that all objects transition at the same time, even with the same expiration and lifecycle transition time set across objects. There might be a delay between when the lifecycle rule is satisfied and when the action for the rule is complete. For more information, see the AWS Knowledge Center.
Finally, compare the restoration effort of using archives created by lifecycle policies with that of managing a separate archive that you create yourself. You must initiate a restore for each object from Amazon S3 Glacier separately. To restore many objects collectively, you must write a script or use a tool that initiates each restore. You can use S3 Batch Operations to help reduce the number of individual requests, or you can use the Amazon S3 console.
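A script of the kind mentioned above could be sketched as follows. It issues one `RestoreObject` request per key and rejects retrieval tiers that the table in this section lists as unavailable for the storage class. The function name, the default tier, and the seven-day default are hypothetical choices, and `s3_client` is assumed to be a boto3 S3 client created elsewhere.

```python
# Retrieval tiers offered per storage class, following the table in this section.
ALLOWED_TIERS = {
    "GLACIER": {"Expedited", "Standard", "Bulk"},   # S3 Glacier Flexible Retrieval
    "DEEP_ARCHIVE": {"Standard", "Bulk"},           # S3 Glacier Deep Archive
}


def restore_objects(s3_client, bucket, keys, storage_class, tier="Bulk", days=7):
    """Initiate one restore request per key and return the keys requested."""
    if tier not in ALLOWED_TIERS[storage_class]:
        raise ValueError(f"{tier} retrieval is not available for {storage_class}")
    requested = []
    for key in keys:
        s3_client.restore_object(
            Bucket=bucket,
            Key=key,
            RestoreRequest={
                "Days": days,  # how long the restored copy stays available
                "GlacierJobParameters": {"Tier": tier},
            },
        )
        requested.append(key)
    return requested
```

Each call only starts the restore. The object becomes readable once the retrieval completes, which depends on the tier chosen. For very large key sets, S3 Batch Operations avoids issuing the requests one by one from a client.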