Cost - Amazon S3 Glacier Re:Freezer

Cost

You are responsible for the cost of the AWS services used while running this solution. Estimated costs vary based on the number of archives processed and total volume of data to copy from an Amazon S3 Glacier vault.

Table 1 details the estimated costs for two scenarios of running this solution using the default settings in the US East (N. Virginia) Region and Amazon S3 Glacier Deep Archive as the destination S3 storage class.

Scenario 1 is the estimated cost of running the solution to copy 100,000 Amazon S3 Glacier vault archives, totaling 100TB of data, from an Amazon S3 Glacier vault to the destination Amazon S3 bucket and S3 Glacier Deep Archive storage class.

Scenario 2 is the estimated cost of running this solution to copy 10,000,000 Glacier vault archives, totaling 100TB of data, from an Amazon S3 Glacier vault to the destination Amazon S3 bucket and S3 Glacier Deep Archive storage class.

Scenario | Number of Glacier vault archives | Size of vault | Estimated cost (USD)
1 | 100,000 | 100TB | $427.00
2 | 10,000,000 | 100TB | $1,558.00

Table 1: Estimated cost (USD) as of March 2021 in US East (N. Virginia)

The estimated costs in Table 1 include charges for AWS Lambda, AWS Step Functions, Amazon DynamoDB, Amazon SNS, Amazon SQS, Amazon S3, Amazon S3 Glacier, AWS IAM, Amazon Athena, AWS Glue, and Amazon CloudWatch. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this solution.

Table 2 is a cost calculator. Use it to estimate the cost of using this solution to copy your Amazon S3 Glacier vault archives to Amazon S3 Glacier Deep Archive.

Note

The unit cost values shown in the table for items 1-3 are based on US East (N. Virginia) Region pricing.

  1. Based on your Region, enter the AWS unit cost for items 1-3 in column [A] - Unit cost.

  2. Based on your source Glacier vault details, replace the <Size of Vault in gigabytes> and <# of Glacier Vault archives> with your values in column [B] - Your value.

  3. For items 1-5, follow the formula listed in column [C] - Estimated cost to calculate the estimated cost of each line item.

  4. The total estimated cost to use the solution is the sum of all the values in the [C] - Estimated cost column.

Item | Solution component | Type | [A] - Unit cost | [B] - Your value | [C] - Estimated cost
1 | Amazon S3 Glacier - Data Retrievals (Bulk) | Per GB | $0.0025 | <Size of Vault in gigabytes> | [A] * [B]
2 | Amazon S3 Glacier - Retrieval Requests (Bulk) | Per 1,000 requests | $0.025 | <# of Glacier Vault archives> | [A] * [B] / 1000
3 | Amazon S3 Glacier Deep Archive - API Requests (PUT) | Per 1,000 requests | $0.050 | <# of Glacier Vault archives> | [A] * [B] / 1000
4 | Solution Runtime | Per GB | $0.00165 | <Size of Vault in gigabytes> | [A] * [B]
5 | Solution Runtime | Per 1,000 requests | $0.0450 | <# of Glacier Vault archives> | [A] * [B] / 1000

Table 2: Cost estimate equation
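The calculator in Table 2 can also be expressed as a short script. The following is a minimal sketch, not part of the solution itself: the function name estimate_cost and the dictionary keys are illustrative, and the unit costs are the US East (N. Virginia) values from Table 2, so items 1-3 should be replaced with the prices for your Region.

```python
from decimal import Decimal

# Unit costs copied from Table 2 (US East (N. Virginia), March 2021).
# Assumption: items 1-3 are Region-specific and should be overridden as needed.
UNIT_COSTS = {
    "retrieval_per_gb": Decimal("0.0025"),            # item 1
    "retrieval_per_1000_requests": Decimal("0.025"),  # item 2
    "put_per_1000_requests": Decimal("0.050"),        # item 3
    "runtime_per_gb": Decimal("0.00165"),             # item 4
    "runtime_per_1000_requests": Decimal("0.0450"),   # item 5
}


def estimate_cost(vault_size_gb, archive_count, unit_costs=UNIT_COSTS):
    """Return the per-item and total estimated cost (USD) from Table 2."""
    size = Decimal(str(vault_size_gb))
    count = Decimal(str(archive_count))
    items = {
        "item_1": unit_costs["retrieval_per_gb"] * size,
        "item_2": unit_costs["retrieval_per_1000_requests"] * count / 1000,
        "item_3": unit_costs["put_per_1000_requests"] * count / 1000,
        "item_4": unit_costs["runtime_per_gb"] * size,
        "item_5": unit_costs["runtime_per_1000_requests"] * count / 1000,
    }
    # The total is the sum of column [C] across items 1-5.
    items["total"] = sum(items.values())
    return items


if __name__ == "__main__":
    # 100 TB treated as 100,000 GB, with 100,000 archives.
    for name, cost in estimate_cost(100_000, 100_000).items():
        print(f"{name}: ${cost:.2f}")
```

With a vault of 100,000 GB and 100,000 archives, the five line items sum to $427.00. Decimal is used rather than float so that small unit prices do not accumulate rounding error.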

Cost consideration: S3 Glacier vault archive size

The two most common use cases for implementing the Amazon S3 Glacier Re:Freezer solution for copying S3 Glacier vault archives to an Amazon S3 bucket are:

  1. You want to do more with your data, for example, using your archive data to seed your data lake, or for a specific business requirement.

  2. You want to save on the costs of your archived data by copying your S3 Glacier vault archives to Amazon S3 Glacier Deep Archive.

Important

This solution will not provide cost savings in either of the following cases:

  • The average archive size is less than 4MB.

  • The data copied to the S3 Glacier Deep Archive storage class is retained for less than 12 months.

This is because the major cost components of using this solution are the Amazon S3 Glacier retrieval API charges and the Amazon S3 Glacier Deep Archive upload charges. A large number of small archives generates more API calls than a few large archives holding the same amount of data. The one-time cost of running the Amazon S3 Glacier Re:Freezer solution is therefore higher, and it takes longer to realize the cost benefits of using S3 Glacier Deep Archive.

For example, with a large average archive size of 20 MB, it could take approximately 4 months to begin to realize cost savings (from using S3 Glacier Deep Archive instead of an S3 Glacier vault) after factoring in the one-time cost of running the Amazon S3 Glacier Re:Freezer solution. With a smaller average archive size of 4 MB, it could take more than 12 months to begin to realize the same savings.

Note

You can use the following formula to estimate your average archive size in MB and help you determine whether deploying this solution is viable for your use case:

Average size (MB) = (Size-of-Vault-in-gigabytes / Number-of-Glacier-Vault-archives) * 1024
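As an illustration, the formula can be wrapped in a small helper. This is a sketch only: the function names are hypothetical, and the 4 MB default threshold is taken from the Important note in this section.

```python
def average_archive_size_mb(vault_size_gb, archive_count):
    """Average archive size in MB: (vault size in GB / archive count) * 1024."""
    if archive_count <= 0:
        raise ValueError("archive_count must be a positive number")
    return vault_size_gb / archive_count * 1024


def is_below_savings_threshold(average_size_mb, threshold_mb=4.0):
    """True when the average archive size is under the 4 MB guidance."""
    return average_size_mb < threshold_mb


if __name__ == "__main__":
    avg = average_archive_size_mb(100_000, 10_000_000)
    print(f"Average archive size: {avg:.2f} MB")
    print("Below 4 MB threshold:", is_below_savings_threshold(avg))
```

A result below the threshold suggests that the request charges are likely to outweigh the storage savings for that vault.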

Cost consideration: Number of AWS CloudTrails

You can use AWS CloudTrail to log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. When you create additional trails, for example, to capture data or insight on generated events, AWS CloudTrail charges will apply.

Before you deploy the Amazon S3 Glacier Re:Freezer solution, you must validate the number of CloudTrails that have been configured to capture management events. By default, newly created CloudTrails enable management event collection, even though the management events are already processed by the first (and free) trail:

Figure 1: Example trail log events

As of March 2021, Amazon S3 Glacier API calls are classified as management events, and additional copies of management events are charged at $2 per 100,000 events.

For example, if there are 3 CloudTrails configured, and you make 100,000 API calls to Amazon S3 Glacier, the additional charge will be $4. The 100,000 management events captured by the first CloudTrail are free; however, the second and third copies of the same events, captured by the other two CloudTrails, are charged at $2 each.

Any cost impact of running Amazon S3 Glacier Re:Freezer where you have multiple CloudTrails configured will be more evident for Amazon S3 Glacier vaults containing a very large number of archives, as the solution makes multiple API calls per archive.

You can estimate the additional charges in situations where you have more than one CloudTrail configured using the procedure below:

  1. Management events count = ((Vault size in gigabytes) * 0.25) + ((Number of archives) * 2)

  2. AWS CloudTrail charges = $2 * (Management events count) / 100000

Then, multiply the total from item 2 by the number of additional CloudTrails configured to capture the management event activity in the account.

For example, for a 300 terabyte (TB) vault with 10 million archives, each additional CloudTrail with management events enabled will result in:

Management Events Count = (300000 * 0.25 ) + ( 2 * 10000000) = 20075000

AWS CloudTrail charges for each additional trail = $2 * 20075000 / 100000 = $401.50

Using this scenario (300TB vault, 10 million archives), if the Amazon S3 Glacier Re:Freezer solution cost is estimated at approximately $3,645, each additional CloudTrail capturing management events will increase that cost estimate by approximately 11%.
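The two-step procedure and the worked example above can be reproduced with a short script. This is a minimal sketch: the function names are illustrative, and the $2 per 100,000 events price is the March 2021 figure quoted in this section.

```python
def management_events_count(vault_size_gb, archive_count):
    """Step 1: estimated management events generated by the solution."""
    return vault_size_gb * 0.25 + archive_count * 2


def charge_for_events(event_count, additional_trails=1, price_per_100k=2.0):
    """Step 2: charge (USD) for copies of events delivered to additional trails."""
    return price_per_100k * event_count / 100_000 * additional_trails


if __name__ == "__main__":
    # 300 TB vault (300,000 GB) with 10 million archives, one additional trail.
    events = management_events_count(300_000, 10_000_000)
    charge = charge_for_events(events)
    print(f"Management events: {events:,.0f}")
    print(f"Charge per additional trail: ${charge:.2f}")
    # Share of the approximate $3,645 solution cost quoted in the text.
    print(f"Increase over solution cost: {charge / 3645:.0%}")
```

The additional_trails argument counts only the trails beyond the first one, because the first copy of management events is free.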

Note

You must validate your AWS CloudTrail configuration to check whether more than one CloudTrail is configured to capture management events before you deploy the Amazon S3 Glacier Re:Freezer solution. Consider whether there is a practical need to capture copies of the management events in additional CloudTrails on top of the first (and free) trail, and verify whether management event capture can be switched off in the additional CloudTrails for the time you are running the solution.
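One possible way to perform this check programmatically is sketched below. It is an assumption-laden example rather than part of the solution: it assumes the boto3 SDK is installed and credentials are configured, the helper names are hypothetical, and it relies on the CloudTrail DescribeTrails and GetEventSelectors operations. Trails owned by another account or homed in another Region may need extra handling that is not shown here.

```python
def captures_management_events(selectors):
    """Inspect a GetEventSelectors-shaped response for management event capture."""
    # Basic event selectors expose an explicit IncludeManagementEvents flag.
    for selector in selectors.get("EventSelectors") or []:
        if selector.get("IncludeManagementEvents"):
            return True
    # Advanced event selectors filter on the eventCategory field instead.
    for advanced in selectors.get("AdvancedEventSelectors") or []:
        for field in advanced.get("FieldSelectors") or []:
            if (field.get("Field") == "eventCategory"
                    and "Management" in (field.get("Equals") or [])):
                return True
    return False


def list_management_event_trails():
    """Return the names of trails that capture management events."""
    import boto3  # assumption: boto3 is available and credentials are set up

    client = boto3.client("cloudtrail")
    names = []
    for trail in client.describe_trails()["trailList"]:
        selectors = client.get_event_selectors(TrailName=trail["TrailARN"])
        if captures_management_events(selectors):
            names.append(trail["Name"])
    return names


if __name__ == "__main__":
    trails = list_management_event_trails()
    print(f"{len(trails)} trail(s) capture management events: {trails}")
```

If more than one trail is listed, the charges described in this section would apply to each trail beyond the first.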