Optimize Amazon S3 Storage - AWS Storage Optimization

Amazon S3 lets you analyze data access patterns, create inventory lists, and configure lifecycle policies. You can set up rules to automatically move data objects to cheaper S3 storage tiers as objects are accessed less frequently or to automatically delete objects after an expiration date. To manage storage data most effectively, you can use tagging to categorize your S3 objects and filter on these tags in your data lifecycle policies.
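As a minimal sketch of a tag-filtered lifecycle rule, the following Python builds the rule as a plain dictionary and, optionally, applies it with boto3. The helper names, the tag key and value, and the day thresholds (30, 365, 730) are illustrative assumptions, not values from this guide; the boto3 call is only made if you invoke apply_lifecycle yourself with valid credentials.

```python
def build_lifecycle_rule(tag_key, tag_value, ia_after_days=30,
                         glacier_after_days=365, expire_after_days=730):
    """Build one lifecycle rule that tiers down, then expires, tagged objects.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    if not (ia_after_days < glacier_after_days < expire_after_days):
        raise ValueError("thresholds must increase: IA < Glacier < expiration")
    return {
        "ID": f"tier-{tag_key}-{tag_value}",
        # Only objects carrying this tag are affected by the rule.
        "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


def apply_lifecycle(bucket, rules):
    """Send the rules to S3. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the rule builder works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )


rule = build_lifecycle_rule("data-class", "logs")
```

Note that applying a lifecycle configuration replaces the bucket's existing configuration, so include every rule you want to keep in the list passed to apply_lifecycle.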

To determine when to transition data to another storage class, you can use Amazon S3 analytics storage class analysis to analyze storage access patterns. Analyze all the objects in a bucket or use an object tag or common prefix to filter objects for analysis. If you observe infrequent access patterns of a filtered data set over time, you can use the information to choose a more appropriate storage class, improve lifecycle policies, and make predictions around future usage and growth.
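One way to set up storage class analysis for a prefix-filtered subset of a bucket is sketched below. The helper names, configuration ID, prefix, and report bucket ARN are hypothetical placeholders; the dictionary follows the shape boto3 expects for put_bucket_analytics_configuration, and the call itself runs only if you invoke enable_analysis with valid credentials.

```python
def build_analytics_config(config_id, prefix, report_bucket_arn,
                           report_prefix="storage-class-analysis/"):
    """Describe a storage class analysis over objects under one prefix."""
    return {
        "Id": config_id,
        # Analyze only objects whose keys start with this prefix.
        "Filter": {"Prefix": prefix},
        "StorageClassAnalysis": {
            "DataExport": {
                "OutputSchemaVersion": "V_1",
                "Destination": {
                    "S3BucketDestination": {
                        "Format": "CSV",
                        "Bucket": report_bucket_arn,
                        "Prefix": report_prefix,
                    }
                },
            }
        },
    }


def enable_analysis(bucket, config):
    """Attach the analysis to a bucket. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_analytics_configuration(
        Bucket=bucket,
        Id=config["Id"],
        AnalyticsConfiguration=config,
    )


# Hypothetical names, for illustration only.
config = build_analytics_config(
    "logs-analysis", "logs/", "arn:aws:s3:::example-report-bucket"
)
```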

Another management tool is Amazon S3 Inventory, which audits and reports on the replication and encryption status of your S3 objects on a daily or weekly basis. This feature provides CSV output files that list objects and their corresponding metadata, and lets you configure multiple inventory lists for a single bucket, organized by different S3 metadata tags. You can also query Amazon S3 Inventory using standard SQL with Amazon Athena, Amazon Redshift Spectrum, and other tools such as Presto, Apache Hive, and Apache Spark.

Amazon S3 can also publish storage, request, and data transfer metrics to Amazon CloudWatch. Storage metrics are reported daily. Request and data transfer metrics are available at one-minute intervals for granular visibility and can be collected and reported for an entire bucket or a subset of objects (selected via prefix or tags).
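To illustrate reading the daily storage metrics, the sketch below builds a CloudWatch query for the BucketSizeBytes metric in the AWS/S3 namespace. The helper names and the 14-day window are assumptions made for this example; the network call happens only inside fetch_daily_bucket_size, which needs boto3 and credentials.

```python
from datetime import datetime, timedelta, timezone


def build_bucket_size_query(bucket, days=14, storage_type="StandardStorage"):
    """Build get_metric_statistics arguments for one bucket's daily size."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return {
        "Namespace": "AWS/S3",
        "MetricName": "BucketSizeBytes",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": storage_type},
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": 86400,  # one day in seconds, matching daily reporting
        "Statistics": ["Average"],
    }


def fetch_daily_bucket_size(bucket):
    """Return (timestamp, bytes) pairs, oldest first. Requires boto3."""
    import boto3  # imported lazily so the query builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**build_bucket_size_query(bucket))
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Average"]) for p in points]


# "example-bucket" is a placeholder name.
query = build_bucket_size_query("example-bucket")
```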

With the information these storage management tools provide, you can create policies that move less frequently accessed S3 data to cheaper storage tiers for considerable savings. For example, by moving data from Amazon S3 Standard to Amazon S3 Standard-IA, you can save up to 60% (on a per-gigabyte basis) compared with Amazon S3 Standard pricing. By moving data that is at the end of its lifecycle and rarely accessed to Amazon S3 Glacier, you can save up to 80%.

The following table compares the monthly cost of storing 1 petabyte of content on Amazon S3 Standard versus Amazon S3 Standard-IA (the cost includes the content retrieval fee). It shows that if 10% of the content is accessed per month, the savings would be 41% with Amazon S3 Standard-IA. If 50% of the content is accessed, the savings would be 24%, which is still significant. Even if 100% of the content is accessed per month, you would still save 2% using Amazon S3 Standard-IA.

Comparing 1 Petabyte of Object Storage (Based on US East Prices)

Storage        Content Accessed Per Month   S3 Standard   S3 Standard-IA   Savings
1 PB Monthly   10%                          $24,117       $14,116          41%
1 PB Monthly   50%                          $24,117       $18,350          24%
1 PB Monthly   100%                         $24,117       $23,593          2%
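The savings column can be reproduced from the two cost columns. The short sketch below uses only the figures in the table; the function and variable names are made up for this example.

```python
def savings_percent(standard_cost, alternative_cost):
    """Savings of the alternative relative to the S3 Standard cost, in percent."""
    return (standard_cost - alternative_cost) / standard_cost * 100


# (percent of content accessed, S3 Standard cost, S3 Standard-IA cost)
TABLE = [
    (10, 24117, 14116),
    (50, 24117, 18350),
    (100, 24117, 23593),
]

for accessed, standard, standard_ia in TABLE:
    pct = savings_percent(standard, standard_ia)
    print(f"{accessed}% accessed: {pct:.0f}% savings")
# prints:
# 10% accessed: 41% savings
# 50% accessed: 24% savings
# 100% accessed: 2% savings
```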

There is no charge for transferring data between Amazon S3 storage options as long as they are within the same AWS Region.

To further optimize costs associated with storage and data retrieval, AWS launched Amazon S3 Select and Amazon S3 Glacier Select. Traditionally, an object in object storage had to be retrieved as a whole, regardless of its size. Amazon S3 Select lets you retrieve a subset of data from an object using simple SQL expressions, which means that your applications no longer have to spend compute resources scanning and filtering the full object. Using Amazon S3 Select, you can potentially improve query performance by up to 400% and reduce query costs by as much as 80%. AWS also supports efficient data retrieval from Amazon S3 Glacier with Amazon S3 Glacier Select, so that you do not have to restore an entire archived object to find the bytes needed for analytics. With both Amazon S3 Select and Amazon S3 Glacier Select, you can lower your costs and uncover more insights from your data, regardless of what storage tier it's in.
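A rough sketch of an S3 Select request against a CSV object with a header row follows. The bucket, key, and column names are invented for illustration, and the helper functions are hypothetical; the request uses boto3's select_object_content parameters and is sent only if you call run_select with valid credentials. Check whether S3 Select is enabled for your account before relying on it.

```python
def build_select_request(bucket, key, columns, where=None):
    """Build select_object_content arguments that fetch only some columns."""
    column_list = ", ".join(f's."{name}"' for name in columns)
    expression = f"SELECT {column_list} FROM S3Object s"
    if where:
        expression += f" WHERE {where}"
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Treat the first CSV line as column names so they can be referenced.
        "InputSerialization": {
            "CSV": {"FileHeaderInfo": "USE"},
            "CompressionType": "NONE",
        },
        "OutputSerialization": {"CSV": {}},
    }


def run_select(request):
    """Run the query and return the matching rows as text. Requires boto3."""
    import boto3  # imported lazily so the request builder works without boto3

    s3 = boto3.client("s3")
    response = s3.select_object_content(**request)
    chunks = []
    for event in response["Payload"]:  # results arrive as an event stream
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Placeholder bucket, key, and columns.
request = build_select_request(
    "example-bucket", "reports/orders.csv",
    ["order_id", "total"], where="s.\"region\" = 'us-east-1'",
)
```

Only the selected columns of matching rows are returned, which is where the reduction in transferred and scanned data comes from.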