Amazon S3 - AWS Prescriptive Guidance

Amazon S3

You can use Amazon S3 to store and retrieve any amount of data at any time. It can serve as the durable store for your application data and for your file-level backup and restore processes. For example, a backup script that uses the AWS CLI or SDKs can copy your database backups from a database instance to Amazon S3.

AWS services use Amazon S3 for highly durable and reliable storage, as in the following examples:

  • Amazon EC2 uses Amazon S3 to store Amazon EBS snapshots for EBS volumes and for EC2 instance stores.

  • Storage Gateway integrates with Amazon S3 to provide on-premises environments with Amazon S3–backed file shares, volumes, and tape libraries.

  • Amazon RDS uses Amazon S3 for database snapshots.

Many third-party backup solutions also use Amazon S3. For example, Arcserve Unified Data Protection supports Amazon S3 for durable backup of on-premises and cloud-native servers.

You can use the Amazon S3–integrated features of these services to simplify your backup and recovery approach. At the same time, you can benefit from the high durability and availability provided by Amazon S3.

Amazon S3 stores data as objects within resources called buckets. You can store as many objects as you want in a bucket. You can write, read, and delete objects in your bucket with fine-grained access control. Single objects can be up to 5 TB in size.

Amazon S3 offers a range of storage classes designed for different use cases, including the following classes:

  • S3 Standard for general-purpose storage of frequently accessed data (for example, configuration files, unplanned backups, daily backups).

  • S3 Standard-IA for long-lived, but less frequently accessed data (for example, monthly backups). IA stands for infrequent access.

Amazon S3 offers lifecycle policies that you can configure to manage your data throughout its lifecycle. After you set a policy, Amazon S3 migrates your data to the appropriate storage class automatically, without any changes to your application. For more information, see the Amazon S3 object lifecycle management documentation.

To reduce your costs for backup, use a tiered storage class approach based on your recovery time objective (RTO) and recovery point objective (RPO), as in the following example:

  • Daily backups for the past 2 weeks using S3 Standard

  • Weekly backups for the past 3 months using S3 Standard-IA

  • Quarterly backups for the past year using S3 Glacier Flexible Retrieval

  • Yearly backups for the past 5 years using S3 Glacier Deep Archive

  • Backups deleted from S3 Glacier Deep Archive after the 5-year mark

You can automate the transition of your backups by using object lifecycle management.
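
The tiered example above could be expressed as one lifecycle rule per backup frequency. The following Python sketch is illustrative only. It assumes that your backup job writes objects under the hypothetical prefixes daily/, weekly/, quarterly/, and yearly/, that weekly backups are uploaded directly in the STANDARD_IA storage class, and that s3_client is a boto3 S3 client that you create yourself. The rule IDs and day counts are placeholders to adapt to your own retention policy.

```python
# Sketch only: the prefixes, rule IDs, and day counts are assumptions,
# not values prescribed by this guide.

def build_lifecycle_rules():
    """Return lifecycle rules shaped for put_bucket_lifecycle_configuration."""
    return [
        {
            # Daily backups stay in S3 Standard and expire after 2 weeks.
            "ID": "daily-standard-2-weeks",
            "Filter": {"Prefix": "daily/"},
            "Status": "Enabled",
            "Expiration": {"Days": 14},
        },
        {
            # Assumption: weekly backups are uploaded with
            # StorageClass="STANDARD_IA", so only expiration is set here.
            "ID": "weekly-standard-ia-3-months",
            "Filter": {"Prefix": "weekly/"},
            "Status": "Enabled",
            "Expiration": {"Days": 90},
        },
        {
            # "GLACIER" is the API name for S3 Glacier Flexible Retrieval.
            "ID": "quarterly-glacier-1-year",
            "Filter": {"Prefix": "quarterly/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 1, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 365},
        },
        {
            # Yearly backups move to Deep Archive and are deleted after 5 years.
            "ID": "yearly-deep-archive-5-years",
            "Filter": {"Prefix": "yearly/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 1, "StorageClass": "DEEP_ARCHIVE"}],
            "Expiration": {"Days": 5 * 365},
        },
    ]


def apply_lifecycle(s3_client, bucket):
    """Apply the rules to a bucket; s3_client is e.g. boto3.client("s3")."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": build_lifecycle_rules()},
    )
```

Keeping the rule construction separate from the API call makes it possible to review or unit test the policy before it is applied to a bucket.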

Creating standard S3 buckets for backup and archive

You can create a standard S3 bucket for backup and archive with your corporation’s backup and retention policy implemented through S3 lifecycle policies. Cost allocation tagging and reporting for AWS billing are based on the tags assigned at the bucket level. If cost allocation is important, create separate backup and archive S3 buckets for each project or business unit so that you can allocate costs accordingly.
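
As an illustration of bucket-level tagging, the following Python sketch builds a tag set and applies it to a backup bucket. The tag keys Project and BusinessUnit are hypothetical names chosen for this example, and s3_client is assumed to be a boto3 S3 client. Tags generally also need to be activated as cost allocation tags in the billing console before they appear in cost reports.

```python
# Sketch only: the tag keys "Project" and "BusinessUnit" are assumptions.

def cost_allocation_tagging(project, business_unit):
    """Build the Tagging structure that put_bucket_tagging expects."""
    return {
        "TagSet": [
            {"Key": "Project", "Value": project},
            {"Key": "BusinessUnit", "Value": business_unit},
        ]
    }


def tag_backup_bucket(s3_client, bucket, project, business_unit):
    """Tag one backup bucket; s3_client is e.g. boto3.client("s3").

    Note: put_bucket_tagging replaces the bucket's existing tag set.
    """
    s3_client.put_bucket_tagging(
        Bucket=bucket,
        Tagging=cost_allocation_tagging(project, business_unit),
    )
```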

Your backup scripts and applications can use the backup and archive S3 bucket that you create to store point-in-time snapshots for application and workload data. You can create a standard S3 prefix to help you organize your point-in-time data snapshots. For example, if you create hourly backups, consider using a backup prefix such as YYYY/MM/DD/HH/<WorkloadName>/<files...>. By doing this, you can quickly retrieve your point-in-time backups manually or programmatically.
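
The prefix scheme above could be generated and queried as in the following Python sketch. The function names are hypothetical, timestamps are assumed to be in UTC, and s3_client is assumed to be a boto3 S3 client.

```python
from datetime import datetime, timezone


def backup_prefix(when, workload):
    """Return the YYYY/MM/DD/HH/<WorkloadName>/ prefix for a backup time."""
    return f"{when:%Y/%m/%d/%H}/{workload}/"


def backup_key(when, workload, filename):
    """Return the full object key for one backup file."""
    return backup_prefix(when, workload) + filename


def list_backup_keys(s3_client, bucket, when, workload):
    """Yield the keys stored for one workload in one hourly snapshot."""
    paginator = s3_client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=backup_prefix(when, workload))
    for page in pages:
        # "Contents" is absent when no object matches the prefix.
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Example: key for a backup taken at 07:30 UTC on 5 March 2024.
example_key = backup_key(
    datetime(2024, 3, 5, 7, 30, tzinfo=timezone.utc), "orders-db", "dump.sql.gz"
)
```

Because the date components come first, listing by a shorter prefix such as 2024/03/05/ returns every workload's backups for that day.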

Using Amazon S3 versioning to automatically maintain rollback history

You can enable S3 object versioning to maintain a history of object changes, including the ability to revert to a previous version. This is useful for configuration files and other objects that might change more frequently than your point-in-time backup schedule. It’s also useful for files that must be reverted individually.
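
The following Python sketch shows one possible way to enable versioning and roll a single object back to its previous version. It is a simplified illustration: it assumes s3_client is a boto3 S3 client, that the object has fewer versions than one list_object_versions response returns, and that delete markers do not need to be handled. The helper names are hypothetical.

```python
def enable_versioning(s3_client, bucket):
    """Turn on object versioning for a bucket."""
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )


def previous_version_id(versions, key):
    """Return the VersionId of the second-newest version of key, or None.

    versions is a list shaped like the "Versions" entry that
    list_object_versions returns.
    """
    matching = sorted(
        (v for v in versions if v["Key"] == key),
        key=lambda v: v["LastModified"],
        reverse=True,
    )
    if len(matching) < 2:
        return None
    return matching[1]["VersionId"]


def revert_to_previous(s3_client, bucket, key):
    """Copy the previous version over the current one and return its ID.

    The copy becomes a new current version, so the history is preserved.
    """
    response = s3_client.list_object_versions(Bucket=bucket, Prefix=key)
    version_id = previous_version_id(response.get("Versions", []), key)
    if version_id is None:
        raise LookupError(f"No previous version found for {key}")
    s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key, "VersionId": version_id},
    )
    return version_id
```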

Using Amazon S3 to back up and recover customized configuration files for AMIs

Amazon S3 with object versioning can become your system of record for your workload configuration and option files. For example, you might use a standard AWS Marketplace Amazon EC2 image that is maintained by an independent software vendor (ISV). This image might contain software whose configuration is maintained in several configuration files. You can maintain your customized configuration files in Amazon S3. When your instance launches, you can copy these configuration files to your instance as part of your instance user data. With this approach, you don’t need to customize and recreate an AMI to use an updated version.
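
As a rough illustration of this pattern, the following Python sketch generates a user data script that copies configuration files from Amazon S3 at launch. The bucket name, prefix, and target directory are hypothetical. The sketch also assumes that the AWS CLI is installed on the image and that the instance has an IAM role that allows it to read the configuration objects.

```python
import shlex


def config_sync_user_data(bucket, prefix, target_dir):
    """Build a bash user data script that pulls config files from S3."""
    source = f"s3://{bucket}/{prefix.strip('/')}/"
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        # Create the target directory, then copy every object under the prefix.
        f"mkdir -p {shlex.quote(target_dir)}",
        f"aws s3 cp {shlex.quote(source)} {shlex.quote(target_dir)} --recursive",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
user_data = config_sync_user_data("example-config-bucket", "web/prod", "/etc/myapp")
```

The resulting string could be supplied as the user data when you launch the instance, so that a configuration change requires only a new object version in Amazon S3 rather than a new AMI.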

Using Amazon S3 in your custom backup and restore process

Amazon S3 provides a general-purpose backup store that you can quickly integrate into your existing custom backup processes. You can use the AWS CLI, AWS SDKs, and API operations to integrate Amazon S3 into your backup and restore scripts and processes. For example, you might have a database backup script that performs nightly database exports. You can customize this script to copy your nightly backups to Amazon S3 for offsite storage. For an overview of how to do this, see the Batch upload files to the cloud tutorial.
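
The nightly copy step could look like the following Python sketch, which an existing export script might call after it writes the export file. The key layout under nightly/ and the function names are assumptions made for this example, and s3_client is assumed to be a boto3 S3 client.

```python
import os
from datetime import date


def nightly_key(workload, backup_date, path):
    """Return a key such as nightly/<workload>/2024-03-05/<file name>."""
    return f"nightly/{workload}/{backup_date.isoformat()}/{os.path.basename(path)}"


def upload_nightly_backup(s3_client, bucket, workload, path, backup_date=None):
    """Upload one export file to the backup bucket and return its key.

    s3_client is e.g. boto3.client("s3"); upload_file switches to
    multipart upload automatically for large files.
    """
    backup_date = backup_date or date.today()
    key = nightly_key(workload, backup_date, path)
    s3_client.upload_file(path, bucket, key)
    return key
```

Returning the key lets the calling script log exactly where each export was stored, which helps during a restore.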

You can take a similar approach for exporting and backing up data for different applications based on their individual RPO. Additionally, you can use AWS Systems Manager to run your backup scripts on your managed instances. Systems Manager provides automation, access control, scheduling, logging, and notification for your individual backup processes.
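
One possible way to start a backup script through Systems Manager is Run Command with the AWS-RunShellScript document, as in the following Python sketch. The script path and the Backup tag used to select instances are hypothetical, and ssm_client is assumed to be a boto3 Systems Manager client. Scheduling, for example through a maintenance window, is not shown.

```python
def backup_command_request(script_path, tag_key, tag_value):
    """Build send_command arguments that run a script on tagged instances."""
    return {
        "DocumentName": "AWS-RunShellScript",
        # Target every managed instance that carries the given tag.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": [script_path]},
        "Comment": "Run backup script and copy output to Amazon S3",
    }


def run_backup(ssm_client, script_path, tag_key="Backup", tag_value="nightly"):
    """Send the command and return its ID for later status checks.

    ssm_client is e.g. boto3.client("ssm").
    """
    response = ssm_client.send_command(
        **backup_command_request(script_path, tag_key, tag_value)
    )
    return response["Command"]["CommandId"]
```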