Backup and recovery using Amazon S3
You can use Amazon Simple Storage Service (Amazon S3) to store and retrieve any amount of data, at any time. You can use Amazon S3 as your durable store for your application data and file-level backup and restore processes. For example, you can copy your database backups from a database instance to Amazon S3 with a backup script using the AWS CLI or AWS SDKs.
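For instance, the following minimal sketch uses the AWS SDK for Python (boto3) to copy a local database backup to a bucket. The bucket name, object key, and file path are placeholders for illustration.

```python
import boto3

s3 = boto3.client("s3")

# Copy a local database backup file to S3 for durable, offsite storage.
# upload_file transparently switches to multipart uploads for large files.
s3.upload_file(
    "/var/backups/mydb.sql.gz",   # placeholder local backup file
    "example-backup-bucket",      # placeholder bucket name
    "database/mydb/mydb.sql.gz",  # placeholder object key
)
```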
AWS services use Amazon S3 for highly durable and reliable storage, as in the following examples:
- Amazon EC2 uses Amazon S3 to store Amazon EBS snapshots of EBS volumes and backups of EC2 instance stores.
- Storage Gateway integrates with Amazon S3 to provide on-premises environments with Amazon S3 backed file shares, volumes, and tape libraries.
- Amazon RDS uses Amazon S3 for database snapshots.
Many third-party backup solutions also use Amazon S3. For example, Arcserve Unified Data Protection supports Amazon S3 for durable backup of on-premises and cloud-native servers.
You can use the Amazon S3 integrated features of these services to simplify your backup and recovery approach. At the same time, you can benefit from the high durability and availability provided by Amazon S3.
Amazon S3 stores data as objects within resources called buckets. You can store as many objects as you want in a bucket. You can write, read, and delete objects in your bucket with fine-grained access control. Single objects can be up to 5 TB in size.
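As a brief sketch of this object model, the following boto3 example (with a placeholder bucket and key) writes, reads, and then deletes a single object:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-backup-bucket"  # placeholder name for illustration

# Write: store a small configuration file as an object.
s3.put_object(
    Bucket=bucket,
    Key="config/app.ini",
    Body=b"[app]\nretention_days = 14\n",
)

# Read: retrieve the object's contents.
body = s3.get_object(Bucket=bucket, Key="config/app.ini")["Body"].read()
print(body.decode())

# Delete: remove the object from the bucket.
s3.delete_object(Bucket=bucket, Key="config/app.ini")
```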
Using Amazon S3 storage classes to reduce backup data storage costs
Amazon S3 offers multiple storage classes for use in on-premises, hybrid, and cloud-native architectures. All storage classes provide scalable capacity that requires no volume or media management as your backup datasets grow. The pay-for-what-you-use model and low cost per GB/month make Amazon S3 storage classes a good fit for a broad range of data-protection use cases. Amazon S3 storage classes are designed for different use cases, including the following categories:
- Frequent access storage classes for general-purpose storage of frequently accessed data (for example, configuration files, unplanned backups, daily backups). This category includes the S3 Standard storage class, which is the default for all Amazon S3 objects.
- Infrequent access storage classes for long-lived but infrequently accessed data (for example, monthly backups). This category includes the S3 Standard-IA storage class, where IA stands for infrequent access.
- S3 Glacier storage classes for extremely long-lived data that rarely needs to be accessed (for example, yearly backups). This category includes S3 Glacier Deep Archive, which provides the lowest-cost storage on AWS.
For backups with unknown or changing access patterns, you can use the S3 Intelligent-Tiering storage class. S3 Intelligent-Tiering automatically transitions objects to the most cost-effective tier based on how many days ago an object was last accessed.
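To place a backup directly into one of these storage classes, you can set the storage class at upload time. A minimal boto3 sketch follows; the bucket, file, and key names are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-backup-bucket"  # placeholder

# Monthly backup: infrequently accessed, so store it in S3 Standard-IA.
s3.upload_file("monthly.tar.gz", bucket, "monthly/2025-01.tar.gz",
               ExtraArgs={"StorageClass": "STANDARD_IA"})

# Yearly backup: rarely accessed, so store it in S3 Glacier Deep Archive.
s3.upload_file("yearly.tar.gz", bucket, "yearly/2025.tar.gz",
               ExtraArgs={"StorageClass": "DEEP_ARCHIVE"})

# Unknown access pattern: let S3 Intelligent-Tiering pick the tier.
s3.upload_file("adhoc.tar.gz", bucket, "adhoc/2025-01-15.tar.gz",
               ExtraArgs={"StorageClass": "INTELLIGENT_TIERING"})
```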
Note
Amazon S3 offers lifecycle policies that you can configure to manage your data throughout its lifecycle. After a policy is set, your data will be automatically migrated to the appropriate storage class without any changes to your application. For more information, see the Amazon S3 object lifecycle management documentation.
To reduce your costs for backup, use a tiered storage class approach based on your recovery time objective (RTO) and recovery point objective (RPO), as in the following example (implemented as a lifecycle configuration in the sketch after this list):
- Daily backups for the past 2 weeks using S3 Standard
- Weekly backups for the past 3 months using S3 Standard-IA
- Quarterly backups for the past year on S3 Glacier Flexible Retrieval
- Yearly backups for the past 5 years on S3 Glacier Deep Archive
- Backups deleted from S3 Glacier Deep Archive after the 5-year mark
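The following sketch implements this tiered schedule as an S3 lifecycle configuration with boto3. The daily/, weekly/, quarterly/, and yearly/ prefixes are assumptions about how the backup sets are organized. Note that S3 requires objects to be stored for at least 30 days before they can transition to S3 Standard-IA, so the weekly rule transitions at day 30.

```python
import boto3

s3 = boto3.client("s3")

# Prefixes (daily/, weekly/, ...) are illustrative assumptions about
# how the backup sets are organized within the bucket.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-backup-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            # Daily backups: keep 2 weeks in S3 Standard, then delete.
            {"ID": "daily", "Filter": {"Prefix": "daily/"}, "Status": "Enabled",
             "Expiration": {"Days": 14}},
            # Weekly backups: move to S3 Standard-IA (30-day minimum
            # before IA transitions), keep for 3 months.
            {"ID": "weekly", "Filter": {"Prefix": "weekly/"}, "Status": "Enabled",
             "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
             "Expiration": {"Days": 90}},
            # Quarterly backups: S3 Glacier Flexible Retrieval for 1 year.
            {"ID": "quarterly", "Filter": {"Prefix": "quarterly/"}, "Status": "Enabled",
             "Transitions": [{"Days": 0, "StorageClass": "GLACIER"}],
             "Expiration": {"Days": 365}},
            # Yearly backups: S3 Glacier Deep Archive, deleted after 5 years.
            {"ID": "yearly", "Filter": {"Prefix": "yearly/"}, "Status": "Enabled",
             "Transitions": [{"Days": 0, "StorageClass": "DEEP_ARCHIVE"}],
             "Expiration": {"Days": 1825}},
        ]
    },
)
```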
Creating standard S3 buckets for backup and archive
You can create a standard S3 bucket for backup and archive with your corporation’s backup and retention policy implemented through S3 lifecycle policies. Cost allocation tagging and reporting for AWS billing is based on the tags assigned at the bucket level. If cost allocation is important, create separate backup and archive S3 buckets for each project or business unit so that you can allocate costs accordingly.
Your backup scripts and applications can use the backup and archive S3 bucket that you create to store point-in-time snapshots for application and workload data. You can create a standard S3 prefix to help you organize your point-in-time data snapshots. For example, if you create hourly backups, consider using a backup prefix such as YYYY/MM/DD/HH/<WorkloadName>/<files...>. By doing this, you can quickly retrieve your point-in-time backups manually or programmatically.
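As a sketch of this naming scheme (boto3, with placeholder bucket, workload, and file names), the following builds an hourly YYYY/MM/DD/HH/<WorkloadName>/ key and later lists the backups for that hour:

```python
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
bucket = "example-backup-bucket"  # placeholder
workload = "orders-db"            # placeholder workload name

# Build the YYYY/MM/DD/HH/<WorkloadName>/ prefix from the current UTC hour.
now = datetime.now(timezone.utc)
prefix = f"{now:%Y/%m/%d/%H}/{workload}/"

# Store this hour's snapshot under the point-in-time prefix.
s3.upload_file("snapshot.tar.gz", bucket, prefix + "snapshot.tar.gz")

# Retrieve the snapshots for a given point in time programmatically.
response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```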
Using Amazon S3 versioning to automatically maintain rollback history
You can enable S3 object versioning to maintain a history of object changes, including the ability to revert to a previous version. This is useful for configuration files and other objects that might change more frequently than your point-in-time backup schedule. It’s also useful for files that must be reverted individually.
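A minimal sketch of this pattern with boto3 (bucket and key names are placeholders): enable versioning, then roll an object back by copying an older version on top as the new current version.

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-backup-bucket", "config/app.ini"  # placeholders

# Enable versioning so every overwrite keeps the prior version.
s3.put_bucket_versioning(
    Bucket=bucket, VersioningConfiguration={"Status": "Enabled"}
)

# List the object's version history (newest first for a single key).
versions = s3.list_object_versions(Bucket=bucket, Prefix=key).get("Versions", [])
for v in versions:
    print(v["VersionId"], v["LastModified"], "latest" if v["IsLatest"] else "")

# Revert: copy a previous version over the current one, making it the
# latest version (the full version history itself is preserved).
if len(versions) > 1:
    previous = versions[1]  # versions[0] is the current version
    s3.copy_object(
        Bucket=bucket, Key=key,
        CopySource={"Bucket": bucket, "Key": key,
                    "VersionId": previous["VersionId"]},
    )
```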
Using Amazon S3 to back up and recover customized configuration files for AMIs
Amazon S3 with object versioning can become your system of record for your workload configuration and option files. For example, you might use a standard AWS Marketplace Amazon EC2 image that is maintained by an ISV. This image might contain software whose configuration is maintained in a number of configuration files. You can maintain your customized configuration files in Amazon S3. When your instance is launched, you can copy these configuration files to your instance as a part of your instance user data. When you apply this approach, you don’t need to customize and recreate an AMI to use an updated version.
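One way to sketch this approach follows, with the assumptions that a Python script is invoked from the instance user data at launch and that the customized files live under a config/<WorkloadName>/ prefix; the bucket and paths are placeholders.

```python
import os

import boto3

s3 = boto3.client("s3")
bucket = "example-backup-bucket"  # placeholder
prefix = "config/orders-app/"     # placeholder prefix for this workload
dest_dir = "/etc/orders-app"      # placeholder destination on the instance

# Download every customized configuration file for this workload.
# This script could be invoked from the EC2 instance user data at launch,
# so a standard AMI picks up the current configuration without rebuilding.
for obj in s3.list_objects_v2(Bucket=bucket, Prefix=prefix).get("Contents", []):
    filename = obj["Key"][len(prefix):]
    if not filename:
        continue  # skip the prefix marker object, if present
    local_path = os.path.join(dest_dir, filename)
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    s3.download_file(bucket, obj["Key"], local_path)
```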
Using Amazon S3 in your custom backup and restore process
Amazon S3 provides a general-purpose backup store that you can quickly integrate into your existing custom backup processes. You can use the AWS CLI, AWS SDKs, and API operations to integrate Amazon S3 into your backup and restore scripts and processes. For example, you might have a database backup script that performs nightly database exports. You can customize this script to copy your nightly backups to Amazon S3 for offsite storage. For a step-by-step example, see the Batch upload files to the cloud tutorial.
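As an illustration, the following sketch extends a nightly export in this way. The export command, bucket name, and key prefix are hypothetical placeholders; substitute your database's own export tooling.

```python
import subprocess
from datetime import datetime, timezone

import boto3

# Hypothetical nightly export step; replace with your database's
# own dump or export command.
dump_path = "/var/backups/nightly.sql.gz"
subprocess.run(
    ["sh", "-c", f"pg_dump mydb | gzip > {dump_path}"], check=True
)

# Offsite copy: upload the export to Amazon S3 under a dated key.
s3 = boto3.client("s3")
key = f"nightly/{datetime.now(timezone.utc):%Y/%m/%d}/nightly.sql.gz"
s3.upload_file(dump_path, "example-backup-bucket", key)  # placeholder bucket
```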
You can take a similar approach for exporting and backing up data for different applications based on their individual RPO. Additionally, you can use AWS Systems Manager to run your backup scripts on your managed instances. Systems Manager provides automation, access control, scheduling, logging, and notification for your individual backup processes.
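For example, the following sketch starts such a backup script on a managed instance through Systems Manager Run Command; the instance ID and script path are placeholders.

```python
import boto3

ssm = boto3.client("ssm")

# Run the backup script on a managed instance by using the
# AWS-RunShellScript document; Systems Manager records the invocation.
response = ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],  # placeholder instance ID
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["/opt/backup/nightly-backup.sh"]},  # placeholder
)
print("Command ID:", response["Command"]["CommandId"])
```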
Securing backup data in Amazon S3
Data security is a universal concern, and AWS takes security very seriously. Security is the foundation of every AWS service. Amazon S3 provides capabilities for access control and for encryption both at rest and in transit. All Amazon S3 endpoints support SSL/TLS for encrypting data in transit. For data at rest, you can use server-side encryption with Amazon S3 managed keys (SSE-S3), with AWS KMS keys (SSE-KMS), or with customer-provided keys (SSE-C), or you can encrypt your data on the client side before you upload it.
You can use AWS Identity and Access Management (IAM) to control access to S3 objects. IAM provides control over permissions for individual objects and specific prefix paths within an S3 bucket. You can audit access to S3 objects by using object-level logging with AWS CloudTrail.
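For instance, the following sketch sets default server-side encryption on a backup bucket with an AWS KMS key; the bucket name and key alias are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Apply SSE-KMS as the default encryption for all new objects in the
# bucket; S3 Bucket Keys reduce the volume of requests to AWS KMS.
s3.put_bucket_encryption(
    Bucket="example-backup-bucket",  # placeholder
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/backup-key",  # placeholder alias
                },
                "BucketKeyEnabled": True,
            }
        ]
    },
)
```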