Backup and recovery from on-premises infrastructure to AWS - AWS Prescriptive Guidance

You can use AWS for durable, offsite storage of your on-premises infrastructure backups. By using AWS storage services in this scenario, you can focus on backup and archiving tasks. You don’t have to provision, scale, or plan capacity for the storage infrastructure behind your backups.

Amazon S3 and Amazon S3 Glacier provide extensive API operations and SDKs for integrating these services into your new and existing backup and recovery approaches. This also gives backup software vendors ways to directly integrate their applications with AWS storage solutions.

In this scenario, backup and archive software that you are using in your on-premises infrastructure directly interfaces with AWS through the API operations. Because the backup software is AWS-aware, it backs up the data from the on-premises servers directly to Amazon S3 or Amazon S3 Glacier.
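
As a rough illustration of what an AWS-aware backup tool does through the API operations, the following minimal sketch uploads one backup file to Amazon S3. It assumes the boto3 SDK; the function name, key prefix, and encryption setting are hypothetical choices for this example, not part of any particular backup product.

```python
import os


def upload_backup(s3_client, local_path, bucket, key_prefix="backups/"):
    """Upload one backup file to Amazon S3 and return the object key.

    s3_client is expected to behave like boto3.client("s3").
    The key layout (prefix + file name) is an assumption for this sketch.
    """
    key = key_prefix + os.path.basename(local_path)
    s3_client.upload_file(
        local_path,
        bucket,
        key,
        # Request server-side encryption with Amazon S3 managed keys.
        ExtraArgs={"ServerSideEncryption": "AES256"},
    )
    return key
```

With boto3 installed and credentials configured, a call might look like upload_backup(boto3.client("s3"), "/var/backups/db.tar.gz", "example-backup-bucket"), where the path and bucket name are placeholders. Passing the client in as a parameter keeps the function easy to exercise with a stub.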

If your existing backup software does not natively support the AWS Cloud, you can use Storage Gateway. Storage Gateway is a cloud storage service that gives your on-premises systems access to scalable cloud storage. It supports open standard storage protocols that work with your existing applications, and it stores your data encrypted in Amazon S3 or Amazon S3 Glacier. You can use Storage Gateway as part of a backup and recovery approach for your on-premises block-based storage workloads.

Storage Gateway is helpful in hybrid scenarios where you want to transition to cloud-based storage for your backups. Storage Gateway also helps you reduce capital investments in on-premises storage. You deploy Storage Gateway as a VM or a dedicated hardware appliance. This guide focuses on how Storage Gateway applies to backup and recovery.

Storage Gateway provides three different options to satisfy different requirements:

  • A file gateway for storing application data files and backup images as durable objects on Amazon S3 cloud storage using SMB-based or NFS-based access.

  • A volume gateway for presenting cloud-based iSCSI block storage volumes to your on-premises applications. A volume gateway provides either a local cache or full volumes on premises while also storing full copies of your volumes in the AWS Cloud.

  • A tape gateway for pointing trusted backup software at an on-premises storage gateway that, in turn, connects to Amazon S3 and Amazon S3 Glacier. This option delivers the scale and durability of the cloud for safe, long-term retention without disrupting existing investments or processes.

File gateway

Many organizations start their cloud journey by moving secondary and tertiary data, such as backups, to the cloud. A file gateway’s SMB and NFS interface support provides a way for IT groups to transition backup jobs from existing on-premises backup systems to the cloud. Backup applications, native database tools, or scripts that can write to SMB or NFS can write to a file gateway. The file gateway stores the backups as Amazon S3 objects of up to 5 TiB in size. With an adequately sized local cache, recent backups can be used for fast on-site recoveries. Long-term retention needs are addressed by tiering backups to low-cost S3 Standard-Infrequent Access and Amazon S3 Glacier storage tiers.
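
The tiering described above is typically expressed as an S3 Lifecycle configuration on the bucket behind the gateway. The following sketch, assuming boto3, builds such a configuration and applies it. The rule ID, prefix, and day thresholds are illustrative assumptions, not recommended values.

```python
def build_backup_lifecycle(prefix="backups/", ia_after_days=30, glacier_after_days=90):
    """Return a lifecycle configuration that tiers objects under prefix.

    Objects move to S3 Standard-IA first and to the GLACIER storage class later.
    The default thresholds are placeholders for this sketch.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("glacier_after_days must be greater than ia_after_days")
    return {
        "Rules": [
            {
                "ID": "tier-backups",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


def apply_backup_lifecycle(s3_client, bucket, **kwargs):
    """Apply the configuration; s3_client should behave like boto3.client("s3")."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_backup_lifecycle(**kwargs),
    )
```

Keeping the configuration builder separate from the API call makes it possible to review or test the rule before it is applied to a bucket.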

A file gateway provides an on-ramp for your block-based storage to Amazon S3 for highly durable offsite backups. It is especially useful when a recently backed-up file must be restored quickly. Because a file gateway supports the SMB and NFS protocols, users can access files the same way they would access a network file share. You can also take advantage of Amazon S3 object versioning: you can restore a previous version of the object that backs a file and then access it by using SMB or NFS.
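
One possible way to restore a previous object version is to copy it over the current one, which makes it the latest version again. The sketch below assumes boto3 and a bucket with versioning enabled; the helper names are hypothetical. It reads only the first page of list_object_versions results, and copy_object is limited in the object size it can copy in a single request, so very large backups would need a multipart copy instead.

```python
def previous_version_id(list_versions_response, key):
    """Return the VersionId of the newest non-current version of key, or None."""
    candidates = [
        v for v in list_versions_response.get("Versions", [])
        if v["Key"] == key and not v["IsLatest"]
    ]
    if not candidates:
        return None
    newest = max(candidates, key=lambda v: v["LastModified"])
    return newest["VersionId"]


def restore_previous_version(s3_client, bucket, key):
    """Copy the previous version of key on top of the current one.

    s3_client is expected to behave like boto3.client("s3").
    """
    response = s3_client.list_object_versions(Bucket=bucket, Prefix=key)
    version_id = previous_version_id(response, key)
    if version_id is None:
        raise LookupError(f"no previous version found for {key}")
    s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key, "VersionId": version_id},
    )
    return version_id
```

Because this change is made directly in Amazon S3 rather than through the file share, the gateway's cache may need to be refreshed before the restored file is visible over SMB or NFS.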

Volume gateway

A volume gateway enables you to provision cloud-based iSCSI block storage volumes for your on-premises servers. The volume gateway stores your volume data in Amazon S3 for durable, scalable, cloud-based offsite storage. A volume gateway facilitates taking full point-in-time snapshots of your volumes and storing them in the cloud as Amazon EBS snapshots. After they are stored as snapshots, whole volumes can be restored as EBS volumes and attached to EC2 instances, which accelerates a cloud-based DR solution. The volumes can also be restored to Storage Gateway, so that your on-premises applications can revert to a previous state.
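
The snapshot and restore flow described above can be sketched with two API calls, assuming boto3. The function names are hypothetical, and the ARN, description, and Availability Zone are supplied by the caller. Waiting for the snapshot to complete and attaching the volume to an instance are left out.

```python
def snapshot_volume(storagegateway_client, volume_arn, description):
    """Start a point-in-time snapshot of a gateway volume and return its ID.

    storagegateway_client should behave like boto3.client("storagegateway").
    """
    response = storagegateway_client.create_snapshot(
        VolumeARN=volume_arn,
        SnapshotDescription=description,
    )
    return response["SnapshotId"]


def restore_snapshot_as_ebs_volume(ec2_client, snapshot_id, availability_zone):
    """Create an EBS volume from the snapshot and return the new volume ID.

    ec2_client should behave like boto3.client("ec2").
    """
    response = ec2_client.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )
    return response["VolumeId"]
```

The snapshot must finish before a volume can be created from it, so a real DR runbook would check the snapshot status between the two steps.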


[Diagram: application servers and an on-premises host with a Storage Gateway virtual machine communicating through SSL with Storage Gateway on AWS, along with Amazon S3, Amazon EC2, and Amazon EBS.]

Because a volume gateway integrates with the Amazon EBS volume feature of Amazon EC2, you can use AWS Backup to automate and schedule your snapshot process. A volume gateway provides you with the added benefits of durable, Amazon S3–backed Amazon EBS snapshots and tagging features. For more information, see the Amazon EBS snapshot documentation.

Tape gateway

A tape gateway offers the high durability, low-cost tiered storage, and extensive features of Amazon S3 and Amazon S3 Glacier for your offsite virtual tape backup store. All virtual tapes stored in Amazon S3 and Amazon S3 Glacier are replicated and stored across at least three geographically dispersed Availability Zones. Your virtual tapes are protected by 99.999999999 percent (11 nines) durability.

AWS also performs fixity checks on a regular basis to confirm that your data can be read and that no errors have been introduced. All tapes stored in Amazon S3 are protected by server-side encryption using default keys or your AWS KMS keys. In addition, you avoid the physical security risks associated with tape portability. With offsite warehousing of tapes, you might receive an incorrect or broken tape during a restore. With a tape gateway, you get the correct data.

You can save on monthly storage costs when storing your data in Amazon S3. You can save even more for your long-term archival requirements by using S3 Glacier Deep Archive.


[Diagram: an on-premises tape gateway, and a tape library and tape shelf on AWS.]

A tape gateway acts as a virtual tape library (VTL) that spans from your on-premises environment to highly scalable, redundant, and durable storage services: Amazon S3, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive.

The tape gateway presents Storage Gateway to your existing backup application as an open standard iSCSI-based VTL, with a virtual media changer and virtual tape drives. You can continue to use your existing backup applications and workflows while writing to a collection of virtual tapes stored on massively scalable Amazon S3. When you no longer require immediate or frequent access to the data on a virtual tape, your backup application can archive it into S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, further reducing storage costs.

You can typically retrieve a tape that is archived in S3 Glacier Flexible Retrieval within 3–5 hours, and a tape that is archived in S3 Glacier Deep Archive within 12 hours. The tape gateway can be used with any backup application that is compatible with the iSCSI-based tape library interface for accessing the virtual tapes. Also note that each virtual tape has a minimum size of 100 GB. For more information, review the list of third-party backup applications that support tape gateways.
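
To make the tape size and retrieval points concrete, the following sketch, assuming boto3, builds the parameters for creating virtual tapes and requests retrieval of an archived tape. The helper names are hypothetical, and the size check assumes the 100-GB minimum is counted in GiB.

```python
GIB = 1024 ** 3
MIN_TAPE_SIZE_BYTES = 100 * GIB  # assumption: the 100-GB minimum is counted in GiB


def build_create_tapes_request(gateway_arn, tape_size_gib, count, barcode_prefix, client_token):
    """Return keyword arguments for the Storage Gateway create_tapes call."""
    size_bytes = tape_size_gib * GIB
    if size_bytes < MIN_TAPE_SIZE_BYTES:
        raise ValueError("tape size is below the 100 GiB minimum")
    return {
        "GatewayARN": gateway_arn,
        "TapeSizeInBytes": size_bytes,
        "ClientToken": client_token,  # caller-chosen token that makes retries idempotent
        "NumTapesToCreate": count,
        "TapeBarcodePrefix": barcode_prefix,
    }


def retrieve_archived_tape(storagegateway_client, tape_arn, gateway_arn):
    """Request retrieval of an archived tape to the given gateway.

    storagegateway_client should behave like boto3.client("storagegateway").
    The call only starts the retrieval; the tape becomes available later.
    """
    response = storagegateway_client.retrieve_tape_archive(
        TapeARN=tape_arn,
        GatewayARN=gateway_arn,
    )
    return response["TapeARN"]
```

The request built by build_create_tapes_request can be passed as storagegateway_client.create_tapes(**request). In most deployments the backup application drives tape creation and retrieval itself, so calls like these are mainly useful for automation around it.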