EFS-to-EFS Backup Solution

Design Considerations

Incremental Backups

This solution captures the state of an Amazon EFS file system at a point in time. The first time the solution creates a backup, it copies the entire file system (or the entire subdirectory you specified), provided that the backup window is long enough to complete the copy. When the solution creates future backups for that file system, it transfers only the changes since the last backup: files and directories that were modified, added, or removed.

We recommend that you launch the solution for the first time with a large instance type, and a backup window large enough to copy your entire file system. After the first backup completes, update the running stack with a smaller instance type and backup window to reduce costs and save burst credits.

Consistent Backups

The EFS-to-EFS backup solution uses fpsync to copy the source file system to the backup file system. fpsync synchronizes directories in parallel using fpart (which sorts and packs files into partitions) and rsync (a file-copying tool). This solution might exclude any data written while fpsync or the backup process is running. To ensure consistent backups, we recommend that you do not write to the source Amazon Elastic File System (Amazon EFS) file system for the duration of the backup process. After the solution creates the initial backup, future backups are incremental and take less time to complete.

Sizing and Capacity

This solution uses a single Amazon Elastic Compute Cloud (Amazon EC2) instance. The maximum throughput you can drive per NFS client on an Amazon EC2 instance is 250 MB/s. All Amazon EFS file systems, regardless of size, can burst to 100 MiB/s of throughput, and file systems larger than 1 TiB can burst to 100 MiB/s per TiB of data stored in the file system. Amazon EFS uses a credit system to determine when file systems can burst, and the size of the file system determines the portion of time it can spend bursting. For more information about how the credit system works, see Amazon EFS Performance in the Amazon EFS User Guide.
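The burst rule and the per-client limit described above can be combined into a quick estimate. The following is a minimal sketch, not part of the solution: the function names are hypothetical, and it assumes that the 250 MB/s client limit is expressed in decimal megabytes.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4


def burst_throughput_mib_s(stored_bytes: int) -> float:
    """Burst throughput in MiB/s for a file system holding stored_bytes."""
    stored_tib = stored_bytes / TIB
    # Every file system can burst to 100 MiB/s; above 1 TiB the burst
    # rate scales at 100 MiB/s per TiB of stored data.
    return max(100.0, 100.0 * stored_tib)


def single_client_throughput_mib_s(stored_bytes: int,
                                   client_limit_mb_s: float = 250.0) -> float:
    """Burst throughput capped by the per-NFS-client limit."""
    # Assumption: the 250 MB/s limit uses decimal megabytes, so convert to MiB/s.
    client_limit_mib_s = client_limit_mb_s * 1_000_000 / MIB
    return min(burst_throughput_mib_s(stored_bytes), client_limit_mib_s)
```

For example, a 2 TiB file system can burst to 200 MiB/s, which a single client can still drive. A 10 TiB file system can burst to 1,000 MiB/s, but the single backup instance is capped at roughly 238 MiB/s.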

Large file systems might cause this solution to hit the Amazon EC2 instance throughput limit. Customers who want to back up large file systems can launch multiple deployments of the solution with different source prefixes that point to different locations in the source file system. For example, a customer with a large Amazon EFS file system that contains a home directory and an appdata directory can deploy two solution stacks, one that points to the home directory (<efs-mount-point>:/home) and one that points to the appdata directory (<efs-mount-point>:/appdata). Customers can also launch multiple deployments of the solution to back up multiple Amazon EFS file systems in an AWS Region.
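As an illustration of the source prefixes described above, the following sketch builds one source value per deployment. The helper name is hypothetical, and `<efs-mount-point>` is left as the same placeholder used in the text.

```python
from typing import Iterable, List


def source_prefixes(mount_point: str, directories: Iterable[str]) -> List[str]:
    """Return one '<mount-point>:/<directory>' source value per directory."""
    # Strip surrounding slashes so "home", "/home", and "/home/" all give ":/home".
    return [f"{mount_point}:/{directory.strip('/')}" for directory in directories]


# One stack per directory, matching the home/appdata example in the text.
for source in source_prefixes("<efs-mount-point>", ["home", "appdata"]):
    print(source)
```

Each printed value would be supplied as the source of a separate deployment of the solution.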

We recommend that you use a rigorous performance testing and optimization process to choose the right instance type for your use case. For more information, see Appendix A.

Custom Sizing

The EFS-to-EFS backup solution offers three preset Amazon EC2 instance sizes to support your anticipated throughput needs:

  • Small: c4.large instance type with approximately 70 MiB/s throughput

  • Medium: c4.xlarge instance type with approximately 100 MiB/s throughput

  • Large: r4.xlarge instance type with approximately 130 MiB/s throughput

For more information, see Appendix A.
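The preset throughput figures can be used for a rough estimate of the backup window. The following is a back-of-the-envelope sketch only: the names are hypothetical, the throughput values are the approximate figures listed above, and the estimate ignores per-file overhead and burst credit depletion, so real backups can take longer.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Approximate throughput (MiB/s) of each preset, taken from the list above.
PRESET_THROUGHPUT_MIB_S = {"small": 70, "medium": 100, "large": 130}


def estimated_backup_minutes(bytes_to_copy: int, preset: str) -> float:
    """Estimate the minutes needed to copy bytes_to_copy with a given preset."""
    throughput_mib_s = PRESET_THROUGHPUT_MIB_S[preset]
    seconds = (bytes_to_copy / MIB) / throughput_mib_s
    return seconds / 60
```

For example, `estimated_backup_minutes(100 * GIB, "medium")` evaluates to about 17 minutes, which gives a lower bound when choosing a backup window for the first full backup.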

Burst Credits

This solution will consume burst credits while creating backups, which could impact your production workload. We recommend that you verify that you have sufficient burst credits available before you start the backup process. You can also change the solution’s default Amazon EC2 instance type to change how the solution consumes burst credits. For more information, see Appendix A.
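One way to verify the available burst credits before a backup is to read the BurstCreditBalance metric that Amazon EFS publishes to Amazon CloudWatch. The following is a minimal sketch, assuming that boto3 is installed and AWS credentials are configured. The function names, the 15-minute lookback, and the required_bytes threshold are illustrative assumptions, not part of the solution.

```python
from datetime import datetime, timedelta, timezone


def burst_credit_query(file_system_id, now=None, lookback_minutes=15):
    """Build the CloudWatch query for the file system's BurstCreditBalance."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EFS",
        "MetricName": "BurstCreditBalance",
        "Dimensions": [{"Name": "FileSystemId", "Value": file_system_id}],
        "StartTime": now - timedelta(minutes=lookback_minutes),
        "EndTime": now,
        "Period": 300,
        "Statistics": ["Minimum"],
    }


def has_enough_credits(datapoints, required_bytes):
    """Return True if the most recent datapoint meets the threshold."""
    if not datapoints:
        return False  # No data: treat as insufficient rather than guess.
    latest = max(datapoints, key=lambda point: point["Timestamp"])
    return latest["Minimum"] >= required_bytes


def check_burst_credits(file_system_id, required_bytes):
    """Query CloudWatch and compare the balance with required_bytes."""
    import boto3  # Imported here so the helpers above work without boto3.

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**burst_credit_query(file_system_id))
    return has_enough_credits(response["Datapoints"], required_bytes)
```

The threshold is a judgment call. A reasonable starting point is the amount of data you expect the backup to read from the source file system.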

Granular Backups

The EFS-to-EFS backup solution enables customers to specify any valid directory from their source file system. To back up part of an Amazon EFS file system, customers can specify the applicable subdirectory as the source, and the solution will copy only files and directories from that subdirectory. Continuing from the previous example (see Sizing and Capacity), a customer can specify the appdata directory as the source, and the solution copies only the appdata directory to the backup file system.

Encryption

By default, this solution leverages Amazon EFS encryption so you can encrypt your backups at rest. Amazon EFS integrates with AWS Key Management Service (AWS KMS) customer master keys (CMKs), and uses an industry-standard AES-256 encryption algorithm to encrypt EFS data and metadata. For more information, see Cryptography Basics in the AWS Key Management Service Developer Guide.

Visualization

The EFS-to-EFS backup solution includes optional dashboards that allow you to visualize Amazon EFS I/O data during the backup and restore processes for both the source and backup Amazon EFS file systems. To view the dashboards, open the Amazon CloudWatch console and select Dashboards.

Regional Deployment

This solution uses the Amazon EFS service, which is currently available in specific AWS Regions only. Therefore, you must launch this solution in an AWS Region where Amazon EFS is available. (For the most current Amazon EFS availability by region, see https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/.) Also, you must deploy this solution in the same AWS Region as your source Amazon EFS file system. You can launch multiple deployments of the solution in a single AWS Region to back up multiple Amazon EFS file systems in that region.