Availability and durability - FSx for OpenZFS

FSx for OpenZFS supports two file system deployment types, Single-AZ and Multi-AZ, that offer different levels of availability and durability. The following sections provide information to help you choose the right deployment type for your workloads. For information about the service's availability Service Level Agreement (SLA), see Amazon FSx Service Level Agreement.

Choosing between Single-AZ and Multi-AZ deployment types

Single-AZ file systems are composed of a single file server instance and a set of storage volumes within a single Availability Zone (AZ). Amazon FSx continuously monitors for hardware failures and automatically recovers from failure events by replacing the failed infrastructure component. Single-AZ file systems are offline, typically for less than 20 minutes, during these failure recovery events and during the planned file system maintenance window that you configure for your file system. In rare cases, such as when multiple components fail, a Single-AZ file system failure may be unrecoverable. In these cases, you can recover your file system from the most recent backup.

Multi-AZ file systems are composed of a high-availability (HA) pair of file servers spread across two Availability Zones (a preferred AZ and a standby AZ) and a set of storage volumes on each of the two Availability Zones. Data is replicated synchronously as it is written within each individual Availability Zone and between the two Availability Zones. Relative to Single-AZ deployment, Multi-AZ deployments provide enhanced durability by further replicating data across Availability Zones, and enhanced availability by automatically failing over to the standby AZ during planned system maintenance, and in cases of unplanned service disruption. This allows you to continue accessing your data, and helps to protect your data against instance failure and AZ disruption.

We recommend using Multi-AZ file systems for most production workloads, given the high availability and durability model they provide. Single-AZ deployment is designed as a cost-efficient solution for test and development workloads, production workloads that don't require additional storage-level redundancy, and production workloads that have relaxed availability and Recovery Point Objective (RPO) needs. Workloads with relaxed availability and RPO needs can tolerate temporary loss of availability for up to 20 minutes in the event of planned file system maintenance or unplanned service disruption and, in rare cases, the loss of data updates since the most recent backup.
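The guidance above can be summarized as a small decision helper. This is only an illustrative sketch: the function name and its parameters are hypothetical, not part of any AWS API, and it reduces the recommendation to three yes/no questions.

```python
def recommend_deployment_type(
    is_production: bool,
    tolerates_20_min_outage: bool,
    tolerates_loss_since_last_backup: bool,
) -> str:
    """Return "Single-AZ" or "Multi-AZ" based on the guidance in this section.

    Hypothetical helper for illustration only; not an AWS API.
    """
    # Test and development workloads: Single-AZ is the cost-efficient choice.
    if not is_production:
        return "Single-AZ"
    # Production workloads with relaxed availability and RPO needs can
    # tolerate up to about 20 minutes offline and, in rare cases, the loss
    # of updates made since the most recent backup.
    if tolerates_20_min_outage and tolerates_loss_since_last_backup:
        return "Single-AZ"
    # Otherwise, Multi-AZ is the recommended default for production.
    return "Multi-AZ"
```

For example, a production workload that can tolerate a brief outage but not the loss of recent updates would be steered to Multi-AZ.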

Deployment type availability

Amazon FSx for OpenZFS is available in the following AWS Regions, depending on your deployment type:

Table columns: AWS Region; deployment type (Single-AZ 1, Single-AZ 2, Multi-AZ)

US East (N. Virginia)*
US East (Ohio)
US West (N. California)
US West (Oregon)*
AWS GovCloud (US-West)
AWS GovCloud (US-East)
Asia Pacific (Hong Kong)
Asia Pacific (Tokyo)
Asia Pacific (Seoul)
Asia Pacific (Osaka)
Asia Pacific (Singapore)*
Asia Pacific (Sydney)
Asia Pacific (Jakarta)
Asia Pacific (Mumbai)
Asia Pacific (Hyderabad)
Canada (Central)*
Europe (Milan)
Europe (Spain)
Europe (Frankfurt)
Europe (Zurich)
Europe (Ireland)
Europe (London)
Europe (Paris)
Europe (Stockholm)
Middle East (UAE)
Middle East (Bahrain)
South America (São Paulo)
Israel (Tel Aviv)
Africa (Cape Town)
Note

*Due to differences in infrastructure capabilities and configurations, your deployment type may be unavailable in specific AZs within the Regions marked with an asterisk.

Failover process for FSx for OpenZFS

Multi-AZ file systems automatically fail over from the preferred file server to the standby file server under the following conditions:

  • The preferred file server becomes unavailable.

  • The file system's throughput capacity is changed.

  • The preferred file server undergoes planned maintenance.

  • An Availability Zone disruption occurs.

When failing over from one file server to another, the new active file server automatically begins serving all file system read and write requests. A failover typically takes less than 60 seconds from the detection of the failure on the active file server to the promotion of the standby file server to active status. Upon completion of the failover, you continue to have access to your data without manual intervention. When the preferred file server is fully recovered and becomes available, Amazon FSx automatically fails back to it; failback also usually takes less than 60 seconds.
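Applications that surface I/O errors during the short failover window may benefit from a bounded retry. The following is a minimal sketch, not an AWS-provided mechanism. The helper name, the 120-second budget, and the 5-second delay are assumptions chosen to comfortably cover a failover that typically completes in under 60 seconds.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_during_failover(
    operation: Callable[[], T],
    max_wait_seconds: float = 120.0,  # assumed budget, not an AWS value
    delay_seconds: float = 5.0,       # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Retry `operation` on OSError until it succeeds or the budget runs out."""
    deadline = clock() + max_wait_seconds
    while True:
        try:
            return operation()
        except OSError:
            # Give up if another wait would exceed the time budget.
            if clock() + delay_seconds > deadline:
                raise
            sleep(delay_seconds)
```

The `sleep` and `clock` parameters exist only so that the helper can be exercised without real delays. In normal use you would pass just the operation, for example a function that reads a file on the mounted file system.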

Testing failover on a Multi-AZ file system

You can test failover on your Multi-AZ file system by modifying its throughput capacity. When you modify your file system's throughput capacity, Amazon FSx replaces the file system's file servers one at a time. The file system first fails over to the standby file server while Amazon FSx replaces the preferred file server. After that replacement completes, the file system automatically fails back to the new preferred file server, and Amazon FSx then replaces the standby file server.

You can monitor the progress of the throughput capacity update request in the Amazon FSx console, the CLI, and the API. For more information about modifying your file system's throughput capacity and monitoring the progress of the request, see Managing throughput capacity.
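As a rough sketch of the API route, the following helpers request a throughput capacity change and then read back the status of the resulting update action. They assume a boto3 Amazon FSx client is passed in as `fsx_client`. The helper names are hypothetical, and the new throughput value must be one that is valid for your deployment type.

```python
from typing import Any, List


def start_failover_test(
    fsx_client: Any, file_system_id: str, new_throughput_mbps: int
) -> None:
    """Request a throughput capacity change on the given file system."""
    fsx_client.update_file_system(
        FileSystemId=file_system_id,
        OpenZFSConfiguration={"ThroughputCapacity": new_throughput_mbps},
    )


def throughput_update_statuses(fsx_client: Any, file_system_id: str) -> List[str]:
    """Return the Status of each FILE_SYSTEM_UPDATE administrative action."""
    response = fsx_client.describe_file_systems(FileSystemIds=[file_system_id])
    actions = response["FileSystems"][0].get("AdministrativeActions", [])
    return [
        action["Status"]
        for action in actions
        if action.get("AdministrativeActionType") == "FILE_SYSTEM_UPDATE"
    ]
```

With real credentials, you might create the client with `boto3.client("fsx")`, call `start_failover_test(client, "fs-0123456789abcdef0", 320)` (both the ID and the value are placeholders), and then poll `throughput_update_statuses` until no action reports `PENDING` or `IN_PROGRESS`.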

Working with file system resources

Subnets

When you create a VPC, it spans all the Availability Zones (AZs) in the Region. AZs are distinct locations that are engineered to be isolated from failures in other AZs. After creating a VPC, you can add one or more subnets in each AZ. The default VPC has a subnet in each AZ. Each subnet must reside entirely within one AZ and cannot span zones. When you create a Single-AZ Amazon FSx file system, you specify a single subnet for the file system. The subnet you choose defines the AZ in which the file system is created.

When you create a Multi-AZ file system, you specify two subnets: one for the preferred file server and one for the standby file server. The two subnets you choose must be in different Availability Zones within the same AWS Region. For more information about Amazon VPC, see What is Amazon VPC? in the Amazon VPC User Guide.

For applications running in AWS, we recommend that you launch your clients in the same Availability Zone as your preferred file server to minimize latency.
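The subnet rules for Multi-AZ file systems can be checked before you send a create request. The sketch below builds a parameter dictionary in the shape that the Amazon FSx `CreateFileSystem` API accepts for OpenZFS. The helper name and the `subnet_to_az` mapping are assumptions (you would typically obtain the mapping from Amazon EC2), and a real request may need additional parameters.

```python
from typing import Any, Dict


def multi_az_create_params(
    preferred_subnet_id: str,
    standby_subnet_id: str,
    subnet_to_az: Dict[str, str],  # hypothetical lookup: subnet ID -> AZ name
    storage_capacity_gib: int,
    throughput_capacity_mbps: int,
) -> Dict[str, Any]:
    """Build CreateFileSystem parameters for a Multi-AZ OpenZFS file system."""
    # The two subnets must be in different Availability Zones.
    if subnet_to_az[preferred_subnet_id] == subnet_to_az[standby_subnet_id]:
        raise ValueError("Preferred and standby subnets must be in different AZs")
    return {
        "FileSystemType": "OPENZFS",
        "StorageCapacity": storage_capacity_gib,
        "SubnetIds": [preferred_subnet_id, standby_subnet_id],
        "OpenZFSConfiguration": {
            "DeploymentType": "MULTI_AZ_1",
            "ThroughputCapacity": throughput_capacity_mbps,
            # The preferred subnet determines where the preferred file server runs.
            "PreferredSubnetId": preferred_subnet_id,
        },
    }
```

The resulting dictionary could be passed to a boto3 Amazon FSx client as `client.create_file_system(**params)`. Validating the AZs locally simply gives an earlier, clearer error than waiting for the service to reject the request.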

File system elastic network interfaces

For Single-AZ file systems, Amazon FSx provisions one elastic network interface (ENI) in the subnet that you associate with your file system. For Multi-AZ file systems, Amazon FSx provisions two ENIs, one in each of the subnets that you associate with your file system. Clients communicate with your Amazon FSx file system using the elastic network interface that's attached to the file server that serves the data. Network interfaces are considered to be within the service scope of Amazon FSx, despite being part of your account's VPC.

Warning

You must not modify or delete the elastic network interfaces associated with your file system. Modifying or deleting the network interface can cause a permanent loss of connection between your VPC and your file system.

The following table summarizes the subnet, elastic network interface, and IP address resources for FSx for OpenZFS file system deployment types:

File system deployment type | Number of subnets | Number of elastic network interfaces | Number of IP addresses
Multi-AZ | 2 | 2 | 3
Single-AZ | 1 | 1 | 1

Once a file system is created, its IP addresses don't change until the file system is deleted. For Multi-AZ file systems, the number of IP addresses includes a floating IP address, which allows connected clients to transition between the preferred and standby file servers during a failover event. For more information, see Accessing data.

Important

Amazon FSx doesn't support accessing file systems from, or exposing file systems to, the public Internet. If an Elastic IP address, which is a public IP address reachable from the Internet, is attached to a file system's elastic network interface, Amazon FSx automatically detaches it.

Backups

FSx for OpenZFS offers a native backup feature that's designed to support archival, data retention, and compliance needs. A backup is a secondary, offline copy of your file system. Amazon FSx backups are crash-consistent and incremental, which means that only the changes made since your most recent backup are saved. This reduces backup storage costs by not duplicating data. By default, Amazon FSx takes an automatic daily backup of your file system during a backup window that you specify. You can create additional backups at any time using the AWS Management Console, AWS Command Line Interface, or Amazon FSx API.
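As one possible way to create an additional backup through the API, the sketch below wraps the Amazon FSx `CreateBackup` operation. It assumes a boto3 Amazon FSx client is passed in as `fsx_client`. The helper name and the use of a `Name` tag are illustrative choices, not requirements.

```python
from typing import Any


def create_user_backup(fsx_client: Any, file_system_id: str, name: str) -> str:
    """Start a user-initiated backup and return its backup ID."""
    response = fsx_client.create_backup(
        FileSystemId=file_system_id,
        # Tagging the backup with a name is optional; shown for illustration.
        Tags=[{"Key": "Name", "Value": name}],
    )
    return response["Backup"]["BackupId"]
```

With real credentials, you might call `create_user_backup(boto3.client("fsx"), "fs-0123456789abcdef0", "pre-upgrade")`, where the file system ID and the name are placeholders. The call returns as soon as the backup is requested, so the backup may still be in progress when you receive the ID.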