Best practices for FSx for ONTAP deployments in enterprise environments
This section provides some best practices and considerations for deploying and operating Amazon FSx for NetApp ONTAP in enterprise environments. These recommendations are based on the experiences of AWS Professional Services.
In addition to the recommendations in this guide, adhere to the following best practices:
- Best practices for working with Active Directory (FSx for ONTAP documentation)
- Data protection (FSx for ONTAP documentation)
- Security best practices in IAM (AWS Identity and Access Management (IAM) documentation)
- Best practices and implementation guide for NetApp ONTAP FlexGroup volumes (NetApp documentation)
Best practices for storage tiers and tiering policies
Storage tiers are the physical storage media for an Amazon FSx for NetApp ONTAP file system. The following storage tiers are available:
- The SSD tier is high-performance solid-state drive (SSD) storage designed for active data, and you choose the storage size for this tier.
- The capacity pool tier is fully elastic storage that is cost-optimized for infrequently accessed data. The SSD tier is significantly faster than the capacity pool tier. FSx for ONTAP SSD storage provides sub-millisecond file operational latencies, and the capacity pool tier provides tens of milliseconds of latency.
For more information about these tiers, see FSx for ONTAP storage tiers.
A tiering policy, which you configure at the volume level, determines if and when data that's stored in the SSD tier transitions to the capacity pool tier. FSx for ONTAP offers four different tiering policies: Snapshot Only, Auto, All, and None. For more information about each policy, see Tiering policies in the FSx for ONTAP documentation.
Consider the following recommendations when setting tiering policies for the volumes in your file share:
- HPC workloads should access data in the SSD tier to prevent performance bottlenecks. For volumes accessed by HPC workloads, we recommend setting the tiering policy to None or Snapshot Only.
- When migrating data to the file share, we recommend setting the target volume tiering policy to All. This reduces costs because all data migrates to the SSD tier and is then immediately moved to the capacity pool tier. In addition, if 98% or more of the SSD tier capacity is utilized, then writing to the tier is stopped. Setting the tiering policy to All prevents reaching this tiering threshold during the migration. After the migration is complete, you can change the tiering policy to balance performance and costs. For more information, see Migrating file shares to Amazon FSx for NetApp ONTAP using AWS DataSync (AWS blog post).
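The policy switch around a migration can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the helper name `tiering_update_request` and the volume ID are hypothetical, and the sketch only builds the request arguments for the FSx `update_volume` API operation (it does not call AWS).

```python
# Tiering policy names as the FSx API expects them in TieringPolicy.Name.
VALID_POLICIES = {"SNAPSHOT_ONLY", "AUTO", "ALL", "NONE"}


def tiering_update_request(volume_id: str, policy: str) -> dict:
    """Build keyword arguments for an FSx update_volume call that
    changes the tiering policy of an ONTAP volume.

    Accepts the console-style names used in this guide ("Snapshot Only",
    "Auto", "All", "None") and converts them to the API form.
    """
    name = policy.strip().upper().replace(" ", "_")
    if name not in VALID_POLICIES:
        raise ValueError(f"unknown tiering policy: {policy!r}")
    return {
        "VolumeId": volume_id,
        "OntapConfiguration": {"TieringPolicy": {"Name": name}},
    }


# Hypothetical volume ID, for illustration only.
VOLUME_ID = "fsvol-0123456789abcdef0"

# During the migration: move incoming data to the capacity pool tier.
migration_request = tiering_update_request(VOLUME_ID, "All")

# After the migration: keep active data on SSD for an HPC workload.
post_migration_request = tiering_update_request(VOLUME_ID, "Snapshot Only")

print(migration_request)
print(post_migration_request)
```

If boto3 is installed and credentials are configured, each dictionary can be passed to the API as `boto3.client("fsx").update_volume(**migration_request)`. Which policy to apply after the migration depends on the workload, as described in the recommendations above.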
Best practices for using the NetApp ONTAP maximum directory size
In NetApp ONTAP, the maximum size of a directory is controlled by the maxdirsize setting. The default value is 320 MB, which allows you to store up to 4.3 million files in each directory. You can increase the maxdirsize value to support larger directories. After the value has been increased, it cannot be decreased without recreating the directory. Because directories are loaded in memory, there is a tradeoff between the size of the directories and the performance of your file system. You can validate custom settings only through testing. NetApp recommends that you keep this value at its default. For more information, see Best practices and implementation guide for NetApp ONTAP FlexGroup volumes (NetApp documentation).
If you customize the maxdirsize setting, you can use the following formula to estimate how many files can fit into a single directory.
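max number of files in each directory = maxdirsize in KB × 53 × 0.25
The value must be expressed in KB for the result to match the default: 320 MB is 327,680 KB, and 327,680 × 53 × 0.25 ≈ 4.3 million files.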
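As a worked example, the following Python sketch evaluates the formula. It assumes that maxdirsize is expressed in KB, which is the unit that reproduces the 4.3 million figure for the 320 MB default. The function name `max_files_per_directory` is hypothetical, and the result is an estimate rather than a guaranteed limit.

```python
def max_files_per_directory(maxdirsize_kb: int) -> int:
    """Estimate how many files fit in one directory.

    Applies the formula from this section:
    maxdirsize (in KB) x 53 x 0.25
    """
    if maxdirsize_kb <= 0:
        raise ValueError("maxdirsize_kb must be positive")
    return int(maxdirsize_kb * 53 * 0.25)


# Default maxdirsize: 320 MB expressed in KB.
default_kb = 320 * 1024
print(max_files_per_directory(default_kb))  # 4341760, about 4.3 million

# Doubling the setting doubles the estimate.
print(max_files_per_directory(2 * default_kb))  # 8683520
```

Because the estimate scales linearly with maxdirsize, a larger setting allows proportionally more files, at the cost of the memory and performance tradeoff described above.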
Best practices for monitoring FSx for ONTAP file systems
Similar to other AWS services, FSx for ONTAP is integrated with Amazon CloudWatch. CloudWatch helps you monitor the metrics of your AWS resources in near real time. Metrics are available at the file system and volume levels, and detailed monitoring metrics provide more granular reporting for both. For more information, see Monitoring with Amazon CloudWatch in the FSx for ONTAP documentation.
Consider the following recommendations when monitoring FSx for ONTAP by using CloudWatch:
- We recommend that you use the StorageUsed file system metric so that you can filter monitoring results by storage tier.
- Use the StorageCapacity file system metric to configure a CloudWatch alarm that notifies you if more than 80% of the SSD tier capacity is utilized. This ensures that tiering functions properly for the volume, and it helps you maintain capacity for new data. For more information, see Tiering thresholds.
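The threshold logic behind such an alarm can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and the status labels are hypothetical, and the byte values would in practice come from the StorageUsed and StorageCapacity metrics for the SSD tier. The 80% and 98% values are the alarm and tiering thresholds mentioned in this guide.

```python
ALARM_THRESHOLD_PCT = 80.0  # notify when SSD utilization exceeds this value
TIERING_STOP_PCT = 98.0     # at or above this value, writing to the tier stops


def ssd_utilization_pct(used_bytes: int, capacity_bytes: int) -> float:
    """Return SSD tier utilization as a percentage of provisioned capacity."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return 100.0 * used_bytes / capacity_bytes


def classify_utilization(pct: float) -> str:
    """Map a utilization percentage to a hypothetical status label."""
    if pct >= TIERING_STOP_PCT:
        return "critical"
    if pct > ALARM_THRESHOLD_PCT:
        return "alarm"
    return "ok"


# Example: 850 GiB used of a 1,024 GiB SSD tier.
GIB = 1024 ** 3
pct = ssd_utilization_pct(850 * GIB, 1024 * GIB)
print(f"{pct:.1f}% -> {classify_utilization(pct)}")
```

In a CloudWatch alarm, the same comparison would typically be expressed as a threshold on the utilization value rather than in application code. The sketch only shows how the two thresholds relate to each other.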
Best practices for choosing an Availability Zone deployment option
You can deploy Amazon FSx for NetApp ONTAP in a Single-AZ or Multi-AZ configuration. Each option provides different levels of availability and durability. For more information about these deployment options, see Availability and durability in the FSx for ONTAP documentation.
Multi-AZ deploys the FSx for ONTAP file system in an active-passive configuration. Therefore, all servers that connect to the file share use only the endpoint in the primary Availability Zone. The endpoint in the secondary Availability Zone is for failover only, and it is not used to read or write unless the primary Availability Zone fails.
You cannot change the Availability Zone deployment option after you create the FSx for ONTAP file system. To change the Availability Zone configuration, you have to create a new file system and then migrate the data to the new file system.
However, even if you deployed a file share by using the Single-AZ option, you can still access it from other Availability Zones. Your networking configuration, such as security groups and network access control lists (network ACLs), must allow the clients to connect to the file system endpoint. With this approach, there is a charge for cross-AZ traffic in each direction (read and write). For more information, see Amazon FSx for NetApp ONTAP Pricing.
When choosing a deployment option, you must choose between the resiliency of the Multi-AZ configuration and the performance of the Single-AZ configuration. If practical for your use case, we recommend selecting the Multi-AZ option because it provides high availability. However, the Single-AZ option can be more cost-effective and can reduce latency. Consider the HPC workload and whether it can tolerate the additional latency of a Multi-AZ deployment.