Performance and optimization - AWS Storage Gateway

This section describes guidance and best practices for optimizing File Gateway performance.

Basic performance guidance for S3 File Gateway

In this section, you can find guidance for provisioning hardware for your S3 File Gateway VM. The instance configurations that are listed in the table are examples, and are provided for reference.

For best performance, the cache disk size must be tuned to the size of the active working set. Using multiple local disks for the cache increases write performance by parallelizing access to data and leads to higher IOPS.

Note

We don't recommend using ephemeral storage. For information about using ephemeral storage, see Using ephemeral storage with EC2 gateways.

For Amazon EC2 instances, if you have more than 5 million objects in your S3 bucket and you are using a General Purpose SSD volume, a minimum root EBS volume of 350 GiB is needed for acceptable performance of your gateway during startup. For information about how to increase your volume size, see Modifying an EBS volume using elastic volumes (console).

The suggested size limit for individual directories in the file shares that you connect to File Gateway is 10,000 files per directory. You can use File Gateway with directories that have more than 10,000 files, but performance might be impacted.
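To check an existing directory tree against the suggested limit, a minimal sketch such as the following can help. It assumes the file share is mounted on the client at a local path; the function name and the `limit` parameter are illustrative and not part of any Storage Gateway tooling.

```python
import os
from typing import Dict

# Suggested per-directory file limit taken from the guidance above.
SUGGESTED_MAX_FILES_PER_DIR = 10_000


def find_large_directories(root: str, limit: int = SUGGESTED_MAX_FILES_PER_DIR) -> Dict[str, int]:
    """Return {directory path: file count} for directories holding more than `limit` files."""
    oversized = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        # Count only files directly inside this directory, not in its subdirectories.
        if len(filenames) > limit:
            oversized[dirpath] = len(filenames)
    return oversized
```

Walking a large share generates metadata operations against the gateway, so a scan like this is best run outside peak hours.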

In the following tables, cache hit read operations are reads from the file shares that are served from cache. Cache miss read operations are reads from the file shares that are served from Amazon S3.

The following tables show example S3 File Gateway configurations.

S3 File Gateway performance on Linux clients

Example configuration:
Root disk: 80 GB, io1 SSD, 4,000 IOPS
Cache disk: 512 GiB cache, io1, 1,500 provisioned IOPS
Minimum network performance: 10 Gbps
CPU: 16 vCPU | RAM: 32 GB
NFS protocol recommended for Linux

Protocol | Write throughput (file sizes 1 GB) | Cache hit read throughput | Cache miss read throughput
NFSv3 - 1 thread | 110 MiB/sec (0.92 Gbps) | 590 MiB/sec (4.9 Gbps) | 310 MiB/sec (2.6 Gbps)
NFSv3 - 8 threads | 160 MiB/sec (1.3 Gbps) | 590 MiB/sec (4.9 Gbps) | 335 MiB/sec (2.8 Gbps)
NFSv4 - 1 thread | 130 MiB/sec (1.1 Gbps) | 590 MiB/sec (4.9 Gbps) | 295 MiB/sec (2.5 Gbps)
NFSv4 - 8 threads | 160 MiB/sec (1.3 Gbps) | 590 MiB/sec (4.9 Gbps) | 335 MiB/sec (2.8 Gbps)
SMBv3 - 1 thread | 115 MiB/sec (1.0 Gbps) | 325 MiB/sec (2.7 Gbps) | 255 MiB/sec (2.1 Gbps)
SMBv3 - 8 threads | 190 MiB/sec (1.6 Gbps) | 590 MiB/sec (4.9 Gbps) | 335 MiB/sec (2.8 Gbps)

Example configuration:
Storage Gateway Hardware Appliance
Minimum network performance: 10 Gbps

Protocol | Write throughput (file sizes 1 GB) | Cache hit read throughput | Cache miss read throughput
NFSv3 - 1 thread | 265 MiB/sec (2.2 Gbps) | 590 MiB/sec (4.9 Gbps) | 310 MiB/sec (2.6 Gbps)
NFSv3 - 8 threads | 385 MiB/sec (3.1 Gbps) | 590 MiB/sec (4.9 Gbps) | 335 MiB/sec (2.8 Gbps)
NFSv4 - 1 thread | 310 MiB/sec (2.6 Gbps) | 590 MiB/sec (4.9 Gbps) | 295 MiB/sec (2.5 Gbps)
NFSv4 - 8 threads | 385 MiB/sec (3.1 Gbps) | 590 MiB/sec (4.9 Gbps) | 335 MiB/sec (2.8 Gbps)
SMBv3 - 1 thread | 275 MiB/sec (2.4 Gbps) | 325 MiB/sec (2.7 Gbps) | 255 MiB/sec (2.1 Gbps)
SMBv3 - 8 threads | 455 MiB/sec (3.8 Gbps) | 590 MiB/sec (4.9 Gbps) | 335 MiB/sec (2.8 Gbps)

Example configuration:
Root disk: 80 GB, io1 SSD, 4,000 IOPS
Cache disk: 4 x 2 TB NVMe cache disks
Minimum network performance: 10 Gbps
CPU: 32 vCPU | RAM: 244 GB
NFS protocol recommended for Linux

Protocol | Write throughput (file sizes 1 GB) | Cache hit read throughput | Cache miss read throughput
NFSv3 - 1 thread | 300 MiB/sec (2.5 Gbps) | 590 MiB/sec (4.9 Gbps) | 325 MiB/sec (2.7 Gbps)
NFSv3 - 8 threads | 585 MiB/sec (4.9 Gbps) | 590 MiB/sec (4.9 Gbps) | 580 MiB/sec (4.8 Gbps)
NFSv4 - 1 thread | 355 MiB/sec (3.0 Gbps) | 590 MiB/sec (4.9 Gbps) | 340 MiB/sec (2.9 Gbps)
NFSv4 - 8 threads | 575 MiB/sec (4.8 Gbps) | 590 MiB/sec (4.9 Gbps) | 575 MiB/sec (4.8 Gbps)
SMBv3 - 1 thread | 230 MiB/sec (1.9 Gbps) | 325 MiB/sec (2.7 Gbps) | 245 MiB/sec (2.0 Gbps)
SMBv3 - 8 threads | 585 MiB/sec (4.9 Gbps) | 590 MiB/sec (4.9 Gbps) | 580 MiB/sec (4.8 Gbps)

S3 File Gateway performance on Windows clients

Example configuration:
Root disk: 80 GB, io1 SSD, 4,000 IOPS
Cache disk: 512 GiB cache, io1, 1,500 provisioned IOPS
Minimum network performance: 10 Gbps
CPU: 16 vCPU | RAM: 32 GB
SMB protocol recommended for Windows

Protocol | Write throughput (file sizes 1 GB) | Cache hit read throughput | Cache miss read throughput
SMBv3 - 1 thread | 150 MiB/sec (1.3 Gbps) | 180 MiB/sec (1.5 Gbps) | 20 MiB/sec (0.2 Gbps)
SMBv3 - 8 threads | 190 MiB/sec (1.6 Gbps) | 335 MiB/sec (2.8 Gbps) | 195 MiB/sec (1.6 Gbps)
NFSv3 - 1 thread | 95 MiB/sec (0.8 Gbps) | 130 MiB/sec (1.1 Gbps) | 20 MiB/sec (0.2 Gbps)
NFSv3 - 8 threads | 190 MiB/sec (1.6 Gbps) | 330 MiB/sec (2.8 Gbps) | 190 MiB/sec (1.6 Gbps)

Example configuration:
Storage Gateway Hardware Appliance
Minimum network performance: 10 Gbps

Protocol | Write throughput (file sizes 1 GB) | Cache hit read throughput | Cache miss read throughput
SMBv3 - 1 thread | 230 MiB/sec (1.9 Gbps) | 255 MiB/sec (2.1 Gbps) | 20 MiB/sec (0.2 Gbps)
SMBv3 - 8 threads | 835 MiB/sec (7.0 Gbps) | 475 MiB/sec (4.0 Gbps) | 195 MiB/sec (1.6 Gbps)
NFSv3 - 1 thread | 135 MiB/sec (1.1 Gbps) | 185 MiB/sec (1.6 Gbps) | 20 MiB/sec (0.2 Gbps)
NFSv3 - 8 threads | 545 MiB/sec (4.6 Gbps) | 470 MiB/sec (4.0 Gbps) | 190 MiB/sec (1.6 Gbps)

Example configuration:
Root disk: 80 GB, io1 SSD, 4,000 IOPS
Cache disk: 4 x 2 TB NVMe cache disks
Minimum network performance: 10 Gbps
CPU: 32 vCPU | RAM: 244 GB
SMB protocol recommended for Windows

Protocol | Write throughput (file sizes 1 GB) | Cache hit read throughput | Cache miss read throughput
SMBv3 - 1 thread | 230 MiB/sec (1.9 Gbps) | 265 MiB/sec (2.2 Gbps) | 30 MiB/sec (0.3 Gbps)
SMBv3 - 8 threads | 835 MiB/sec (7.0 Gbps) | 780 MiB/sec (6.5 Gbps) | 250 MiB/sec (2.1 Gbps)
NFSv3 - 1 thread | 135 MiB/sec (1.1 Gbps) | 220 MiB/sec (1.8 Gbps) | 30 MiB/sec (0.3 Gbps)
NFSv3 - 8 threads | 545 MiB/sec (4.6 Gbps) | 570 MiB/sec (4.8 Gbps) | 240 MiB/sec (2.0 Gbps)

Note

Your performance might vary based on your host platform configuration and network bandwidth. Write throughput decreases as file size decreases. For small files (less than 32 MiB), the highest achievable write rate is 16 files per second.
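The arithmetic behind these figures can be sketched in a few lines of Python. The function names are hypothetical, and the conversion assumes 1 MiB = 1,048,576 bytes and 1 Gbps = 10^9 bits per second. The Gbps values in the tables are rounded, so a recomputed value can differ slightly in the last digit.

```python
MIB = 1024 ** 2  # bytes in one mebibyte


def mib_per_sec_to_gbps(mib_per_sec: float) -> float:
    """Convert MiB/sec to gigabits per second (1 Gbps = 10**9 bits/sec)."""
    return mib_per_sec * MIB * 8 / 1e9


def small_file_write_ceiling_mib_per_sec(file_size_mib: float, files_per_sec: float = 16.0) -> float:
    """Upper bound on write throughput when the file rate, not bandwidth, is the limit.

    The default of 16 files per second comes from the note above and applies
    to files smaller than 32 MiB.
    """
    return file_size_mib * files_per_sec
```

For example, a workload that writes 1 MiB files cannot exceed about 16 MiB/sec (roughly 0.13 Gbps) under this ceiling, regardless of how much network bandwidth is available.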

Performance guidance for gateways with multiple file shares

Amazon S3 File Gateway supports attaching up to 50 file shares to a single Storage Gateway appliance. By adding multiple file shares per gateway, you can support more users and workloads while managing fewer gateways and virtual hardware resources. In addition to other factors, the number of file shares managed by a gateway can affect that gateway's performance. This section describes how gateway performance is expected to change depending on the number of attached file shares and recommends virtual hardware configurations to optimize performance for gateways that manage multiple shares.

In general, increasing the number of file shares managed by a single Storage Gateway can have the following consequences:

  • Increased time required to restart the gateway.

  • Increased utilization of virtual hardware resources such as vCPU and RAM.

  • Decreased performance for data and metadata operations if virtual hardware resources become saturated.

The following table lists recommended virtual hardware configurations for gateways that manage multiple file shares:

File shares per gateway | Recommended gateway capacity setting | Recommended vCPU cores | Recommended RAM | Recommended disk size
1-10 | Small | 4 (EC2 instance type m4.xlarge or greater) | 16 GiB | 80 GiB
10-20 | Medium | 8 (EC2 instance type m4.2xlarge or greater) | 32 GiB | 160 GiB
20+ | Large | 16 (EC2 instance type m4.4xlarge or greater) | 64 GiB | 240 GiB
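The table can be expressed as a small lookup, shown here as a sketch. The type and function names are hypothetical. Because the ranges in the table overlap at 10 and 20 file shares, this sketch assumes those boundary counts fall into the smaller tier; the table itself does not say which tier applies.

```python
from typing import NamedTuple


class GatewaySizing(NamedTuple):
    capacity_setting: str
    vcpu_cores: int
    ram_gib: int
    disk_gib: int


def recommended_sizing(file_shares: int) -> GatewaySizing:
    """Map a file share count to the recommended configuration from the table."""
    # A single gateway supports up to 50 file shares.
    if not 1 <= file_shares <= 50:
        raise ValueError("a gateway supports 1 to 50 file shares")
    if file_shares <= 10:  # assumption: exactly 10 shares maps to Small
        return GatewaySizing("Small", 4, 16, 80)
    if file_shares <= 20:  # assumption: exactly 20 shares maps to Medium
        return GatewaySizing("Medium", 8, 32, 160)
    return GatewaySizing("Large", 16, 64, 240)
```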

In addition to the virtual hardware configurations recommended above, we recommend the following best practices for configuring and maintaining Storage Gateway appliances that manage multiple file shares:

  • Consider that the relationship between the number of file shares and the demand placed on the gateway's virtual hardware is not necessarily linear. Some file shares might generate more throughput, and therefore more hardware demand, than others. The recommendations in the preceding table are based on maximum hardware capacities and various file share throughput levels.

  • If you find that adding multiple file shares to a single gateway reduces performance, consider moving the most active file shares to other gateways. In particular, if a file share is used for a very-high-throughput application, consider creating a separate gateway for that file share.

  • We do not recommend configuring one gateway for multiple high-throughput applications and another for multiple low-throughput applications. Instead, try to spread high-throughput and low-throughput file shares evenly across gateways to balance hardware saturation. To measure your file share throughput, use the ReadBytes and WriteBytes metrics. For more information, see Understanding file share metrics.
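One way to plan an even spread is a greedy assignment: place the busiest file shares first, each on whichever gateway currently carries the least total throughput. The following is a minimal sketch of that heuristic, not a Storage Gateway feature. The function name, the share names, and the throughput figures (for example, average ReadBytes plus WriteBytes per second that you have measured yourself) are all hypothetical inputs, and the sketch does not enforce the 50-share limit per gateway.

```python
import heapq
from typing import Dict, List


def spread_shares(share_throughput: Dict[str, float], gateway_count: int) -> List[List[str]]:
    """Assign shares to gateways so that total throughput per gateway stays roughly even."""
    if gateway_count < 1:
        raise ValueError("gateway_count must be at least 1")
    # Min-heap of (total throughput assigned so far, gateway index).
    heap = [(0.0, i) for i in range(gateway_count)]
    assignment: List[List[str]] = [[] for _ in range(gateway_count)]
    # Place the busiest shares first, each on the currently least-loaded gateway.
    busiest_first = sorted(share_throughput.items(), key=lambda kv: kv[1], reverse=True)
    for name, load in busiest_first:
        total, idx = heapq.heappop(heap)
        assignment[idx].append(name)
        heapq.heappush(heap, (total + load, idx))
    return assignment
```

With shares measured at 100, 90, 10, and 5 MiB/sec and two gateways, this pairs the busiest share with the quietest one and the two middle shares together, rather than putting both busy shares on the same gateway.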

Optimizing gateway performance

Following, you can find information about how to optimize the performance of your gateway. The guidance is based on adding resources to your gateway and adding resources to your application server.

Add resources to your gateway

You can optimize gateway performance by adding resources to your gateway in one or more of the following ways.

Use higher-performance disks

To optimize gateway performance, you can add high-performance disks such as solid-state drives (SSDs) and an NVMe controller. You can also attach virtual disks to your VM directly from a storage area network (SAN) instead of the Microsoft Hyper-V NTFS. Improved disk performance generally results in better throughput and more input/output operations per second (IOPS). For information about adding disks, see Configuring additional cache storage.

To measure throughput, use the ReadBytes and WriteBytes metrics with the Samples Amazon CloudWatch statistic. For example, the Samples statistic of the ReadBytes metric over a sample period of 5 minutes divided by 300 seconds gives you the IOPS. As a general rule, when you review these metrics for a gateway, look for low throughput and low IOPS trends to indicate disk-related bottlenecks.
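The IOPS calculation described above is a single division. As a sketch, with a hypothetical function name, and assuming you have already retrieved the Samples value for one 5-minute period from CloudWatch:

```python
def iops_from_samples(sample_count: float, period_seconds: int = 300) -> float:
    """Average I/O operations per second over one CloudWatch period.

    sample_count is the Samples statistic of ReadBytes or WriteBytes
    for the period; 300 seconds corresponds to a 5-minute period.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return sample_count / period_seconds
```

For example, 90,000 samples reported for a 5-minute period corresponds to an average of 300 IOPS.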

Note

CloudWatch metrics are not available for all gateways. For information about gateway metrics, see Monitoring your S3 File Gateway.

Add CPU resources to your gateway host

The minimum requirement for a gateway host server is four virtual processors. To optimize gateway performance, confirm that the four virtual processors that are assigned to the gateway VM are backed by four cores. In addition, confirm that you are not oversubscribing the CPUs of the host server.

When you add additional CPUs to your gateway host server, you increase the processing capability of the gateway. Doing this allows your gateway to deal with, in parallel, both storing data from your application to your local storage and uploading this data to Amazon S3. Additional CPUs also help ensure that your gateway gets enough CPU resources when the host is shared with other VMs. Providing enough CPU resources has the general effect of improving throughput.
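A quick way to reason about oversubscription is to compare the vCPUs assigned to all VMs on a host with the physical cores that the host provides. The sketch below uses a hypothetical function name and made-up VM sizes. It treats a ratio above 1.0 as oversubscribed, which is a simplifying assumption rather than a Storage Gateway rule.

```python
from typing import Sequence


def cpu_oversubscription_ratio(vcpus_per_vm: Sequence[int], physical_cores: int) -> float:
    """Ratio of vCPUs assigned across all VMs to physical cores on the host."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return sum(vcpus_per_vm) / physical_cores
```

For a 16-core host that runs a 4-vCPU gateway VM next to two 8-vCPU VMs, the ratio is 1.25, so the gateway's four virtual processors are not guaranteed four dedicated cores.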

Storage Gateway supports using 24 CPUs in your gateway host server. You can use 24 CPUs to significantly improve the performance of your gateway. We recommend the following gateway configuration for your gateway host server:

  • 24 CPUs.

  • 16 GiB of reserved RAM for File Gateways

    • 16 GiB of reserved RAM for gateways with cache size up to 16 TiB

    • 32 GiB of reserved RAM for gateways with cache size 16 TiB to 32 TiB

    • 48 GiB of reserved RAM for gateways with cache size 32 TiB to 64 TiB

  • Disk 1 attached to paravirtual controller 1, to be used as the gateway cache as follows:

    • SSD using an NVMe controller.

  • Network adapter 1 configured on VM network 1:

    • Use VM network 1 and add a VMXnet3 (10 Gbps) to be used for ingestion.

  • Network adapter 2 configured on VM network 2:

    • Use VM network 2 and add a VMXnet3 (10 Gbps) to be used to connect to AWS.
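The reserved RAM tiers in the list above can be written as a lookup. This is a sketch with a hypothetical function name. The listed ranges share their endpoints (16 TiB and 32 TiB), so the sketch assumes a cache of exactly that size falls into the smaller tier. It rejects sizes above 64 TiB because the list gives no recommendation for them.

```python
def reserved_ram_gib(cache_size_tib: float) -> int:
    """Recommended reserved RAM (GiB) for a given cache size (TiB)."""
    if cache_size_tib <= 0 or cache_size_tib > 64:
        raise ValueError("cache size must be greater than 0 and at most 64 TiB")
    if cache_size_tib <= 16:  # assumption: exactly 16 TiB maps to 16 GiB
        return 16
    if cache_size_tib <= 32:  # assumption: exactly 32 TiB maps to 32 GiB
        return 32
    return 48
```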

Back gateway virtual disks with separate physical disks

When you provision gateway disks, we strongly recommend that you don't provision local disks that share the same underlying physical storage disk. For example, for VMware ESXi, the underlying physical storage resources are represented as a data store. When you deploy the gateway VM, you choose a data store on which to store the VM files. When you provision a virtual disk (for example, as an upload buffer), you can store the virtual disk in the same data store as the VM or a different data store.

If you have more than one data store, then we strongly recommend that you choose one data store for each type of local storage you are creating. A data store that is backed by only one underlying physical disk can lead to poor performance. An example is when you use such a disk to back both the cache storage and upload buffer in a gateway setup. Similarly, a data store that is backed by a lower-performing RAID configuration, such as RAID 1, can lead to poor performance.

Add resources to your application environment

Increase the bandwidth between your application server and your gateway

To optimize gateway performance, ensure that the network bandwidth between your application and the gateway can sustain your application needs. You can use the ReadBytes and WriteBytes metrics of the gateway to measure the total data throughput.

For your application, compare the measured throughput with the desired throughput. If the measured throughput is less than the desired throughput, then increasing the bandwidth between your application and gateway can improve performance if the network is the bottleneck. Similarly, you can increase the bandwidth between your VM and your local disks, if they're not direct-attached.
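The comparison can be sketched as follows. The function names are hypothetical, and the sketch assumes you already have the total bytes read and written over one period (for example, 300 seconds) from the ReadBytes and WriteBytes metrics.

```python
MIB = 1024 ** 2  # bytes in one mebibyte


def measured_throughput_mib_per_sec(read_bytes: int, write_bytes: int, period_seconds: int) -> float:
    """Total data throughput over one period, from ReadBytes and WriteBytes totals."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_bytes + write_bytes) / period_seconds / MIB


def meets_target(measured_mib_per_sec: float, desired_mib_per_sec: float) -> bool:
    """True when measured throughput reaches the throughput the application needs."""
    return measured_mib_per_sec >= desired_mib_per_sec
```

A result of False only shows that there is a gap. Adding bandwidth closes that gap only if the network, rather than disks or CPU, is the bottleneck.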

Add CPU resources to your application environment

If your application can use additional CPU resources, then adding more CPUs can help your application to scale its I/O load.