Performance - AWS Storage Gateway

Performance

This section describes Storage Gateway performance.

Optimizing Gateway Performance

The following bottlenecks can reduce the performance of your Tape Gateway below the theoretical maximum:

  • CPU core count

  • Disk throughput

  • Total RAM count

  • Network bandwidth to AWS

  • Network bandwidth from initiator to gateway

In this section, you can find information about how to optimize the performance of your gateway. The guidance is based on adding resources to your gateway and adding resources to your application server.

Add Resources to Your Gateway

You can optimize gateway performance by adding resources to your gateway in one or more of the following ways.

Use higher-performance disks

Disk throughput and CPU core count can limit your upload and download performance below the theoretical maximum. If your gateway is exhibiting performance significantly below the theoretical maximum, consider the following:

  • Adding more CPU cores.

  • Improving cache and upload buffer disk throughput. You can use a striped RAID such as RAID 0 (or RAID 5 for data protection) to accomplish this.

    RAID (redundant array of independent disks), or disk striping, is the process of dividing a body of data into blocks and spreading the data blocks across multiple storage devices. The RAID level you use affects the exact speed and fault tolerance you can achieve from RAID.

To optimize gateway performance, you can add high-performance disks such as solid-state drives (SSDs) and an NVMe controller. You can also attach virtual disks to your VM directly from a storage area network (SAN) instead of the Microsoft Hyper-V NTFS. Improved disk performance generally results in better throughput and more input/output operations per second (IOPS).

To measure throughput, use the ReadBytes and WriteBytes metrics with the Samples Amazon CloudWatch statistic. For example, the Samples statistic of the ReadBytes metric over a sample period of 5 minutes divided by 300 seconds gives you the IOPS. As a general rule, when you review these metrics for a gateway, look for low throughput and low IOPS trends to indicate disk-related bottlenecks.

Note

CloudWatch metrics are not available for all gateways. For information about gateway metrics, see Monitoring Storage Gateway.
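The IOPS arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not an official tool: the function names and the sample values are hypothetical. The bytes-per-second calculation from the Sum statistic is an assumption added for comparison and is not stated in this section. In the CloudWatch API, the Samples statistic is requested as SampleCount.

```python
def iops_from_samples(sample_count: float, period_seconds: int = 300) -> float:
    """Average I/O operations per second over one CloudWatch period."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return sample_count / period_seconds


def throughput_from_sum(total_bytes: float, period_seconds: int = 300) -> float:
    """Average bytes per second over one period (assumes the Sum statistic)."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return total_bytes / period_seconds


# Hypothetical values for a single 5-minute ReadBytes datapoint.
samples = 45_000              # Samples statistic: number of read operations
total_bytes = 2_949_120_000   # Sum statistic: bytes read during the period

print(iops_from_samples(samples))        # 150.0
print(throughput_from_sum(total_bytes))  # average bytes per second
```

Tracking both numbers over time makes it easier to spot the low-throughput and low-IOPS trends that point to a disk-related bottleneck.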

Add CPU resources to your gateway host

The minimum requirement for a gateway host server is four virtual processors. To optimize gateway performance, confirm that the four virtual processors that are assigned to the gateway VM are backed by four cores. In addition, confirm that you are not oversubscribing the CPUs of the host server.
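As a rough way to check for oversubscription, you can compare the total number of virtual processors assigned to all VMs on a host with the number of physical cores. The following sketch assumes that a ratio above 1.0 means the host is oversubscribed. The function name and the example host are hypothetical.

```python
def cpu_oversubscription_ratio(vcpus_per_vm: list[int], physical_cores: int) -> float:
    """Ratio of assigned vCPUs to physical cores on one host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# Hypothetical host: 16 physical cores, four VMs including a 4-vCPU gateway.
ratio = cpu_oversubscription_ratio([4, 4, 8, 8], physical_cores=16)
print(ratio)        # 1.5
print(ratio > 1.0)  # True: more vCPUs are assigned than cores exist
```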

When you add additional CPUs to your gateway host server, you increase the processing capability of the gateway. Doing this allows your gateway to deal with, in parallel, both storing data from your application to your local storage and uploading this data to Amazon S3. Additional CPUs also help ensure that your gateway gets enough CPU resources when the host is shared with other VMs. Providing enough CPU resources has the general effect of improving throughput.

Storage Gateway supports using 24 CPUs in your gateway host server. You can use 24 CPUs to significantly improve the performance of your gateway. We recommend the following gateway configuration for your gateway host server:

  • 24 CPUs.

  • For Volume Gateway, your hardware should dedicate the following amounts of RAM:

    • 16 GiB of reserved RAM for gateways with cache size up to 16 TiB

    • 32 GiB of reserved RAM for gateways with cache size 16 TiB to 32 TiB

    • 48 GiB of reserved RAM for gateways with cache size 32 TiB to 64 TiB

    For the gateway to perform at its full potential, you need at least 27 GiB of RAM.

  • Disk 1 attached to paravirtual controller 1, to be used as the gateway cache as follows:

    • SSD using an NVMe controller.

  • Disk 2 attached to paravirtual controller 1, to be used as the gateway upload buffer as follows:

    • SSD using an NVMe controller.

  • Disk 3 attached to paravirtual controller 2, to be used as the gateway upload buffer as follows:

    • SSD using an NVMe controller.

  • Network adapter 1 configured on VM network 1:

    • Use VM network 1 and add a VMXnet3 (10 Gbps) adapter to be used for ingestion.

  • Network adapter 2 configured on VM network 2:

    • Use VM network 2 and add a VMXnet3 (10 Gbps) adapter to be used to connect to AWS.
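The Volume Gateway RAM tiers in the preceding list can be expressed as a simple lookup. This is a minimal sketch: the function name is hypothetical, and treating each boundary value (16, 32, and 64 TiB) as belonging to the lower tier is an assumption, because the list does not say which tier a boundary falls into.

```python
def reserved_ram_gib(cache_tib: float) -> int:
    """Reserved RAM (GiB) for a Volume Gateway with the given cache size (TiB)."""
    if cache_tib <= 0:
        raise ValueError("cache size must be positive")
    if cache_tib <= 16:
        return 16
    if cache_tib <= 32:
        return 32
    if cache_tib <= 64:
        return 48
    raise ValueError("cache sizes above 64 TiB are not covered by the list")


print(reserved_ram_gib(10))  # 16
print(reserved_ram_gib(20))  # 32
print(reserved_ram_gib(40))  # 48
```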

Back gateway virtual disks with separate physical disks

When you provision gateway disks, we strongly recommend that you don't provision local disks for the upload buffer and cache storage that use the same underlying physical storage disk. For example, for VMware ESXi, the underlying physical storage resources are represented as a data store. When you deploy the gateway VM, you choose a data store on which to store the VM files. When you provision a virtual disk (for example, as an upload buffer), you can store the virtual disk in the same data store as the VM or a different data store.

If you have more than one data store, then we strongly recommend that you choose one data store for each type of local storage you are creating. A data store that is backed by only one underlying physical disk can lead to poor performance. An example is when you use such a disk to back both the cache storage and upload buffer in a gateway setup. Similarly, a data store that is backed by a lower-performing RAID configuration such as RAID 1 can lead to poor performance.

Change the volumes configuration

For Volume Gateways, if you find that adding more volumes to a gateway reduces the throughput to the gateway, consider adding the volumes to a separate gateway. In particular, if a volume is used for a high-throughput application, consider creating a separate gateway for the high-throughput application. However, as a general rule, you should not use one gateway for all of your high-throughput applications and another gateway for all of your low-throughput applications. To measure your volume throughput, use the ReadBytes and WriteBytes metrics.

For more information about these metrics, see Measuring Performance Between Your Application and Gateway.

Optimize iSCSI Settings

You can optimize iSCSI settings on your iSCSI initiator to achieve higher I/O performance. We recommend choosing 256 KiB for MaxReceiveDataSegmentLength and FirstBurstLength, and 1 MiB for MaxBurstLength. For more information about configuring iSCSI settings, see Customizing iSCSI Settings.

Note

These recommended settings can enable overall better performance. However, the specific iSCSI settings that are needed to optimize performance vary depending on which backup software you use. For details, see your backup software documentation.
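Initiators commonly expect these values in bytes, so it can help to convert the recommended sizes before you enter them. The following sketch only performs that conversion. The dictionary name is hypothetical, and the exact setting names and units depend on your initiator, so check its documentation before you apply the values.

```python
KIB = 1024
MIB = 1024 * KIB

# Recommended sizes from this section, converted to bytes.
RECOMMENDED_ISCSI_SETTINGS = {
    "MaxReceiveDataSegmentLength": 256 * KIB,
    "FirstBurstLength": 256 * KIB,
    "MaxBurstLength": 1 * MIB,
}

for name, value in RECOMMENDED_ISCSI_SETTINGS.items():
    print(f"{name} = {value}")
# MaxReceiveDataSegmentLength = 262144
# FirstBurstLength = 262144
# MaxBurstLength = 1048576
```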

Use a Larger Block Size for Tape Drives

For a Tape Gateway, the default block size for a tape drive is 64 KB. However, you can increase the block size up to 1 MB to improve I/O performance.

The block size that you choose depends on the maximum block size that your backup software supports. We recommend that you set the block size of the tape drives in your backup software to a size that is as large as possible. However, this block size must not be greater than the 1 MB maximum size that the gateway supports.
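The selection rule above amounts to taking the smaller of two limits. The following sketch assumes that 1 MB corresponds to 1024 KB. The function name and the example limits are hypothetical.

```python
GATEWAY_MAX_BLOCK_KB = 1024  # 1 MB maximum supported by the gateway (assumed 1024 KB)


def choose_block_size_kb(backup_software_max_kb: int) -> int:
    """Largest block size allowed by both the backup software and the gateway."""
    if backup_software_max_kb <= 0:
        raise ValueError("block size must be positive")
    return min(backup_software_max_kb, GATEWAY_MAX_BLOCK_KB)


print(choose_block_size_kb(256))   # 256: limited by the backup software
print(choose_block_size_kb(2048))  # 1024: limited by the gateway
```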

Tape Gateways negotiate the block size for virtual tape drives to automatically match what is set on the backup software. When you increase the block size on the backup software, we recommend that you also check the settings to ensure that the host initiator supports the new block size. For more information, see the documentation for your backup software. For more information about specific gateway performance guidance, see Performance.

Optimize the Performance of Virtual Tape Drives in the Backup Software

Your backup software can back up data on up to 10 virtual tape drives on a Tape Gateway at the same time. We recommend that you configure backup jobs in your backup software to use at least 4 virtual tape drives simultaneously on the Tape Gateway. You can achieve better write throughput when the backup software is backing up data to more than one virtual tape at the same time.

Add Resources to Your Application Environment

Increase the bandwidth between your application server and your gateway

The initiator-to-gateway connection can limit your upload and download performance below the theoretical maximum. If your gateway is exhibiting performance significantly below the theoretical maximum and you have already improved your CPU core count and disk throughput, consider:

  • Upgrading your network cables to have higher bandwidth between your initiator and gateway.

  • Using as many tape drives as possible to see maximum performance. iSCSI has a queue depth of 1, meaning that the more tape drives you use, the more requests your gateway can service concurrently. This allows you to more fully utilize the bandwidth between your gateway and initiator, increasing your throughput.

Network bandwidth to AWS is the theoretical maximum performance of your gateway

  • Your gateway’s sustained write speed will never exceed your upload bandwidth to AWS.

  • Your gateway’s sustained read speed will never exceed your download bandwidth from AWS.

  • You will likely not reach the theoretical maximum performance in practice due to other limiting factors, such as disk throughput, CPU core count, RAM size, and the initiator-to-gateway connection.
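To get a feel for this upper bound, you can convert your link speed to megabytes per second and estimate the shortest possible transfer time. The following sketch uses decimal units (1 GB = 1000 MB) and ignores protocol overhead and the other limiting factors listed above. The function names and the example figures are hypothetical.

```python
def max_sustained_mb_per_s(link_mbps: float) -> float:
    """Upper bound on sustained throughput (MB/s) for a link speed in Mbps."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    return link_mbps / 8  # 8 bits per byte


def min_transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Shortest possible time (hours) to move data_gb over the link."""
    seconds = (data_gb * 1000) / max_sustained_mb_per_s(link_mbps)
    return seconds / 3600


# Hypothetical example: 4,500 GB of backups over a 1 Gbps upload link.
print(max_sustained_mb_per_s(1000))    # 125.0
print(min_transfer_hours(4500, 1000))  # 10.0
```

Real transfers take longer than this estimate, so treat the result as a floor rather than a prediction.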

To optimize gateway performance, ensure that the network bandwidth between your application and the gateway can sustain your application needs. You can use the ReadBytes and WriteBytes metrics of the gateway to measure the total data throughput.

For your application, compare the measured throughput with the desired throughput. If the measured throughput is less than the desired throughput, then increasing the bandwidth between your application and gateway can improve performance if the network is the bottleneck. Similarly, you can increase the bandwidth between your VM and your local disks, if they're not direct-attached.

Add CPU resources to your application environment

If your application can use additional CPU resources, then adding more CPUs can help your application to scale its I/O load.

Using VMware vSphere High Availability with Storage Gateway

Storage Gateway provides high availability on VMware through a set of application-level health checks integrated with VMware vSphere High Availability (VMware HA). This approach helps protect storage workloads against hardware, hypervisor, or network failures. It also helps protect against software errors, such as connection timeouts and file share or volume unavailability.

With this integration, a gateway deployed in a VMware environment on-premises or in a VMware Cloud on AWS automatically recovers from most service interruptions. It generally does this in under 60 seconds with no data loss.

To use VMware HA with Storage Gateway, take the following steps.

Configure Your vSphere VMware HA Cluster

First, if you haven’t already created a VMware cluster, create one. For information about how to create a VMware cluster, see Create a vSphere HA Cluster in the VMware documentation.

Next, configure your VMware cluster to work with Storage Gateway.

To configure your VMware cluster

  1. On the Edit Cluster Settings page in VMware vSphere, make sure that VM monitoring is configured for VM and application monitoring. To do so, set the following options as listed:

    • Host Failure Response: Restart VMs

    • Response for Host Isolation: Shut down and restart VMs

    • Datastore with PDL: Disabled

    • Datastore with APD: Disabled

    • VM Monitoring: VM and Application Monitoring

    For an example, see the following screenshot.

    [Screenshot: Editing cluster settings]

  2. Fine-tune the sensitivity of the cluster by adjusting the following values:

    • Failure interval – After this interval, the VM is restarted if a VM heartbeat isn't received.

    • Minimum uptime – The cluster waits this long after a VM starts to begin monitoring for VM tools' heartbeats.

    • Maximum per-VM resets – The cluster restarts the VM a maximum of this many times within the maximum resets time window.

    • Maximum resets time window – The window of time in which to count the maximum per-VM resets.

    If you aren't sure what values to set, use these example settings:

    • Failure interval: 30 seconds

    • Minimum uptime: 120 seconds

    • Maximum per-VM resets: 3

    • Maximum resets time window: 1 hour
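The interaction between the maximum per-VM resets and the resets time window can be illustrated with a small sliding-window check. This sketch only models the behavior described above using the example settings. It is not how VMware implements the feature, and the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def reset_allowed(previous_resets: list[datetime], now: datetime,
                  max_resets: int = 3,
                  window: timedelta = timedelta(hours=1)) -> bool:
    """Return True if fewer than max_resets resets fall inside the window."""
    recent = [t for t in previous_resets if timedelta(0) <= now - t < window]
    return len(recent) < max_resets


now = datetime(2024, 1, 1, 12, 0)

# Three resets in the past hour: the limit is reached.
busy = [datetime(2024, 1, 1, 11, 10),
        datetime(2024, 1, 1, 11, 30),
        datetime(2024, 1, 1, 11, 50)]
print(reset_allowed(busy, now))   # False

# Only two resets fall inside the window: another reset is allowed.
quiet = [datetime(2024, 1, 1, 10, 30),
         datetime(2024, 1, 1, 11, 30),
         datetime(2024, 1, 1, 11, 50)]
print(reset_allowed(quiet, now))  # True
```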

If you have other VMs running on the cluster, you might want to set these values specifically for your VM. You can't do this until you deploy the VM from the .ova. For more information on setting these values, see (Optional) Add Override Options for Other VMs on Your Cluster.

Download the .ova Image from the Storage Gateway console

To download the .ova image for your gateway

  • On the Set up gateway page in the Storage Gateway console, select your gateway type and host platform, then use the link provided in the console to download the .ova as outlined in Set up a Volume Gateway.

Deploy the Gateway

In your configured cluster, deploy the .ova image to one of the cluster's hosts.

To deploy the gateway .ova image

  1. Deploy the .ova image to one of the hosts in the cluster.

  2. Make sure the data stores that you choose for the root disk and the cache are available to all hosts in the cluster. When deploying the Storage Gateway .ova file in a VMware or on-premises environment, the disks are described as paravirtualized SCSI disks. Paravirtualization is a mode where the gateway VM works with the host operating system so the console can identify the virtual disks that you add to your VM.

    To configure your VM to use paravirtualized controllers

    1. In the VMware vSphere client, open the context (right-click) menu for your gateway VM, and then choose Edit Settings.

    2. In the Virtual Machine Properties dialog box, choose the Hardware tab, select the SCSI controller 0, and then choose Change Type.

    3. In the Change SCSI Controller Type dialog box, select the VMware Paravirtual SCSI controller type, and then choose OK.

(Optional) Add Override Options for Other VMs on Your Cluster

If you have other VMs running on your cluster, you might want to set the cluster values specifically for each VM.

To add override options for other VMs on your cluster

  1. On the Summary page in VMware vSphere, choose your cluster to open the cluster page, and then choose Configure.

  2. Choose the Configuration tab, and then choose VM Overrides.

  3. Add a new VM override option to change each value.

    For override options, see the following screenshot.

    [Screenshot: Override cluster settings]

Activate Your Gateway

After the .ova for your gateway is deployed, activate your gateway. The activation instructions differ for each gateway type.

To activate your gateway

Test Your VMware High Availability Configuration

After you activate your gateway, test your configuration.

To test your VMware HA configuration

  1. Open the Storage Gateway console at https://console.aws.amazon.com/storagegateway/home.

  2. On the navigation pane, choose Gateways, and then choose the gateway that you want to test for VMware HA.

  3. For Actions, choose Verify VMware HA.

  4. In the Verify VMware High Availability Configuration box that appears, choose OK.

    Note

    Testing your VMware HA configuration reboots your gateway VM and interrupts connectivity to your gateway. The test might take a few minutes to complete.

    If the test is successful, the status of Verified appears in the details tab of the gateway in the console.

  5. Choose Exit.

You can find information about VMware HA events in the Amazon CloudWatch log groups. For more information, see Getting Volume Gateway Health Logs with CloudWatch Log Groups.