Managing local disks for your gateway - AWS Storage Gateway

Managing local disks for your gateway

The gateway virtual machine (VM) uses the local disks that you allocate on-premises for buffering and storage. Gateways created on Amazon EC2 instances use Amazon EBS volumes as local disks.

Deciding the amount of local disk storage

The number and size of disks that you allocate for your gateway are up to you. File Gateways require at least one 150 GiB disk to use as a cache. The cache storage acts as the on-premises durable store for data that is pending upload to Amazon S3 or a file system. After the initial configuration and deployment of your gateway, you can add more disks for cache storage as your workload demands increase.

Note

Underlying physical storage resources are represented as a data store in VMware. When you deploy the gateway VM, you choose a data store on which to store the VM files. When you provision a local disk (for example, to use as cache storage), you have the option to store the virtual disk in the same data store as the VM or a different data store.

If you have more than one data store, we strongly recommend that you choose one data store for the cache storage. A data store that is backed by only one underlying physical disk can lead to poor performance in some situations when it is used to back the cache storage. This is also true if the backing storage is a less-performant RAID configuration such as RAID 1.

Determining the size of cache storage to allocate

Your gateway uses its cache storage to provide low-latency access to your recently accessed data. The cache storage acts as the on-premises durable store for data that's pending upload to Amazon FSx.

When deploying an FSx File Gateway, consider how much cache disk to allocate. FSx File Gateway uses a least recently used (LRU) algorithm to automatically evict data from the cache. The cache on an FSx File Gateway is shared among all of the file shares on that gateway. If you have multiple active shares, heavy utilization on one share can reduce the amount of cache available to another share, possibly impacting its performance.
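To make the shared-cache effect concrete, the following is a toy model of LRU eviction across two shares. It is a sketch for illustration only, not the gateway's actual implementation: the class name, share names, and sizes are all hypothetical, and it ignores details such as data that is still pending upload.

```python
from collections import OrderedDict


class SharedLruCache:
    """Toy model of one cache shared by several file shares."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Keys are (share, path). Insertion order tracks recency, oldest first.
        self._entries = OrderedDict()

    def access(self, share, path, size_bytes):
        """Read or write a file, marking it as most recently used."""
        key = (share, path)
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)
        self._entries[key] = size_bytes
        self.used_bytes += size_bytes
        # Evict least recently used entries, whichever share owns them.
        while self.used_bytes > self.capacity_bytes and len(self._entries) > 1:
            _, evicted_size = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size

    def cached_bytes(self, share):
        """Total bytes currently cached for one share."""
        return sum(size for (s, _), size in self._entries.items() if s == share)


cache = SharedLruCache(capacity_bytes=100)
cache.access("share-b", "report.docx", 40)          # share-b caches 40 bytes
for i in range(5):
    cache.access("share-a", f"video-{i}.mp4", 30)   # share-a touches 150 bytes
print(cache.cached_bytes("share-a"), cache.cached_bytes("share-b"))  # prints "90 0"
```

In this model, the burst of activity on `share-a` pushes all of `share-b`'s data out of the cache, so the next read on `share-b` would have to go back to the cloud.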

When determining how much cache disk you need for a given workload, note that you can always add cache disk to your gateway (up to the current quotas on FSx File Gateway), but you can't decrease the cache for a given gateway. You can perform a basic analysis on the dataset to estimate the right amount of cache disk, but there's no way to determine exactly how much data is ‘hot’ and needs to be stored locally, versus ‘cold’ and can be tiered to the cloud. Workloads change over time, and FSx File Gateway lets you adjust the amount of resources consumed as they do. Because cache can be added but never removed, starting small and increasing as needed is often the most cost-effective approach.

You can use an initial approximation of 150 GiB to provision disks for the cache storage during gateway setup. You can then use Amazon CloudWatch operational metrics to monitor the cache storage usage and provision more storage as needed using the console. For information on using the metrics and setting up alarms, see Performance and optimization.
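As a rough sketch of the monitoring step, the following Python reads cache usage from CloudWatch and flags sustained high usage. It assumes a boto3 CloudWatch client is passed in, that the gateway publishes a `CachePercentUsed` metric in the `AWS/StorageGateway` namespace with `GatewayId` and `GatewayName` dimensions, and that 80 percent is a reasonable trigger. The function names and thresholds are hypothetical, so verify the metric names against the metrics your gateway actually reports.

```python
from datetime import datetime, timedelta, timezone


def fetch_cache_percent_used(cloudwatch, gateway_id, gateway_name, hours=24):
    """Return hourly average CachePercentUsed values, oldest first.

    `cloudwatch` is expected to be a boto3 CloudWatch client, for example
    boto3.client("cloudwatch"). Metric and dimension names are assumptions.
    """
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/StorageGateway",
        MetricName="CachePercentUsed",
        Dimensions=[
            {"Name": "GatewayId", "Value": gateway_id},
            {"Name": "GatewayName", "Value": gateway_name},
        ],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=3600,
        Statistics=["Average"],
    )
    # CloudWatch does not guarantee datapoint order, so sort by timestamp.
    points = sorted(response.get("Datapoints", []), key=lambda p: p["Timestamp"])
    return [p["Average"] for p in points]


def needs_more_cache(percent_used, threshold=80.0, min_fraction=0.5):
    """True when at least `min_fraction` of samples are at or above `threshold`.

    Both defaults are hypothetical starting points, not AWS recommendations.
    """
    if not percent_used:
        return False
    high = sum(1 for value in percent_used if value >= threshold)
    return high / len(percent_used) >= min_fraction
```

Looking at the fraction of samples rather than a single reading avoids reacting to a short spike. Since cache can't be removed once added, a conservative trigger is usually the safer choice.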

Configuring additional cache storage

As your application needs change, you can increase the gateway's cache storage capacity. You can add storage capacity to your gateway without interrupting functionality or causing downtime. When you add more storage, you do so with the gateway VM turned on.

Important

When adding cache to an existing gateway, you must create new disks on the gateway host hypervisor or Amazon EC2 instance. Do not remove or change the size of existing disks that have already been allocated as cache.

To configure additional cache storage for your gateway
  1. Provision one or more new disks on your gateway host hypervisor or Amazon EC2 instance. For information about how to provision a disk on a hypervisor, see your hypervisor's documentation. For information about provisioning Amazon EBS volumes for an Amazon EC2 instance, see Amazon EBS volumes in the Amazon Elastic Compute Cloud User Guide for Linux Instances. In the following steps, you will configure this disk as cache storage.

  2. Open the Storage Gateway console at https://console.aws.amazon.com/storagegateway/home.

  3. In the navigation pane, choose Gateways.

  4. Search for your gateway and select it from the list.

  5. From the Actions menu, choose Configure cache storage.

  6. In the Configure cache storage section, identify the disks you provisioned. If you don't see your disks, choose the refresh icon to refresh the list. For each disk, choose Cache from the Allocated to drop-down menu.

    Note

    Cache is the only available option for allocating disks on a File Gateway.

  7. Choose Save changes to save your configuration settings.
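The console steps above can also be approximated through the Storage Gateway API. The following is a minimal sketch, assuming a boto3 Storage Gateway client and assuming that disks not yet allocated report a `DiskAllocationType` of `AVAILABLE` in the `ListLocalDisks` response. The function name is hypothetical. Check the response on your own gateway before relying on the filter.

```python
def add_available_disks_as_cache(storagegateway, gateway_arn):
    """Allocate every unallocated local disk on the gateway as cache storage.

    `storagegateway` is expected to be a boto3 Storage Gateway client,
    for example boto3.client("storagegateway").
    Returns the list of disk IDs that were submitted to AddCache.
    """
    response = storagegateway.list_local_disks(GatewayARN=gateway_arn)
    disks = response.get("Disks", [])
    # Assumption: unallocated disks report DiskAllocationType "AVAILABLE".
    # Disks already used as cache are left untouched.
    new_disk_ids = [
        disk["DiskId"]
        for disk in disks
        if disk.get("DiskAllocationType") == "AVAILABLE"
    ]
    if new_disk_ids:
        storagegateway.add_cache(GatewayARN=gateway_arn, DiskIds=new_disk_ids)
    return new_disk_ids
```

The sketch only ever adds disks. It never touches disks that are already allocated, which matches the guidance not to remove or resize existing cache disks.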

Using ephemeral storage with EC2 gateways

We do not recommend the use of ephemeral disks for cache storage on FSx File Gateways.

Ephemeral disks provide temporary block-level storage for your Amazon EC2 instance. When you launch your gateway from an Amazon Machine Image (AMI) on Amazon EC2 and the instance type you select supports ephemeral storage, the ephemeral disks are listed automatically. You can select one of the disks to store your gateway's cache data. For more information, see Amazon EC2 instance store in the Amazon EC2 User Guide for Linux Instances.

Data that applications write to the gateway is stored synchronously in cache on the ephemeral disks, and then asynchronously uploaded to durable storage in FSx for Windows File Server. If the Amazon EC2 instance is stopped after data is written to ephemeral storage, but before an asynchronous upload occurs, any data that has not yet been uploaded to FSx for Windows File Server can be lost.

Important

If you stop and start an Amazon EC2 gateway that uses ephemeral storage, the gateway goes permanently offline. This happens because the physical storage disk is replaced. There is no workaround for this issue. The only resolution is to delete the gateway and activate a new one on a new EC2 instance.
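One way to reduce the risk of allocating an ephemeral disk by accident is to check, before launch, whether a candidate instance type comes with instance store volumes at all. The following sketch assumes a boto3 EC2 client and relies on the `InstanceStorageSupported` field of the `DescribeInstanceTypes` response. The function name is hypothetical.

```python
def instance_types_with_ephemeral_storage(ec2, instance_types):
    """Return the given instance types that include instance store volumes.

    `ec2` is expected to be a boto3 EC2 client, for example
    boto3.client("ec2").
    """
    response = ec2.describe_instance_types(InstanceTypes=list(instance_types))
    return sorted(
        info["InstanceType"]
        for info in response.get("InstanceTypes", [])
        if info.get("InstanceStorageSupported")
    )
```

An instance type that appears in the result will present ephemeral disks to the gateway. On such instances, take extra care to allocate only Amazon EBS volumes as cache.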