Getting started with Amazon FSx for Lustre

This section shows how to get started with Amazon FSx for Lustre. The steps walk you through creating an FSx for Lustre file system and accessing it from your compute instances. Optionally, they also show how to use the file system to process the data in your Amazon S3 bucket with your file-based applications.

This getting started exercise includes the following steps.

Topics

Prerequisites
Step 1: Create your FSx for Lustre file system
Step 2: Install and configure the Lustre client
Step 3: Mount the file system
Step 4: Run your workflow
Step 5: Clean up resources

Prerequisites

To perform this getting started exercise, you need the following:

An AWS account with the permissions necessary to create an Amazon FSx for Lustre file system and an Amazon EC2 instance. For more information, see Setting up Amazon FSx for Lustre.
An Amazon VPC security group to associate with your FSx for Lustre file system. Don't change this security group after you create the file system. For more information, see To create a security group for your Amazon FSx file system.
An Amazon EC2 instance running a supported Linux release in your virtual private cloud (VPC) based on the Amazon VPC service. For this getting started exercise, we recommend using Amazon Linux 2023. You will install the Lustre client on this EC2 instance, and then mount your FSx for Lustre file system on the EC2 instance. For more information on creating an EC2 instance, see Getting started: Launch an instance or Launch your instance in the Amazon EC2 User Guide.

Besides Amazon Linux 2023, the Lustre client supports the Amazon Linux 2, Red Hat Enterprise Linux (RHEL), CentOS, Rocky Linux, SUSE Linux Enterprise Server, and Ubuntu operating systems. For more information, see Lustre file system and client kernel compatibility.
When creating your Amazon EC2 instance for this getting started exercise, keep the following in mind:
- We recommend that you create your instance in your default VPC.
- We recommend that you use the default security group when creating your EC2 instance.
Determine which type of Amazon FSx for Lustre file system you want to create, scratch or persistent. For more information, see Deployment options for FSx for Lustre file systems.

Each FSx for Lustre file system requires one IP address for each metadata server (MDS) and one IP address for each storage server (OSS).

File System Type	Throughput (MBps/TiB)	Storage per OSS
Persistent 2 EFA	125	38.4 TiB per OSS
Persistent 2 EFA	250	19.2 TiB per OSS
Persistent 2 EFA	500	9.6 TiB per OSS
Persistent 2 EFA	1000	4.8 TiB per OSS
Persistent 2 non-EFA	125, 250, 500, 1000	2.4 TiB per OSS
Persistent 1 SSD	50, 100, 200	2.4 TiB per OSS
Persistent HDD	12	6 TiB per OSS
Persistent HDD	40	1.8 TiB per OSS
Scratch 2	200	2.4 TiB per OSS
Scratch 1	200	3.6 TiB per OSS
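
To estimate how many IP addresses a file system needs in its subnet, you can divide the storage capacity by the storage per OSS from the table and round up. The following is a minimal sketch rather than an official sizing tool. The function names are hypothetical, capacities are given in GiB (treating 2.4 TiB as 2400 GiB) so that the shell can use integer arithmetic, and the MDS count is an assumption supplied by the caller because this page doesn't state it.

```shell
#!/bin/sh
# oss_count CAPACITY_GIB PER_OSS_GIB
# Prints the number of storage servers (OSS), rounding up.
oss_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# ip_count CAPACITY_GIB PER_OSS_GIB MDS_COUNT
# One IP address per OSS plus one per MDS.
# MDS_COUNT is an assumption passed in by the caller.
ip_count() {
  echo $(( $(oss_count "$1" "$2") + $3 ))
}

# Example: a 12 TiB Persistent 2 non-EFA file system (2.4 TiB per OSS),
# assuming a single MDS.
ip_count 12000 2400 1
```

Compare the result with the number of free addresses in the subnet that you plan to use.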

An Amazon S3 bucket storing the data for your workload to process. The S3 bucket will be the linked durable data repository for your FSx for Lustre file system.

Step 1: Create your FSx for Lustre file system

You create your file system in the Amazon FSx console.

To create your file system

Open the Amazon FSx console at https://console.aws.amazon.com/fsx/.
From the dashboard, choose Create file system to start the file system creation wizard.
Choose FSx for Lustre and then choose Next to display the Create File System page.
Provide the information in the File system details section:
- For File system name - optional, provide a name for your file system. You can use up to 256 Unicode letters, white space, and numbers plus the special characters + - = . _ : /.
- For Deployment and storage class, choose one of the options:
  - Choose the Persistent, SSD deployment type for longer-term storage and for latency-sensitive workloads requiring the highest levels of IOPS/throughput. Persistent, SSD uses Persistent 2, the latest generation of persistent file systems.
    
    Optionally, choose with EFA support to enable Elastic Fabric Adapter (EFA) support for the file system. For more information about EFA, see Working with EFA-enabled file systems.
  - Choose the Persistent, HDD deployment type for longer-term storage and for throughput-focused workloads that aren't latency-sensitive. Persistent, HDD uses the Persistent 1 deployment type.
    
    Optionally, choose with SSD cache to create an SSD cache that is sized to 20 percent of your HDD storage capacity to provide sub-millisecond latencies and higher IOPS for frequently accessed files.
  - Choose the Scratch, SSD deployment type for temporary storage and shorter-term processing of data. Scratch, SSD uses Scratch 2 file systems.
- Choose the amount of Throughput per unit of storage for your file system. This option is only valid for Persistent deployment types.
  
  Throughput per unit of storage is the amount of read and write throughput for each 1 tebibyte (TiB) of storage provisioned, in MBps/TiB. You pay for the amount of throughput that you provision:
  - For Persistent SSD storage, choose a value of 125, 250, 500, or 1,000 MBps/TiB.
  - For Persistent HDD storage, choose a value of 12 or 40 MBps/TiB.
- For Storage capacity, set the amount of storage capacity for your file system, in TiB:
  - For a Persistent, SSD deployment type, set this to a value of 1.2 TiB, 2.4 TiB, or increments of 2.4 TiB.
  - For an EFA-enabled, Persistent, SSD deployment type, set this value in increments of 4.8 TiB, 9.6 TiB, 19.2 TiB, and 38.4 TiB for 1000, 500, 250, and 125 MBps/TiB throughput tiers, respectively.
  - For a Persistent, HDD deployment type, this value can be increments of 6.0 TiB for 12 MBps/TiB file systems and increments of 1.8 TiB for 40 MBps/TiB file systems.
  You can increase the amount of storage capacity as needed after you create the file system. For more information, see Managing storage capacity.
- For Metadata Configuration, you have two options to provision the number of Metadata IOPS for your file system:
  - Choose Automatic (the default) if you want Amazon FSx to automatically provision and scale the Metadata IOPS on your file system based on your file system's storage capacity.
  - Choose User-provisioned if you want to specify the number of Metadata IOPS to provision for your file system. Valid values are 1500, 3000, 6000, 12000, and multiples of 12000, up to a maximum of 192000.
  For more information about Metadata IOPS, see Lustre metadata performance configuration.
- For Data compression type, choose NONE to turn off data compression or choose LZ4 to turn on data compression with the LZ4 algorithm. For more information, see Lustre data compression.
All FSx for Lustre file systems are built on Lustre version 2.15 when created using the Amazon FSx console.
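
The capacity and Metadata IOPS rules above are easy to get wrong when you script file system creation, so it can help to check the values first. The following is a sketch with hypothetical function names. It covers only the non-EFA Persistent SSD, Persistent HDD, and user-provisioned Metadata IOPS rules listed above, and it expresses capacities in GiB (treating 1.2 TiB as 1200 GiB), which is an assumption made to keep the arithmetic in integers.

```shell
#!/bin/sh
# valid_ssd_capacity GIB
# Persistent SSD (non-EFA): 1.2 TiB, 2.4 TiB, or increments of 2.4 TiB.
valid_ssd_capacity() {
  [ "$1" -eq 1200 ] || { [ "$1" -gt 0 ] && [ $(( $1 % 2400 )) -eq 0 ]; }
}

# valid_hdd_capacity GIB THROUGHPUT_MBPS_PER_TIB
# Persistent HDD: increments of 6.0 TiB at 12 MBps/TiB, 1.8 TiB at 40 MBps/TiB.
valid_hdd_capacity() {
  case "$2" in
    12) unit=6000 ;;
    40) unit=1800 ;;
    *)  return 1 ;;
  esac
  [ "$1" -gt 0 ] && [ $(( $1 % unit )) -eq 0 ]
}

# valid_metadata_iops N
# User-provisioned: 1500, 3000, 6000, 12000, or multiples of 12000 up to 192000.
valid_metadata_iops() {
  case "$1" in
    1500|3000|6000) return 0 ;;
  esac
  [ "$1" -ge 12000 ] && [ "$1" -le 192000 ] && [ $(( $1 % 12000 )) -eq 0 ]
}

valid_ssd_capacity 4800 && echo "4800 GiB is a valid Persistent SSD capacity"
valid_metadata_iops 5000 || echo "5000 is not a valid Metadata IOPS value"
```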

In the Network & security section, provide the following networking and security group information:

For Virtual Private Cloud (VPC), choose the VPC that you want to associate with your file system. For this getting started exercise, choose the same VPC that you chose for your Amazon EC2 instance.

For VPC security groups, the ID of the default security group for your VPC should already be added.

If you're not using the default security group, make sure that the following inbound rule is added to the security group you're using for this getting started exercise.

Type	Protocol	Port range	Source	Description
All TCP	TCP	0-65535	Custom `the_ID_of_this_security_group`	Inbound Lustre traffic rule

Important

Make sure that the security group you are using follows the configuration instructions provided in File system access control with Amazon VPC. You must set up the security group to allow inbound traffic on ports 988 and 1018-1023 from the security group itself or the full subnet CIDR, which is required to allow the file system hosts to communicate with each other.
If you are creating an EFA-enabled file system, make sure you specify an EFA-enabled security group.

For Subnet, choose any value from the list of available subnets.
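
If you manage security groups from the command line, the inbound rule for ports 988 and 1018-1023 described in the Important note can be sketched with the AWS CLI. The helper below is hypothetical and only prints the commands so that you can review them before running anything. The security group ID is a placeholder, and the sketch assumes that the source is the security group itself rather than the subnet CIDR.

```shell
#!/bin/sh
# print_lustre_ingress_rules SECURITY_GROUP_ID
# Prints (does not run) AWS CLI commands that allow inbound TCP traffic on
# the Lustre ports from members of the same security group.
print_lustre_ingress_rules() {
  for ports in 988 1018-1023; do
    echo "aws ec2 authorize-security-group-ingress --group-id $1 --protocol tcp --port $ports --source-group $1"
  done
}

# sg-0123456789abcdef0 is a placeholder, not a real security group.
print_lustre_ingress_rules sg-0123456789abcdef0
```

Copy the printed commands into a shell with suitable credentials once the group ID is correct.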

For the Encryption section, the options available vary depending upon which file system type you're creating:
- For a persistent file system, you can choose an AWS Key Management Service (AWS KMS) encryption key to encrypt the data on your file system at rest.
- For a scratch file system, data at rest is encrypted using keys managed by AWS.
- For scratch 2 and persistent file systems, data in transit is encrypted automatically when the file system is accessed from a supported Amazon EC2 instance type. For more information, see Encrypting data in transit.
For the Data Repository Import/Export - optional section, linking your file system to Amazon S3 data repositories is disabled by default. For information about enabling this option and creating a data repository association to an existing S3 bucket, see To link an S3 bucket while creating a file system (console).
Important
- Selecting this option also disables backups and you won't be able to enable backups while creating the file system.
- If you link one or more Amazon FSx for Lustre file systems to an Amazon S3 bucket, don't delete the Amazon S3 bucket until all linked file systems have been deleted.
For Logging - optional, logging is enabled by default. When enabled, failures and warnings for data repository activity on your file system are logged to Amazon CloudWatch Logs. For information about configuring logging, see Managing logging.
In Backup and maintenance - optional, you can do the following.

For daily automatic backups:
- Disable the Daily automatic backup. This option is enabled by default, unless you enabled Data Repository Import/Export.
- Set the start time for Daily automatic backup window.
- Set the Automatic backup retention period, from 1 to 35 days.
For more information, see Protecting your data with backups.
Set the Weekly maintenance window start time, or keep it set to the default No preference.
For Root Squash - optional, root squash is disabled by default. For information about enabling and configuring root squash, see To enable root squash when creating a file system (console).
Create any tags that you want to apply to your file system.
Choose Next to display the Create file system summary page.
Review the settings for your Amazon FSx for Lustre file system, and choose Create file system.
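
For comparison, the console choices in this procedure correspond roughly to one AWS CLI call. The following sketch prints such a call for a 1.2 TiB Persistent 2 file system with 125 MBps/TiB throughput and LZ4 compression. The helper name is hypothetical, the subnet and security group IDs are placeholders, and the settings are one example combination rather than a recommendation. Note that a file system created this way is not bound by the console's Lustre version behavior, so the version is requested explicitly.

```shell
#!/bin/sh
# print_create_fs_cmd SUBNET_ID SECURITY_GROUP_ID
# Prints (does not run) an example create-file-system command.
print_create_fs_cmd() {
  cat <<EOF
aws fsx create-file-system \\
  --file-system-type LUSTRE \\
  --file-system-type-version 2.15 \\
  --storage-capacity 1200 \\
  --subnet-ids $1 \\
  --security-group-ids $2 \\
  --lustre-configuration DeploymentType=PERSISTENT_2,PerUnitStorageThroughput=125,DataCompressionType=LZ4
EOF
}

# Both IDs are placeholders.
print_create_fs_cmd subnet-0123456789abcdef0 sg-0123456789abcdef0
```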

Now that you've created your file system, note its fully qualified domain name and mount name for a later step. You can find the fully qualified domain name and mount name for a file system by choosing the name of the file system in the File systems dashboard, and then choosing Attach.
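
You can also look up both values from the command line. The sketch below assumes that the AWS CLI is installed and configured, and the function names are hypothetical. The first function queries the DNS name and mount name for a file system ID; the second turns that tab-separated pair into the source string that the mount command expects.

```shell
#!/bin/sh
# fsx_mount_info FILE_SYSTEM_ID
# Prints "<dns name><TAB><mount name>". Requires the AWS CLI and credentials,
# so it is defined here but not called.
fsx_mount_info() {
  aws fsx describe-file-systems \
    --file-system-ids "$1" \
    --query 'FileSystems[0].[DNSName,LustreConfiguration.MountName]' \
    --output text
}

# mount_source
# Reads "<dns name> <mount name>" on stdin and prints dns@tcp:/mountname.
mount_source() {
  read -r dns mountname
  printf '%s@tcp:/%s\n' "$dns" "$mountname"
}

# Demonstration with placeholder values instead of a live lookup.
printf 'file_system_dns_name\tmountname\n' | mount_source
```

With a real file system ID, the two functions can be chained as `fsx_mount_info fs-... | mount_source`.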

Step 2: Install and configure the Lustre client

Before you can access your Amazon FSx for Lustre file system from your Amazon EC2 instance, you have to do the following:

Verify your EC2 instance meets the minimum kernel requirements.
Update the kernel if needed.
Download and install the Lustre client.

To check the kernel version and download the Lustre client

Open a terminal window on your EC2 instance.
Determine which kernel is currently running on your compute instance by running the following command.
```
uname -r
```
Do one of the following:
- If the command returns 6.1.79-99.167.amzn2023.x86_64 or higher for x86-based EC2 instances, or 6.1.79-99.167.amzn2023.aarch64 or higher for Graviton2-based EC2 instances, download and install the Lustre client with the following command.
```
sudo dnf install -y lustre-client
```
- If the command returns a result less than 6.1.79-99.167.amzn2023.x86_64 for x86-based EC2 instances, or less than 6.1.79-99.167.amzn2023.aarch64 for Graviton2-based EC2 instances, update the kernel and reboot your Amazon EC2 instance by running the following command.
```
sudo dnf -y update kernel && sudo reboot
```
  Confirm that the kernel has been updated using the uname -r command. Then download and install the Lustre client as described above.
For information about installing the Lustre client on other Linux distributions, see Installing the Lustre client.
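
The version comparison in this procedure can be scripted instead of done by eye. The following is a sketch for Amazon Linux 2023 only. The function name is hypothetical, and it relies on the version-sort option (-V) of GNU sort. It prints a recommendation and does not install or update anything.

```shell
#!/bin/sh
# kernel_at_least MINIMUM CURRENT
# Succeeds when CURRENT sorts at or after MINIMUM in version order.
kernel_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Minimum kernel from the step above, with this machine's architecture suffix.
minimum="6.1.79-99.167.amzn2023.$(uname -m)"

if kernel_at_least "$minimum" "$(uname -r)"; then
  echo "Kernel is new enough: install the Lustre client."
else
  echo "Kernel is older than $minimum: update the kernel and reboot first."
fi
```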

Step 3: Mount the file system

To mount your file system, you create a mount point (a directory), mount the file system on your client, and then verify that your client can access the file system.

To mount your file system

Make a directory for the mount point with the following command.
```
sudo mkdir -p /mnt/fsx
```
Mount the Amazon FSx for Lustre file system to the directory that you created. Use the following command and replace the following items:
- Replace file_system_dns_name with the actual file system's Domain Name System (DNS) name.
- Replace mountname with the file system's mount name, which you can get by running the describe-file-systems AWS CLI command or the DescribeFileSystems API operation.
```
sudo mount -t lustre -o relatime,flock file_system_dns_name@tcp:/mountname /mnt/fsx
```
This command mounts your file system with two options, relatime and flock, both passed with -o:
- relatime – The atime option updates atime (inode access time) data every time a file is accessed. The relatime option also maintains atime data, but less often: atime data is written to disk only if the file has been modified since the atime data was last updated (mtime), or if the file was last accessed more than a certain amount of time ago (6 hours by default). Using either the relatime or atime option will optimize the file release processes.
  
  Note
  If your workload requires precise access time accuracy, you can mount with the atime mount option. However, doing so can impact workload performance by increasing the network traffic required to maintain precise access time values.
  If your workload does not require metadata access time, using the noatime mount option to disable updates to access time can provide a performance gain. Be aware that atime-dependent processes, such as file release or releasing data validity, will then be inaccurate.
- flock – Enables file locking for your file system. If you don't want file locking enabled, use the mount command without flock.
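
The option choices above (relatime, atime, or noatime, with or without flock) can be captured in a small helper that assembles the mount command. This is a sketch: the function name and its yes/no argument are hypothetical, and it only prints the command so that you can review it first.

```shell
#!/bin/sh
# lustre_mount_cmd DNS_NAME MOUNT_NAME ATIME_MODE USE_FLOCK
# ATIME_MODE is relatime, atime, or noatime. USE_FLOCK is yes or no.
# Prints (does not run) the matching mount command for /mnt/fsx.
lustre_mount_cmd() {
  case "$3" in
    relatime|atime|noatime) opts=$3 ;;
    *) echo "unknown access time option: $3" >&2; return 1 ;;
  esac
  if [ "$4" = "yes" ]; then
    opts="$opts,flock"
  fi
  echo "sudo mount -t lustre -o $opts $1@tcp:/$2 /mnt/fsx"
}

# Reproduces the command shown above, using its placeholder names.
lustre_mount_cmd file_system_dns_name mountname relatime yes
```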

Verify that the mount command was successful by listing the contents of /mnt/fsx, the directory on which you mounted the file system, with the following command.

```
ls /mnt/fsx
import-path  lustre
$
```

You can also use the df command, as follows.

```
df
Filesystem                      1K-blocks    Used  Available Use% Mounted on
devtmpfs                         1001808       0    1001808   0% /dev
tmpfs                            1019760       0    1019760   0% /dev/shm
tmpfs                            1019760     392    1019368   1% /run
tmpfs                            1019760       0    1019760   0% /sys/fs/cgroup
/dev/xvda1                       8376300 1263180    7113120  16% /
123.456.789.0@tcp:/mountname  3547698816   13824 3547678848   1% /mnt/fsx
tmpfs                             203956       0     203956   0% /run/user/1000
```

The results show the Amazon FSx file system mounted on /mnt/fsx.
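
In a script, the same check can be automated by scanning the df output for a Lustre source on the mount point. The sketch below uses a hypothetical function name and assumes that a Lustre source always contains `@tcp:/`, as in the output above. It uses df -P so that long source names aren't wrapped onto a second line.

```shell
#!/bin/sh
# lustre_mounted_at MOUNT_POINT
# Reads df-style output on stdin. Succeeds if a line mounted on MOUNT_POINT
# has a source that contains "@tcp:/".
lustre_mounted_at() {
  awk -v mp="$1" '$NF == mp && $1 ~ /@tcp:\// { found = 1 } END { exit !found }'
}

if df -P | lustre_mounted_at /mnt/fsx; then
  echo "A Lustre file system is mounted on /mnt/fsx"
else
  echo "No Lustre file system is mounted on /mnt/fsx"
fi
```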

Step 4: Run your workflow

Now that your file system has been created and mounted to a compute instance, you can use it to run your high-performance compute workload.

You can create a data repository association to link your file system to an Amazon S3 data repository. For more information, see Linking your file system to an Amazon S3 bucket.

After you've linked your file system to an Amazon S3 data repository, you can export data that you've written to your file system back to your Amazon S3 bucket at any time. From a terminal on one of your compute instances, run the following command to export a file to your Amazon S3 bucket.

```
sudo lfs hsm_archive file_name
```

For more information on how to run this command on a folder or large collection of files quickly, see Exporting files using HSM commands.
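
Because hsm_archive returns before the export finishes, you might want to confirm that a file has reached the bucket. One way is to inspect the output of the Lustre `lfs hsm_state` command. The sketch below uses a hypothetical function name and assumes that the output looks like `path: (0x00000009) exists archived, archive_id:1`, with the word `dirty` appearing when the file has changed since it was last archived. Check this format against your client before relying on it.

```shell
#!/bin/sh
# is_exported
# Reads one line of `lfs hsm_state` output on stdin. Succeeds if the state
# flags include "archived" and do not include "dirty".
is_exported() {
  line=$(cat)
  # Keep only the text after the last ": (" so the file name is ignored.
  state=${line##*: (}
  case "$state" in
    *dirty*)    return 1 ;;
    *archived*) return 0 ;;
    *)          return 1 ;;
  esac
}

# Demonstration with a sample line (assumed format, not real output).
echo "/mnt/fsx/file_name: (0x00000009) exists archived, archive_id:1" | is_exported \
  && echo "file_name has been exported"
```

On a client with the file system mounted, the check would be `sudo lfs hsm_state file_name | is_exported`.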

Step 5: Clean up resources

After you have finished this exercise, you should follow these steps to clean up your resources and protect your AWS account.

To clean up resources

If you want to do a final export, run the following command.

```
nohup find /mnt/fsx -type f -print0 | xargs -0 -n 1 sudo lfs hsm_archive &
```

On the Amazon EC2 console, terminate your instance. For more information, see Terminate Your Instance in the Amazon EC2 User Guide.
On the Amazon FSx for Lustre console, delete your file system with the following procedure:
1. In the navigation pane, choose File systems.
2. Choose the file system that you want to delete from the list of file systems on the dashboard.
3. For Actions, choose Delete file system.
4. In the dialog box that appears, choose whether to take a final backup of the file system. Then provide the file system ID to confirm the deletion. Choose Delete file system.
If you created an Amazon S3 bucket for this exercise, and if you don't want to preserve the data you exported, you can now delete it. For more information, see Deleting a bucket in the Amazon Simple Storage Service User Guide.
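
Before you run the final export from the first cleanup step, you can preview which files it would touch. The sketch below reuses the same find and xargs pipeline but substitutes echo for `sudo lfs hsm_archive`, so nothing is exported. The function name is hypothetical, and the demonstration runs against a temporary directory rather than /mnt/fsx. The -print0 and -0 options keep file names that contain spaces intact.

```shell
#!/bin/sh
# dry_run_export DIRECTORY
# Lists each regular file that the final export would archive, one per line.
dry_run_export() {
  find "$1" -type f -print0 | xargs -0 -n 1 echo "would archive:"
}

# Demonstration on a scratch directory with a file name that contains spaces.
demo_dir=$(mktemp -d)
touch "$demo_dir/results.csv" "$demo_dir/run 1 output.log"
dry_run_export "$demo_dir"
rm -r "$demo_dir"
```

To preview the real export, call `dry_run_export /mnt/fsx` on a client where the file system is mounted.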
