Amazon FSx Construct Library
Amazon FSx provides fully managed third-party file systems with the native compatibility and feature sets for workloads such as Microsoft Windows–based storage, high-performance computing, machine learning, and electronic design automation.
Amazon FSx supports two file system types: Lustre and Windows File Server.
FSx for Lustre
Amazon FSx for Lustre makes it easy and cost-effective to launch and run the popular, high-performance Lustre file system. You use Lustre for workloads where speed matters, such as machine learning, high performance computing (HPC), video processing, and financial modeling.
The open-source Lustre file system is designed for applications that require fast storage—where you want your storage to keep up with your compute. Lustre was built to solve the problem of quickly and cheaply processing the world’s ever-growing datasets. It’s a widely used file system designed for the fastest computers in the world. It provides submillisecond latencies, up to hundreds of GBps of throughput, and up to millions of IOPS. For more information on Lustre, see the Lustre website.
As a fully managed service, Amazon FSx makes it easier for you to use Lustre for workloads where storage speed matters. Amazon FSx for Lustre eliminates the traditional complexity of setting up and managing Lustre file systems, enabling you to spin up and run a battle-tested high-performance file system in minutes. It also provides multiple deployment options so you can optimize cost for your needs.
Amazon FSx for Lustre is POSIX-compliant, so you can use your current Linux-based applications without having to make any changes. Amazon FSx for Lustre provides a native file system interface and works as any file system does with your Linux operating system. It also provides read-after-write consistency and supports file locking.
Installation
Import to your project:
import aws_cdk.aws_fsx as fsx
Basic Usage
Set up the required properties and create the file system:
# vpc: ec2.Vpc
file_system = fsx.LustreFileSystem(self, "FsxLustreFileSystem",
    lustre_configuration=fsx.LustreConfiguration(deployment_type=fsx.LustreDeploymentType.SCRATCH_2),
    storage_capacity_gi_b=1200,
    vpc=vpc,
    vpc_subnet=vpc.private_subnets[0]
)
File System Type Version
You can set the Lustre version for the file system. To do this, use the fileSystemTypeVersion property:
# vpc: ec2.Vpc
file_system = fsx.LustreFileSystem(self, "FsxLustreFileSystem",
    lustre_configuration=fsx.LustreConfiguration(deployment_type=fsx.LustreDeploymentType.SCRATCH_2),
    storage_capacity_gi_b=1200,
    vpc=vpc,
    vpc_subnet=vpc.private_subnets[0],
    file_system_type_version=fsx.FileSystemTypeVersion.V_2_15
)
Note: The values that can be set for fileSystemTypeVersion are restricted by the deploymentType:
- V_2_10 is supported by the Scratch and PERSISTENT_1 deployment types.
- V_2_12 is supported by all Lustre deployment types.
- V_2_15 is supported by all Lustre deployment types and is recommended for all new file systems.
Note: The default value of fileSystemTypeVersion is V_2_10, except for the PERSISTENT_2 deployment type, where the default value is V_2_12.
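The two notes above can be restated as a small lookup table. The following is a minimal sketch in plain Python, not part of the CDK API: the names `SUPPORTED_DEPLOYMENT_TYPES`, `default_version`, and `is_version_supported` are hypothetical, and it assumes that "Scratch" covers both SCRATCH_1 and SCRATCH_2.

```python
# Sketch only: these tables restate the notes above; they are not CDK APIs.
# Assumption: "Scratch" means both SCRATCH_1 and SCRATCH_2.
ALL_DEPLOYMENT_TYPES = frozenset(
    {"SCRATCH_1", "SCRATCH_2", "PERSISTENT_1", "PERSISTENT_2"}
)

# Deployment types that each Lustre version may be combined with.
SUPPORTED_DEPLOYMENT_TYPES = {
    "V_2_10": frozenset({"SCRATCH_1", "SCRATCH_2", "PERSISTENT_1"}),
    "V_2_12": ALL_DEPLOYMENT_TYPES,
    "V_2_15": ALL_DEPLOYMENT_TYPES,
}


def _check_deployment_type(deployment_type: str) -> None:
    """Reject deployment type names this sketch does not know about."""
    if deployment_type not in ALL_DEPLOYMENT_TYPES:
        raise ValueError(f"unknown deployment type: {deployment_type}")


def default_version(deployment_type: str) -> str:
    """Return the version used when fileSystemTypeVersion is not set."""
    _check_deployment_type(deployment_type)
    return "V_2_12" if deployment_type == "PERSISTENT_2" else "V_2_10"


def is_version_supported(version: str, deployment_type: str) -> bool:
    """Return True if the version may be used with the deployment type."""
    _check_deployment_type(deployment_type)
    if version not in SUPPORTED_DEPLOYMENT_TYPES:
        raise ValueError(f"unknown version: {version}")
    return deployment_type in SUPPORTED_DEPLOYMENT_TYPES[version]
```

For example, `is_version_supported("V_2_10", "PERSISTENT_2")` returns `False`, while `default_version("PERSISTENT_2")` returns `"V_2_12"`.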
Connecting
To control who can access the file system, use the .connections attribute. FSx has a fixed default port, so you don’t need to specify the port. This example allows an EC2 instance to connect to a file system:
# file_system: fsx.LustreFileSystem
# instance: ec2.Instance
file_system.connections.allow_default_port_from(instance)
Mounting
The LustreFileSystem Construct exposes both the DNS name of the file system as well as its mount name, which can be used to mount the file system on an EC2 instance. The following example shows how to bring up a file system and EC2 instance, and then use User Data to mount the file system on the instance at start-up:
import aws_cdk.aws_iam as iam
# vpc: ec2.Vpc
lustre_configuration = {
    "deployment_type": fsx.LustreDeploymentType.SCRATCH_2
}
fs = fsx.LustreFileSystem(self, "FsxLustreFileSystem",
    lustre_configuration=lustre_configuration,
    storage_capacity_gi_b=1200,
    vpc=vpc,
    vpc_subnet=vpc.private_subnets[0]
)
inst = ec2.Instance(self, "inst",
    instance_type=ec2.InstanceType.of(ec2.InstanceClass.T2, ec2.InstanceSize.LARGE),
    machine_image=ec2.AmazonLinuxImage(
        generation=ec2.AmazonLinuxGeneration.AMAZON_LINUX_2
    ),
    vpc=vpc,
    vpc_subnets=ec2.SubnetSelection(
        subnet_type=ec2.SubnetType.PUBLIC
    )
)
fs.connections.allow_default_port_from(inst)
# Need to give the instance access to read information about FSx to determine the file system's mount name.
inst.role.add_managed_policy(iam.ManagedPolicy.from_aws_managed_policy_name("AmazonFSxReadOnlyAccess"))
mount_path = "/mnt/fsx"
dns_name = fs.dns_name
mount_name = fs.mount_name
# Install the Lustre client, create the mount point, register it in /etc/fstab, and mount it.
inst.user_data.add_commands(
    "set -eux",
    "yum update -y",
    "amazon-linux-extras install -y lustre2.10",
    f"mkdir -p {mount_path}",
    f"chmod 777 {mount_path}",
    f"chown ec2-user:ec2-user {mount_path}",
    f"echo \"{dns_name}@tcp:/{mount_name} {mount_path} lustre defaults,noatime,flock,_netdev 0 0\" >> /etc/fstab",
    "mount -a"
)
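To see what the User Data commands look like once the DNS name and mount name are known, the same strings can be built outside of CDK. This is a minimal sketch: `build_mount_commands` is a hypothetical helper, and the DNS name and mount name in the usage note are made-up placeholders (in a real stack these values are tokens that are resolved at deployment time).

```python
from typing import List


def build_mount_commands(dns_name: str, mount_name: str,
                         mount_path: str = "/mnt/fsx") -> List[str]:
    """Return the shell commands that install the Lustre client and mount the file system."""
    # One /etc/fstab line: <dns>@tcp:/<mount name> <mount point> lustre <options> 0 0
    fstab_entry = (
        f"{dns_name}@tcp:/{mount_name} {mount_path} "
        "lustre defaults,noatime,flock,_netdev 0 0"
    )
    return [
        "set -eux",                                   # fail fast and echo each command
        "yum update -y",
        "amazon-linux-extras install -y lustre2.10",  # Lustre client for Amazon Linux 2
        f"mkdir -p {mount_path}",
        f"chmod 777 {mount_path}",
        f"chown ec2-user:ec2-user {mount_path}",
        f'echo "{fstab_entry}" >> /etc/fstab',        # persist the mount across reboots
        "mount -a",                                   # mount everything listed in /etc/fstab
    ]
```

Calling `build_mount_commands("fs-0123456789abcdef0.fsx.us-west-2.amazonaws.com", "abcdefgh")` yields eight commands, and the result could be passed to `add_commands` with `*commands`.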
Importing an existing Lustre filesystem
An FSx for Lustre file system can be imported with fromLustreFileSystemAttributes(this, id, attributes). The following example lays out how you could import the SecurityGroup a file system belongs to, use that to import the file system, and then also import the VPC the file system is in and add an EC2 instance to it, giving it access to the file system.
sg = ec2.SecurityGroup.from_security_group_id(self, "FsxSecurityGroup", "{SECURITY-GROUP-ID}")
fs = fsx.LustreFileSystem.from_lustre_file_system_attributes(self, "FsxLustreFileSystem",
    dns_name="{FILE-SYSTEM-DNS-NAME}",
    file_system_id="{FILE-SYSTEM-ID}",
    security_group=sg
)
vpc = ec2.Vpc.from_vpc_attributes(self, "Vpc",
    availability_zones=["us-west-2a", "us-west-2b"],
    public_subnet_ids=["{US-WEST-2A-SUBNET-ID}", "{US-WEST-2B-SUBNET-ID}"],
    vpc_id="{VPC-ID}"
)
inst = ec2.Instance(self, "inst",
    instance_type=ec2.InstanceType.of(ec2.InstanceClass.T2, ec2.InstanceSize.LARGE),
    machine_image=ec2.AmazonLinuxImage(
        generation=ec2.AmazonLinuxGeneration.AMAZON_LINUX_2
    ),
    vpc=vpc,
    vpc_subnets=ec2.SubnetSelection(
        subnet_type=ec2.SubnetType.PUBLIC
    )
)
fs.connections.allow_default_port_from(inst)
Lustre Data Repository Association support
The LustreFileSystem construct supports one Data Repository Association (DRA) to an S3 bucket. This enables Lustre hierarchical storage management with S3 buckets, which in turn makes it possible to use S3 as a permanent backing store and FSx for Lustre as a temporary high-performance cache.
Note: CloudFormation does not currently support this for PERSISTENT_2 file systems, and so neither does CDK.
The following example illustrates setting up a DRA to an S3 bucket, including automated metadata import whenever a file is changed, created or deleted in the S3 bucket:
from aws_cdk import aws_s3 as s3
# vpc: ec2.Vpc
# bucket: s3.Bucket
lustre_configuration = {
    "deployment_type": fsx.LustreDeploymentType.SCRATCH_2,
    "export_path": bucket.s3_url_for_object(),
    "import_path": bucket.s3_url_for_object(),
    "auto_import_policy": fsx.LustreAutoImportPolicy.NEW_CHANGED_DELETED
}
fs = fsx.LustreFileSystem(self, "FsxLustreFileSystem",
    vpc=vpc,
    vpc_subnet=vpc.private_subnets[0],
    storage_capacity_gi_b=1200,
    lustre_configuration=lustre_configuration
)
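The example uses NEW_CHANGED_DELETED, the broadest auto-import policy. As a rough illustration of how the policies differ, the sketch below maps each policy name to the kinds of S3 changes that trigger a metadata import. It is plain Python rather than a CDK API: `AUTO_IMPORT_EVENTS` and `triggers_import` are hypothetical names, and the labels "created", "changed", and "deleted" are illustrative rather than real S3 event types.

```python
# Sketch only: which kinds of S3 object changes each auto-import policy picks up.
# The change labels are illustrative, not S3 event names.
AUTO_IMPORT_EVENTS = {
    "NONE": frozenset(),                                   # automatic import is off
    "NEW": frozenset({"created"}),
    "NEW_CHANGED": frozenset({"created", "changed"}),
    "NEW_CHANGED_DELETED": frozenset({"created", "changed", "deleted"}),
}


def triggers_import(policy: str, change: str) -> bool:
    """Return True if the given change in the bucket is imported under the policy."""
    if policy not in AUTO_IMPORT_EVENTS:
        raise ValueError(f"unknown auto-import policy: {policy}")
    return change in AUTO_IMPORT_EVENTS[policy]
```

With this table, `triggers_import("NEW", "deleted")` is `False`, whereas `triggers_import("NEW_CHANGED_DELETED", "deleted")` is `True`.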
Compression
By default, transparent compression of data within FSx for Lustre is switched off. To enable it, add the following to your lustreConfiguration:
lustre_configuration = {
    # ...
    "data_compression_type": fsx.LustreDataCompressionType.LZ4
}
When you turn data compression on for an existing file system, only newly written files are compressed. Existing files are not compressed. For more information, see Compressing previously written files.
Backups
You can take daily automatic backups by setting automaticBackupRetention to a non-zero number of days in the lustreConfiguration.
Additionally, you can set the backup window by specifying the dailyAutomaticBackupStartTime.
import aws_cdk as cdk
lustre_configuration = {
    # ...
    "automatic_backup_retention": cdk.Duration.days(3),  # backup retention
    "copy_tags_to_backups": True,  # if true, tags are copied to backups
    "daily_automatic_backup_start_time": fsx.DailyAutomaticBackupStartTime(hour=11, minute=30)
}
For more information, see Working with backups.
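The backup start time is given as an hour and a minute. Assuming the service ultimately receives it as a zero-padded HH:MM string, the conversion can be sketched as follows. `format_backup_start_time` is a hypothetical helper for illustration, not the construct's own method.

```python
def format_backup_start_time(hour: int, minute: int) -> str:
    """Format an hour and minute as a zero-padded HH:MM string."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be between 0 and 23, got {hour}")
    if not 0 <= minute <= 59:
        raise ValueError(f"minute must be between 0 and 59, got {minute}")
    return f"{hour:02d}:{minute:02d}"
```

For the values used above, `format_backup_start_time(11, 30)` returns `"11:30"`, and `format_backup_start_time(5, 0)` returns `"05:00"`.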
Storage Type
By default, FSx for Lustre uses SSD storage. To use HDD storage, specify storageType:
# vpc: ec2.Vpc
file_system = fsx.LustreFileSystem(self, "FsxLustreFileSystem",
    lustre_configuration=fsx.LustreConfiguration(deployment_type=fsx.LustreDeploymentType.PERSISTENT_1),
    storage_capacity_gi_b=1200,
    vpc=vpc,
    vpc_subnet=vpc.private_subnets[0],
    storage_type=fsx.StorageType.HDD
)
Note: The HDD storage type is only supported for PERSISTENT_1 deployment types.
To improve the performance of frequently accessed files by caching up to 20% of the total storage capacity of the file system, set driveCacheType to READ:
# vpc: ec2.Vpc
file_system = fsx.LustreFileSystem(self, "FsxLustreFileSystem",
    lustre_configuration=fsx.LustreConfiguration(
        deployment_type=fsx.LustreDeploymentType.PERSISTENT_1,
        drive_cache_type=fsx.DriveCacheType.READ
    ),
    storage_capacity_gi_b=1200,
    vpc=vpc,
    vpc_subnet=vpc.private_subnets[0],
    storage_type=fsx.StorageType.HDD
)
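The 20% figure translates directly into an upper bound on the read cache size. A minimal sketch of that arithmetic, using a hypothetical helper `max_read_cache_gib` and integer math to avoid rounding surprises:

```python
def max_read_cache_gib(storage_capacity_gib: int) -> int:
    """Return the upper bound of the read cache: 20% of the storage capacity, in GiB."""
    if storage_capacity_gib <= 0:
        raise ValueError("storage capacity must be positive")
    # 20% of the capacity, rounded down to a whole GiB.
    return storage_capacity_gib * 20 // 100
```

For the 1200 GiB file system above, `max_read_cache_gib(1200)` returns `240`.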
FSx for Windows File Server
The L2 construct for the FSx for Windows File Server has not yet been implemented. To instantiate an FSx for Windows file system, the L1 constructs can be used as defined by CloudFormation.