[fsx] section - AWS ParallelCluster

Defines configuration settings for an attached Amazon FSx for Lustre file system. For more information about Amazon FSx for Lustre, see Amazon FSx CreateFileSystem.

Amazon FSx for Lustre is supported if the base_os is alinux, alinux2, centos7, ubuntu1604, or ubuntu1804.

When using Amazon Linux, the kernel must be version 4.14.104-78.84.amzn1.x86_64 or later. For detailed instructions, see Installing the Lustre client in the Amazon FSx for Lustre User Guide.

Note

Amazon FSx for Lustre is not currently supported when using awsbatch as a scheduler.

Note

Support for Amazon FSx for Lustre on alinux2, ubuntu1604, and ubuntu1804 was added in AWS ParallelCluster version 2.6.0. Support for Amazon FSx for Lustre on centos7 was added in AWS ParallelCluster version 2.4.0.

If you use an existing file system, it must be associated with a security group that allows inbound TCP traffic to port 988. Setting the source to 0.0.0.0/0 on a security group rule provides client access from all IP ranges within your VPC security group for the protocol and port range for that rule. To further limit access to your file systems, we recommend using more restrictive sources for your security group rules, such as more specific CIDR ranges, IP addresses, or security group IDs. When vpc_security_group_id is not used, this security group configuration is done automatically.

To use an existing Amazon FSx file system, specify fsx_fs_id.

The format is [fsx fsx-name]. fsx-name must start with a letter, contain no more than 30 characters, and only contain letters, numbers, hyphens (-), and underscores (_).

[fsx fs]
shared_dir = /fsx
fsx_fs_id = fs-073c3803dca3e28a6
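
The naming rule for fsx-name can be checked before a cluster is created. The following is a minimal sketch, not part of AWS ParallelCluster: the function name is_valid_fsx_name is hypothetical, and it assumes that "letters" means ASCII letters.

```python
import re

# Rule from the text: starts with a letter, has at most 30 characters,
# and contains only letters, numbers, hyphens (-), and underscores (_).
# Assumption: "letters" means ASCII letters.
FSX_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,29}")


def is_valid_fsx_name(name: str) -> bool:
    """Return True if name satisfies the [fsx fsx-name] naming rule."""
    return FSX_NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_fsx_name("fs") accepts the section name used above, while a name that starts with a digit is rejected.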

To create and configure a new file system, use the following parameters:

[fsx fs]
shared_dir = /fsx
storage_capacity = 3600
imported_file_chunk_size = 1024
export_path = s3://bucket/folder
import_path = s3://bucket
weekly_maintenance_start_time = 1:00:00

automatic_backup_retention_days

(Optional) Specifies the number of days to retain automatic backups. This is only valid for use with PERSISTENT_1 deployment types. When the automatic_backup_retention_days parameter is specified, the export_path, import_path, and imported_file_chunk_size parameters must not be specified. This corresponds to the AutomaticBackupRetentionDays property.

The default value is 0, which disables automatic backups. The possible values are integers between 0 and 35, inclusive.

automatic_backup_retention_days = 35

Note

Support for automatic_backup_retention_days was added in AWS ParallelCluster version 2.8.0.

Update policy: This setting can be changed during an update.

copy_tags_to_backups

(Optional) Specifies whether tags for the file system are copied to the backups. This is only valid for use with PERSISTENT_1 deployment types. When the copy_tags_to_backups parameter is specified, automatic_backup_retention_days must be specified with a value greater than 0, and the export_path, import_path, and imported_file_chunk_size parameters must not be specified. This corresponds to the CopyTagsToBackups property.

The default value is false.

copy_tags_to_backups = true

Note

Support for copy_tags_to_backups was added in AWS ParallelCluster version 2.8.0.

Update policy: If this setting is changed, the update is not allowed.
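
The constraints described for automatic_backup_retention_days and copy_tags_to_backups can be sketched as a small pre-flight check. This is an illustrative sketch only: check_backup_settings is a hypothetical helper, it takes the [fsx] settings as a plain dict, and it does not model the case where the file system is restored with fsx_backup_id.

```python
# Settings that the text says must not be combined with the backup settings.
S3_SETTINGS = ("export_path", "import_path", "imported_file_chunk_size")


def check_backup_settings(fsx):
    """Return a list of problems found in a dict of [fsx] settings."""
    errors = []
    has_retention = "automatic_backup_retention_days" in fsx
    has_copy_tags = "copy_tags_to_backups" in fsx
    if not (has_retention or has_copy_tags):
        return errors

    # deployment_type defaults to SCRATCH_1 when it is omitted.
    if fsx.get("deployment_type", "SCRATCH_1") != "PERSISTENT_1":
        errors.append("backup settings require deployment_type = PERSISTENT_1")

    conflicts = [key for key in S3_SETTINGS if key in fsx]
    if conflicts:
        errors.append("must not be combined with: " + ", ".join(conflicts))

    # The default retention is 0, which disables automatic backups.
    retention = int(fsx.get("automatic_backup_retention_days", 0))
    if not 0 <= retention <= 35:
        errors.append("automatic_backup_retention_days must be between 0 and 35")
    if has_copy_tags and retention <= 0:
        errors.append("copy_tags_to_backups requires automatic_backup_retention_days > 0")
    return errors
```

An empty list means that none of the documented restrictions were violated by the settings that the sketch inspects.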

daily_automatic_backup_start_time

(Optional) Specifies the time of day (UTC) to start automatic backups. This is only valid for use with PERSISTENT_1 deployment types. When the daily_automatic_backup_start_time parameter is specified, automatic_backup_retention_days must be specified with a value greater than 0, and the export_path, import_path, and imported_file_chunk_size parameters must not be specified. This corresponds to the DailyAutomaticBackupStartTime property.

The format is HH:MM, where HH is the zero-padded hour of the day (0-23), and MM is the zero-padded minute of the hour. For example, 1:03 A.M. UTC would be:

daily_automatic_backup_start_time = 01:03

The default value is a random time between 00:00 and 23:59.

Note

Support for daily_automatic_backup_start_time was added in AWS ParallelCluster version 2.8.0.

Update policy: This setting can be changed during an update.
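
The HH:MM format for daily_automatic_backup_start_time can be checked with a regular expression. The following is a minimal sketch; the function name is_valid_daily_start_time is hypothetical and is not part of AWS ParallelCluster.

```python
import re

# HH is the zero-padded hour (00-23) and MM is the zero-padded minute (00-59).
DAILY_TIME_PATTERN = re.compile(r"([01][0-9]|2[0-3]):[0-5][0-9]")


def is_valid_daily_start_time(value: str) -> bool:
    """Return True if value is a zero-padded HH:MM time of day."""
    return DAILY_TIME_PATTERN.fullmatch(value) is not None
```

With this check, 01:03 is accepted, while the unpadded form 1:03 is rejected.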

deployment_type

(Optional) Specifies the Amazon FSx for Lustre deployment type. This corresponds to the DeploymentType property. For more information, see Amazon FSx for Lustre deployment options in the Amazon FSx for Lustre User Guide. Choose a scratch deployment type for temporary storage and shorter-term processing of data. SCRATCH_2 is the latest generation of scratch file systems, and offers higher burst throughput over baseline throughput and also in-transit encryption of data.

The valid values are SCRATCH_1, SCRATCH_2, and PERSISTENT_1.

SCRATCH_1

The default deployment type for Amazon FSx for Lustre. With this deployment type, the storage_capacity setting has possible values of 1200, 2400, and any multiple of 3600. Support for SCRATCH_1 was added in AWS ParallelCluster version 2.4.0.

SCRATCH_2

The latest generation of scratch file systems that supports up to six times the baseline throughput for spiky workloads, and supports in-transit encryption of data for supported instance types in supported regions. For more information, see Encrypting data in transit in the Amazon FSx for Lustre User Guide. With this deployment type, the storage_capacity setting has possible values of 1200 and any multiple of 2400. Support for SCRATCH_2 was added in AWS ParallelCluster version 2.6.0.

PERSISTENT_1

Designed for longer-term storage. The file servers are highly available, and the data is replicated within the file system's AWS Availability Zone (AZ). This deployment type also supports in-transit encryption of data for supported instance types. With this deployment type, the storage_capacity setting has possible values of 1200 and any multiple of 2400. Support for PERSISTENT_1 was added in AWS ParallelCluster version 2.6.0.

The default value is SCRATCH_1.

deployment_type = SCRATCH_2

Note

Support for deployment_type was added in AWS ParallelCluster version 2.6.0.

Update policy: If this setting is changed, the update is not allowed.

export_path

(Optional) Specifies the Amazon S3 path where the root of your file system is exported. The path must be in the same Amazon S3 bucket as the import_path parameter. When the export_path parameter is specified, the automatic_backup_retention_days, copy_tags_to_backups, daily_automatic_backup_start_time, and fsx_backup_id parameters must not be specified. This corresponds to the ExportPath property. File data and metadata are not automatically exported to the export_path. For information on exporting data and metadata, see Using Data Repository Tasks to Export Data and Metadata Changes in the Amazon FSx for Lustre User Guide.

The default value is s3://import-bucket/FSxLustre[creation-timestamp], where import-bucket is the bucket provided in the import_path parameter.

export_path = s3://bucket/folder

Update policy: If this setting is changed, the update is not allowed.

fsx_backup_id

(Optional) Specifies the ID of the backup to use for restoring the file system from an existing backup. When the fsx_backup_id parameter is specified, the deployment_type, export_path, fsx_kms_key_id, import_path, imported_file_chunk_size, storage_capacity, and per_unit_storage_throughput parameters must not be specified. These parameters are read from the backup.

This corresponds to the BackupId property.

fsx_backup_id = backup-fedcba98

Note

Support for fsx_backup_id was added in AWS ParallelCluster version 2.8.0.

Update policy: If this setting is changed, the update is not allowed.
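
The list of settings that must not appear together with fsx_backup_id can be checked mechanically. The following is a minimal sketch; conflicts_with_backup_restore is a hypothetical helper that takes the [fsx] settings as a plain dict.

```python
# Settings that the text says must not be specified with fsx_backup_id,
# because their values are read from the backup.
BACKUP_RESTORE_CONFLICTS = (
    "deployment_type",
    "export_path",
    "fsx_kms_key_id",
    "import_path",
    "imported_file_chunk_size",
    "storage_capacity",
    "per_unit_storage_throughput",
)


def conflicts_with_backup_restore(fsx):
    """Return the settings that conflict with fsx_backup_id, if it is set."""
    if "fsx_backup_id" not in fsx:
        return []
    return [key for key in BACKUP_RESTORE_CONFLICTS if key in fsx]
```

For a section that sets only shared_dir and fsx_backup_id, the function returns an empty list.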

fsx_fs_id

(Optional) Attaches an existing Amazon FSx for Lustre file system.

If this option is specified, only the shared_dir and fsx_fs_id settings in the [fsx] section are used and any other settings in the [fsx] section are ignored.

fsx_fs_id = fs-073c3803dca3e28a6

Update policy: If this setting is changed, the update is not allowed.

fsx_kms_key_id

(Optional) Specifies the key ID of your AWS Key Management Service (AWS KMS) customer managed key.

This key is used to encrypt the data in your file system at rest.

This must be used with a custom ec2_iam_role. For more information, see Disk encryption with a custom KMS Key. This corresponds to the KmsKeyId parameter in the Amazon FSx API Reference.

fsx_kms_key_id = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Note

Support for fsx_kms_key_id was added in AWS ParallelCluster version 2.6.0.

Update policy: If this setting is changed, the update is not allowed.

import_path

(Optional) Specifies the S3 bucket from which data is loaded into the file system. This bucket also serves as the export bucket. For more information, see export_path. If you specify the import_path parameter, the automatic_backup_retention_days, copy_tags_to_backups, daily_automatic_backup_start_time, and fsx_backup_id parameters must not be specified. This corresponds to the ImportPath parameter in the Amazon FSx API Reference.

Import occurs on cluster creation. For more information, see Importing data from your data repository in the Amazon FSx for Lustre User Guide. On import, only file metadata (name, ownership, timestamp, and permissions) is imported. File data is not imported from the S3 bucket until the file is first accessed. For details on preloading the file contents, see Preloading files into your file system in the Amazon FSx for Lustre User Guide.

If a value is not provided, the file system is empty.

import_path = s3://bucket

Update policy: If this setting is changed, the update is not allowed.

imported_file_chunk_size

(Optional) Determines the stripe count and the maximum amount of data per file (in MiB) stored on a single physical disk for files that are imported from a data repository (using import_path). The maximum number of disks that a single file can be striped across is limited by the total number of disks that make up the file system. When the imported_file_chunk_size parameter is specified, the automatic_backup_retention_days, copy_tags_to_backups, daily_automatic_backup_start_time, and fsx_backup_id parameters must not be specified. This corresponds to the ImportedFileChunkSize property.

The default chunk size is 1024 MiB (1 GiB), and the maximum is 512,000 MiB (500 GiB). Amazon S3 objects have a maximum size of 5 TB.

imported_file_chunk_size = 1024

Update policy: If this setting is changed, the update is not allowed.
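
To illustrate how imported_file_chunk_size relates to striping, the following sketch estimates how many disks an imported file might be spread across. It is one plausible reading of the description above and not the actual Amazon FSx placement logic: the function name estimated_stripe_count and the total_disks argument are hypothetical, and the sketch assumes that a file occupies one disk per started chunk, capped by the number of disks in the file system.

```python
import math


def estimated_stripe_count(file_size_mib, chunk_size_mib=1024, total_disks=None):
    """Estimate the number of disks an imported file is striped across.

    Assumption: one disk per started chunk, limited by total_disks if given.
    """
    # The text gives a default of 1024 MiB and a maximum of 512,000 MiB.
    if chunk_size_mib <= 0 or chunk_size_mib > 512000:
        raise ValueError("chunk size must be between 1 and 512000 MiB")
    chunks = max(1, math.ceil(file_size_mib / chunk_size_mib))
    if total_disks is None:
        return chunks
    return min(chunks, total_disks)
```

Under these assumptions, a 4096 MiB file with the default chunk size is spread over four disks, unless the file system has fewer disks than that.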

per_unit_storage_throughput

(Required for PERSISTENT_1 deployment types) When deployment_type = PERSISTENT_1, specifies the amount of read and write throughput for each 1 tebibyte (TiB) of storage, in MB/s/TiB. File system throughput capacity is calculated by multiplying file system storage capacity (TiB) by the per_unit_storage_throughput (MB/s/TiB). For example, for a 2.4 TiB file system, provisioning 50 MB/s/TiB of per_unit_storage_throughput yields 120 MB/s of file system throughput. You pay for the amount of throughput that you provision. This corresponds to the PerUnitStorageThroughput property.

The possible values are 50, 100, and 200.

per_unit_storage_throughput = 200

Note

Support for per_unit_storage_throughput was added in AWS ParallelCluster version 2.6.0.

Update policy: If this setting is changed, the update is not allowed.
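
The throughput calculation described for per_unit_storage_throughput is a single multiplication. The following sketch reproduces it; the function name file_system_throughput is hypothetical.

```python
VALID_PER_UNIT_THROUGHPUT = (50, 100, 200)


def file_system_throughput(storage_capacity_tib, per_unit_storage_throughput):
    """Return the file system throughput in MB/s.

    Throughput = storage capacity (TiB) * per-unit throughput (MB/s/TiB).
    """
    if per_unit_storage_throughput not in VALID_PER_UNIT_THROUGHPUT:
        raise ValueError("per_unit_storage_throughput must be 50, 100, or 200")
    return storage_capacity_tib * per_unit_storage_throughput
```

Calling file_system_throughput(2.4, 50) gives approximately 120 MB/s, which matches the example in the parameter description.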

shared_dir

(Required) Defines the mount point for the Amazon FSx for Lustre file system on the head and compute nodes.

Do not use NONE or /NONE as the shared directory.

The following example mounts the file system at /fsx.

shared_dir = /fsx

Update policy: If this setting is changed, the update is not allowed.

storage_capacity

(Required) Specifies the storage capacity of the file system, in GiB. This corresponds to the StorageCapacity property.

The possible values for storage capacity vary based on the deployment_type setting.

SCRATCH_1

The possible values are 1200, 2400, and any multiple of 3600.

SCRATCH_2 and PERSISTENT_1

The possible values are 1200 and any multiple of 2400.

storage_capacity = 7200

Note

For AWS ParallelCluster versions 2.5.0 and 2.5.1, storage_capacity supported possible values of 1200, 2400, and any multiple of 3600. For versions earlier than AWS ParallelCluster version 2.5.0, storage_capacity had a minimum size of 3600.

Update policy: If this setting is changed, the update is not allowed.
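
The storage_capacity rules for each deployment type can be expressed as a short check. The following is a minimal sketch for the rules listed above; the function name is_valid_storage_capacity is hypothetical, and the sketch does not cover the older version-specific rules mentioned in the note.

```python
def is_valid_storage_capacity(capacity_gib, deployment_type="SCRATCH_1"):
    """Return True if capacity_gib (GiB) is allowed for the deployment type."""
    if capacity_gib <= 0:
        return False
    if deployment_type == "SCRATCH_1":
        # 1200, 2400, and any multiple of 3600.
        return capacity_gib in (1200, 2400) or capacity_gib % 3600 == 0
    if deployment_type in ("SCRATCH_2", "PERSISTENT_1"):
        # 1200 and any multiple of 2400.
        return capacity_gib == 1200 or capacity_gib % 2400 == 0
    raise ValueError("unknown deployment_type: " + str(deployment_type))
```

The value 7200 from the example above is valid for all three deployment types, because it is a multiple of both 3600 and 2400. The value 3600 is valid only for SCRATCH_1.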

weekly_maintenance_start_time

(Optional) Specifies a preferred time to perform weekly maintenance, in the UTC time zone. This corresponds to the WeeklyMaintenanceStartTime property.

The format is [day of week]:[hour of day]:[minute of hour]. For example, Monday at midnight is:

weekly_maintenance_start_time = 1:00:00

Update policy: This setting can be changed during an update.
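
The weekly_maintenance_start_time format can be parsed with a regular expression. The following sketch is illustrative only: parse_weekly_maintenance_start_time is a hypothetical helper, and it assumes that the day of the week runs from 1 (Monday) through 7 (Sunday) and that the hour and minute are zero-padded, which is consistent with the 1:00:00 example but is not spelled out in this section.

```python
import re

# Assumption: day is 1 (Monday) through 7 (Sunday); hour and minute are zero-padded.
WEEKLY_PATTERN = re.compile(r"([1-7]):([01][0-9]|2[0-3]):([0-5][0-9])")
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def parse_weekly_maintenance_start_time(value):
    """Return (day name, hour, minute) for a d:HH:MM value in UTC."""
    match = WEEKLY_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError("expected [day of week]:[hour of day]:[minute of hour]")
    day, hour, minute = (int(group) for group in match.groups())
    return DAY_NAMES[day - 1], hour, minute
```

Under these assumptions, parsing 1:00:00 yields Monday at hour 0 and minute 0.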