Troubleshooting
Use the following information to help you resolve issues that you might encounter when working with Amazon FSx for Lustre.
Topics
- File System Mount Fails Right Away
- File System Mount Hangs and Then Fails with Timeout Error
- Automatic Mounting Fails and the Instance Is Unresponsive
- File System Mount Using DNS Name Fails
- You can't access your file system
- Troubleshooting a misconfigured linked S3 bucket
- Cannot create a file system that is linked to an S3 bucket
File System Mount Fails Right Away
The file system mount command fails right away. The following code shows an example.
mount.lustre: mount fs-0123456789abcdef0.fsx.us-east-1.aws@tcp:/fsx at /lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
This error can occur if you aren't using the correct mountname value when mounting a persistent or scratch 2 file system by using the mount command. You can get the mountname value from the response of the describe-file-systems AWS CLI command or the DescribeFileSystems API operation.
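As a minimal sketch, the mountname value can be read from the describe-file-systems response with a query and then combined with the DNS name. The helper names get_mount_name and lustre_mount_source are hypothetical, and the file system ID is the placeholder used elsewhere on this page.

```shell
# Sketch only: the helper names are illustrative, not part of the AWS CLI.

# Print the MountName reported for the given file system ID.
get_mount_name() {
  aws fsx describe-file-systems \
    --file-system-ids "$1" \
    --query 'FileSystems[0].LustreConfiguration.MountName' \
    --output text
}

# Join a DNS name and a mount name into the source argument for mount.
lustre_mount_source() {
  printf '%s@tcp:/%s\n' "$1" "$2"
}

# Example usage (requires AWS credentials and the Lustre client):
#   mountname=$(get_mount_name fs-0123456789abcdef0)
#   sudo mount -t lustre "$(lustre_mount_source file_system_dns_name "$mountname")" /mnt/fsx
```

Reading the value from the API response avoids guessing the mount name when building the mount command.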
File System Mount Hangs and Then Fails with Timeout Error
The file system mount command hangs for a minute or two, and then fails with a timeout error.
The following code shows an example.
sudo mount -t lustre file_system_dns_name@tcp:/mountname /mnt/fsx
[2+ minute wait here]
Connection timed out
This error can occur because the security groups for the Amazon EC2 instance or the file system aren't configured properly.
Action to Take
Make sure that your security groups for the file system have the inbound rules specified in Amazon VPC Security Groups.
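To compare the current rules with the required ones, the inbound rules of a security group can be listed with the AWS CLI. This is a sketch: list_inbound_rules is a hypothetical helper and the security group ID is a placeholder. Lustre traffic conventionally uses TCP port 988, but treat that as an assumption and take the authoritative list of ports from Amazon VPC Security Groups.

```shell
# Sketch only: list_inbound_rules is an illustrative helper name.

# Print protocol, first port, and last port for each inbound rule.
list_inbound_rules() {
  aws ec2 describe-security-groups \
    --group-ids "$1" \
    --query 'SecurityGroups[0].IpPermissions[].[IpProtocol,FromPort,ToPort]' \
    --output text
}

# Example usage (the group ID is a placeholder):
#   list_inbound_rules sg-0123456789abcdef0
```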
Automatic Mounting Fails and the Instance Is Unresponsive
In some cases, automatic mounting might fail for a file system and your Amazon EC2 instance might stop responding.
This issue can occur if the _netdev option wasn't declared. If _netdev is missing, your Amazon EC2 instance can stop responding. This happens because network file systems need to be initialized after the compute instance starts its networking.
Action to Take
If this issue occurs, contact AWS Support.
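To check ahead of time whether an entry is missing the option, a sketch like the following can scan an fstab-style file. It assumes the automatic mount is configured through an /etc/fstab entry; lustre_entries_missing_netdev is a hypothetical helper, and the options other than _netdev in the sample entry are illustrative.

```shell
# Sketch only: the helper name is illustrative.

# Print the source of every Lustre entry whose options lack _netdev.
# Field 3 is the file system type and field 4 is the option list.
lustre_entries_missing_netdev() {
  awk '$1 !~ /^#/ && $3 == "lustre" && $4 !~ /(^|,)_netdev(,|$)/ { print $1 }' "$1"
}

# Example usage:
#   lustre_entries_missing_netdev /etc/fstab
#
# An entry that declares the option looks like this:
#   file_system_dns_name@tcp:/mountname /mnt/fsx lustre defaults,noatime,flock,_netdev 0 0
```

An empty result means every Lustre entry in the file declares _netdev.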
File System Mount Using DNS Name Fails
A file system mount that is using a Domain Name Service (DNS) name fails. The following code shows an example.
sudo mount -t lustre file_system_dns_name@tcp:/mountname /mnt/fsx
mount.lustre: Can't parse NID 'file_system_dns_name@tcp:/mountname'
Action to Take
Check your virtual private cloud (VPC) configuration. If you are using a custom VPC, make sure that DNS settings are enabled. For more information, see Using DNS with Your VPC in the Amazon VPC User Guide.
To specify a DNS name in the mount command, do the following:
- Ensure that the Amazon EC2 instance is in the same VPC as your Amazon FSx for Lustre file system.
- Connect your Amazon EC2 instance inside a VPC configured to use the DNS server provided by Amazon. For more information, see DHCP Options Sets in the Amazon VPC User Guide.
- Ensure that the Amazon VPC of the connecting Amazon EC2 instance has DNS host names enabled. For more information, see Updating DNS Support for Your VPC in the Amazon VPC User Guide.
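The VPC DNS settings mentioned above can be inspected with the AWS CLI. The following is a sketch: vpc_dns_support and vpc_dns_hostnames are hypothetical helper names, and the VPC ID is a placeholder.

```shell
# Sketch only: the helper names are illustrative.

# Print whether DNS resolution is enabled for the VPC.
vpc_dns_support() {
  aws ec2 describe-vpc-attribute \
    --vpc-id "$1" \
    --attribute enableDnsSupport \
    --query 'EnableDnsSupport.Value' \
    --output text
}

# Print whether DNS host names are enabled for the VPC.
vpc_dns_hostnames() {
  aws ec2 describe-vpc-attribute \
    --vpc-id "$1" \
    --attribute enableDnsHostnames \
    --query 'EnableDnsHostnames.Value' \
    --output text
}

# Example usage (the VPC ID is a placeholder):
#   vpc_dns_support vpc-0123456789abcdef0
#   vpc_dns_hostnames vpc-0123456789abcdef0
```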
A file system mount that is using a DNS name can also fail with an input/output error. The following code shows an example.
mount -t lustre file_system_dns_name@tcp:/mountname /mnt/fsx
mount.lustre: mount file_system_dns_name@tcp:/mountname at /mnt/fsx failed: Input/output error
Is the MGS running?
Action to Take
Make sure that the client's VPC security groups have the correct outbound traffic rules applied. This recommendation holds true especially if you aren't using the default security group, or if you have modified the default security group. For more information, see Amazon VPC Security Groups.
You can't access your file system
There are a number of potential causes for being unable to access your file system, each with its own resolution, as follows.
The Elastic IP address attached to the file system elastic network interface was deleted
Amazon FSx doesn't support accessing file systems from the public Internet. Amazon FSx automatically detaches any Elastic IP address, which is a public IP address reachable from the Internet, that gets attached to a file system's elastic network interface.
The file system elastic network interface was modified or deleted
You must not modify or delete the file system's elastic network interface. Modifying or deleting the network interface can cause a permanent loss of connection between your VPC and your file system. Create a new file system, and do not modify or delete the FSx elastic network interface. For more information, see File System Access Control with Amazon VPC.
Troubleshooting a misconfigured linked S3 bucket
In some cases, an Amazon FSx for Lustre file system's linked S3 bucket might have a misconfigured data repository lifecycle state. For more information, see Data repository lifecycle state. A linked data repository can have a misconfigured lifecycle state under the following conditions:
Possible Cause
This error can occur if Amazon FSx does not have the necessary AWS Identity and Access Management (IAM) permissions that are required to access the linked data repository. The required IAM permissions support the Amazon FSx for Lustre service-linked role that is used to access the specified Amazon S3 bucket on your behalf.
Action to Take
- Ensure that your IAM entity (user, group, or role) has the appropriate permissions to create file systems. Doing this includes adding the permissions policy that supports the Amazon FSx for Lustre service-linked role. For more information, see Adding Permissions to Use Data Repositories in Amazon S3.
- Using the Amazon FSx CLI or API, refresh the file system's AutoImportPolicy with the update-file-system CLI command (UpdateFileSystem is the equivalent API action), as follows.
aws fsx update-file-system \
    --file-system-id fs-0123456789abcdef0 \
    --lustre-configuration AutoImportPolicy=the_existing_AutoImportPolicy
For more information about service-linked roles, see Using Service-Linked Roles for Amazon FSx for Lustre.
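The refresh step needs the file system's existing AutoImportPolicy value. As a sketch, that value can be read from the describe-file-systems response and passed back unchanged. The helper names get_auto_import_policy and refresh_auto_import_policy are hypothetical, and the file system ID is a placeholder.

```shell
# Sketch only: the helper names are illustrative.

# Print the AutoImportPolicy currently set on the file system.
get_auto_import_policy() {
  aws fsx describe-file-systems \
    --file-system-ids "$1" \
    --query 'FileSystems[0].LustreConfiguration.DataRepositoryConfiguration.AutoImportPolicy' \
    --output text
}

# Reapply the existing policy so that the value itself does not change.
refresh_auto_import_policy() {
  policy=$(get_auto_import_policy "$1") || return 1
  aws fsx update-file-system \
    --file-system-id "$1" \
    --lustre-configuration "AutoImportPolicy=$policy"
}

# Example usage (the file system ID is a placeholder):
#   refresh_auto_import_policy fs-0123456789abcdef0
```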
Possible Cause
This error can occur if the linked Amazon S3 data repository has an existing event notification configuration with event types that overlap with the Amazon FSx event notification configuration (s3:ObjectCreated:*, s3:ObjectRemoved:*).
This can also occur if the Amazon FSx event notification configuration on the linked S3 bucket was deleted or modified.
Action to Take
- Remove any existing event notification on the linked S3 bucket that uses either or both of the event types that the FSx event configuration uses, s3:ObjectCreated:* and s3:ObjectRemoved:*.
- Ensure that there is an S3 event notification configuration in your linked S3 bucket with the name FSx and the event types s3:ObjectCreated:* and s3:ObjectRemoved:*, and that it sends to the SNS topic with ARN: topic_arn_returned_in_API_response.
- Reapply the FSx event notification configuration on the S3 bucket by using the Amazon FSx CLI or API to refresh the file system's AutoImportPolicy. Do so with the update-file-system CLI command (UpdateFileSystem is the equivalent API action), as follows.
aws fsx update-file-system \
    --file-system-id fs-0123456789abcdef0 \
    --lustre-configuration AutoImportPolicy=the_existing_AutoImportPolicy
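To see which event notifications are currently configured on the linked bucket, the bucket's notification configuration can be retrieved with the AWS CLI. This sketch assumes the configuration name appears as the Id field of a topic configuration. The helper name fsx_notification_config is hypothetical and the bucket name is a placeholder.

```shell
# Sketch only: the helper name is illustrative.

# Print the topic configurations on the bucket whose Id is FSx.
fsx_notification_config() {
  aws s3api get-bucket-notification-configuration \
    --bucket "$1" \
    --query "TopicConfigurations[?Id=='FSx']"
}

# Example usage (the bucket name is a placeholder):
#   fsx_notification_config my-linked-bucket
```

Running the same command without the --query option returns every notification configuration on the bucket, which helps when looking for overlapping event types.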
Cannot create a file system that is linked to an S3 bucket
Creating a new file system that is linked to an S3 bucket fails with an error message similar to the following.
User: arn:aws:iam::012345678901:user/username is not authorized to perform: iam:PutRolePolicy on resource: resource ARN
This error can happen if you try to create a file system linked to an Amazon S3 bucket without the necessary IAM permissions. The required IAM permissions support the Amazon FSx for Lustre service-linked role that is used to access the specified Amazon S3 bucket on your behalf.
Action to Take
Ensure that your IAM entity (user, group, or role) has the appropriate permissions to create file systems. Doing this includes adding the permissions policy that supports the Amazon FSx for Lustre service-linked role. For more information, see Adding Permissions to Use Data Repositories in Amazon S3.
For more information about service-linked roles, see Using Service-Linked Roles for Amazon FSx for Lustre.
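One way to check whether an IAM entity is allowed the action named in the error message is the IAM policy simulator. This is a sketch: simulate_put_role_policy is a hypothetical helper, the ARN is the placeholder from the error message above, and the caller needs permission to run the simulation. It tests only iam:PutRolePolicy, so it does not confirm the full set of permissions described in Adding Permissions to Use Data Repositories in Amazon S3.

```shell
# Sketch only: the helper name is illustrative.

# Print the simulated decision for iam:PutRolePolicy for the given principal ARN.
simulate_put_role_policy() {
  aws iam simulate-principal-policy \
    --policy-source-arn "$1" \
    --action-names iam:PutRolePolicy \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
    --output text
}

# Example usage (the ARN is a placeholder):
#   simulate_put_role_policy arn:aws:iam::012345678901:user/username
```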