Elastic Fabric Adapter - AWS ParallelCluster

Elastic Fabric Adapter

Elastic Fabric Adapter (EFA) is a network device that has OS-bypass capabilities for low-latency network communications with other instances on the same subnet. EFA is exposed through Libfabric, and applications that use the Message Passing Interface (MPI) can use it.

To use EFA with AWS ParallelCluster and a Slurm scheduler, set SlurmQueues / ComputeResources / Efa / Enabled to true.
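
As an illustration, the setting path above can be pictured as nested data. The following is a minimal sketch, written as a Python dict that mirrors the structure of a cluster configuration. The queue names, compute resource names, instance types, and the helper efa_enabled_resources are assumptions made for illustration and aren't part of AWS ParallelCluster.

```python
# Hypothetical cluster configuration fragment, mirrored as a Python dict.
# Names and instance types are placeholders; check that the instance type
# you choose appears in the list of EFA-supported instance types.
cluster_config = {
    "Scheduling": {
        "Scheduler": "slurm",
        "SlurmQueues": [
            {
                "Name": "queue1",
                "ComputeResources": [
                    {
                        "Name": "compute1",
                        "InstanceType": "c5n.18xlarge",
                        "Efa": {"Enabled": True},
                    }
                ],
            },
            {
                "Name": "queue2",
                "ComputeResources": [
                    # No Efa section: treated as not enabled in this sketch.
                    {"Name": "compute2", "InstanceType": "c5.xlarge"}
                ],
            },
        ],
    }
}


def efa_enabled_resources(config):
    """Return (queue name, compute resource name) pairs with Efa/Enabled set to true."""
    pairs = []
    for queue in config.get("Scheduling", {}).get("SlurmQueues", []):
        for resource in queue.get("ComputeResources", []):
            if resource.get("Efa", {}).get("Enabled") is True:
                pairs.append((queue["Name"], resource["Name"]))
    return pairs


print(efa_enabled_resources(cluster_config))
```

In this sketch only compute1 in queue1 is reported, because it is the only compute resource whose Efa / Enabled value is true.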

To view the list of EC2 instance types that support EFA, see Supported instance types in the Amazon EC2 User Guide for Linux Instances.

We recommend that you run your EFA-enabled instances in a placement group. This way, the instances launch into a low-latency group in a single Availability Zone. For more information about how to configure placement groups with AWS ParallelCluster, see SlurmQueues / Networking / PlacementGroup.

For more information, see Elastic Fabric Adapter in the Amazon EC2 User Guide for Linux Instances and Scale HPC workloads with elastic fabric adapter and AWS ParallelCluster in the AWS Open Source Blog.

Note

Elastic Fabric Adapter (EFA) isn't supported across different Availability Zones. For more information, see Scheduling / SlurmQueues / Networking / SubnetIds.
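
The placement group recommendation and the Availability Zone restriction can be sketched as a pre-flight check over a single queue definition. This is a minimal sketch under stated assumptions: the queue is a Python dict that mirrors the configuration keys named above, the function name check_efa_queue is hypothetical, and only PlacementGroup / Enabled is inspected. Because the sketch doesn't look up which Availability Zone each subnet belongs to, it treats more than one entry in SubnetIds as a possible multi-AZ setup that is worth reviewing.

```python
def check_efa_queue(queue):
    """Return a list of warnings for one SlurmQueues entry (a dict)."""
    warnings = []
    uses_efa = any(
        resource.get("Efa", {}).get("Enabled") is True
        for resource in queue.get("ComputeResources", [])
    )
    if not uses_efa:
        return warnings  # Nothing to check for queues without EFA.

    networking = queue.get("Networking", {})
    # Assumption: only the Enabled flag is checked in this sketch.
    if networking.get("PlacementGroup", {}).get("Enabled") is not True:
        warnings.append("EFA is enabled but PlacementGroup is not enabled")
    # Heuristic: several subnets may span several Availability Zones.
    if len(networking.get("SubnetIds", [])) > 1:
        warnings.append("EFA is enabled but SubnetIds lists more than one subnet")
    return warnings


# Placeholder queue: the subnet IDs are made up for illustration.
queue = {
    "Name": "queue1",
    "ComputeResources": [{"Name": "compute1", "Efa": {"Enabled": True}}],
    "Networking": {"SubnetIds": ["subnet-aaaa", "subnet-bbbb"]},
}
for warning in check_efa_queue(queue):
    print(warning)
```

A queue that enables EFA, enables the placement group, and lists a single subnet produces no warnings in this sketch.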

Note

By default, Ubuntu distributions enable ptrace (process trace) protection. AWS ParallelCluster disables ptrace protection so that Libfabric works properly. For more information, see Disable ptrace protection in the Amazon EC2 User Guide for Linux Instances.