Elastic Fabric Adapter
An Elastic Fabric Adapter (EFA) is a network device to accelerate High Performance Computing (HPC) applications. AWS Batch supports applications that use EFA if the following conditions are met.
-
Compute environment contains only supported instance types (
c5n.18xlarge
,c5n.metal
,i3en.24xlarge
,m5dn.24xlarge
,m5n.24xlarge
,r5dn.24xlarge
,r5n.24xlarge
, andp3dn.24xlarge
). -
The OS in the AMI supports EFA: Amazon Linux, Amazon Linux 2, Red Hat Enterprise Linux 7.6, CentOS 7.6, Ubuntu 16.04, Ubuntu 18.04.
-
The AMI has the EFA driver loaded.
-
The security group for the EFA must allows all inbound and outbound traffic to and from the security group itself.
-
All instances that use an EFA should be in the same cluster placement group.
-
The job definition must include a
devices
member withhostPath
set to/dev/infiniband/uverbs0
to allow the EFA device to be passed through to the container. IfcontainerPath
is specified it must also be set to/dev/infiniband/uverbs0
. Ifpermissions
is set it must be set toREAD
|WRITE
|MKNOD
.The location of the LinuxParameters member will be different for multi-node parallel jobs and single-node container jobs. The examples below demonstrate the differences but are missing required values.
Example for multi-node parallel job
{ "jobDefinitionName": "EFA-MNP-JobDef", "type": "multinode", "nodeProperties": { ... "nodeRangeProperties": [ { ... "container": { ... "linuxParameters": { "devices": [ { "hostPath": "/dev/infiniband/uverbs0", "containerPath": "/dev/infiniband/uverbs0", "permissions": [ "READ", "WRITE", "MKNOD" ] }, ], }, }, }, ], }, }
Example for single-node container job
{ "jobDefinitionName": "EFA-Container-JobDef", "type": "container", ... "containerProperties": { ... "linuxParameters": { "devices": [ { "hostPath": "/dev/infiniband/uverbs0", }, ], }, }, }
For more information about EFA, see Elastic Fabric Adapter in Amazon EC2 User Guide for Linux Instances.