Improve network performance between EC2 instances with ENA Express

ENA Express is powered by AWS Scalable Reliable Datagram (SRD) technology. SRD is a high performance network transport protocol that uses dynamic routing to increase throughput and minimize tail latency. With ENA Express, you can communicate between two EC2 instances in the same Availability Zone.

Benefits of ENA Express
  • Increases the maximum bandwidth a single flow can use from 5 Gbps to 25 Gbps within the Availability Zone, up to the aggregate instance limit.

  • Reduces tail latency of network traffic between EC2 instances, especially during periods of high network load.

  • Detects and avoids congested network paths.

  • Handles some tasks directly in the network layer, such as packet reordering on the receiving end, and most retransmits that are needed. This frees up the application layer for other work.

Note
  • If your application sends or receives a high volume of packets per second and needs to optimize primarily for latency, especially during periods when there is no congestion on the network, standard ENA transmission without ENA Express might be a better fit for your network.

  • ENA Express traffic can't be sent across subnets in a Local Zone.

After you've enabled ENA Express for the network interface attachment on an instance, the sending instance initiates communication with the receiving instance, and SRD detects whether ENA Express is operating on both the sending and the receiving instance. If it is, the communication can use SRD transmission. If it is not, the communication falls back to standard ENA transmission.
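
If you use the AWS CLI, you can enable ENA Express on an existing network interface attachment with the modify-network-interface-attribute command. The following is a minimal sketch; the network interface ID is a hypothetical placeholder, and you should verify the shape of the --ena-srd-specification parameter against the current AWS CLI reference.

    # Enable ENA Express, including ENA Express for UDP, on an attachment.
    # eni-0123456789abcdef0 is a hypothetical network interface ID.
    aws ec2 modify-network-interface-attribute \
        --network-interface-id eni-0123456789abcdef0 \
        --ena-srd-specification 'EnaSrdEnabled=true,EnaSrdUdpSpecification={EnaSrdUdpEnabled=true}'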

During periods when network traffic is light, you might notice a slight increase in packet latency (tens of microseconds) for packets that use ENA Express. Even so, applications that prioritize specific network performance characteristics can benefit from ENA Express as follows:

  • Processes can benefit from increased maximum single flow bandwidth from 5 Gbps to 25 Gbps within the same Availability Zone, up to the aggregate instance limit. For example, if a specific instance type supports up to 12.5 Gbps, the single flow bandwidth is also limited to 12.5 Gbps.

  • Longer running processes should experience reduced tail latency during periods of network congestion.

  • Processes can benefit from a smoother, more consistent distribution of network response times.

How ENA Express works

ENA Express is powered by AWS Scalable Reliable Datagram (SRD) technology. It distributes packets for each network flow across different AWS network paths, and dynamically adjusts distribution when it detects signs of congestion. It also manages packet reordering on the receiving end.

To ensure that ENA Express can manage network traffic as intended, sending and receiving instances and the communication between them must meet all of the following requirements:

  • Both sending and receiving instance types are supported. See the Supported instance types for ENA Express table for more information.

  • Both sending and receiving instances must have ENA Express configured. If there are differences in the configuration, you can run into situations where traffic defaults to standard ENA transmission. The following scenario shows what can happen.

    Scenario: Differences in configuration

    Instance     ENA Express enabled   UDP uses ENA Express
    Instance 1   Yes                   Yes
    Instance 2   Yes                   No

    In this case, TCP traffic between the two instances can use ENA Express, as both instances have enabled it. However, since one of the instances does not use ENA Express for UDP traffic, communication between these two instances over UDP uses standard ENA transmission.

  • The sending and receiving instances must run in the same Availability Zone.

  • The network path between the instances must not include middleware boxes. ENA Express doesn't currently support middleware boxes.

  • (Linux instances only) To achieve the full bandwidth potential, use ENA driver version 2.2.9 or later.

  • (Linux instances only) To produce ENA Express metrics, use ENA driver version 2.8 or later.

If any of these requirements isn't met, the instances communicate using standard TCP or UDP protocols, without SRD.
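
To confirm that both instances are configured consistently, and avoid the mismatch scenario shown above, you can inspect the ENA Express settings on each instance's network interface attachment. The following AWS CLI sketch uses a hypothetical network interface ID; verify the EnaSrdSpecification output field against the current EC2 API reference.

    # Show the ENA Express (SRD) settings for a network interface attachment.
    aws ec2 describe-network-interfaces \
        --network-interface-ids eni-0123456789abcdef0 \
        --query 'NetworkInterfaces[].Attachment.EnaSrdSpecification'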

To ensure that your instance network driver is configured for optimum performance, review the recommended best practices for ENA drivers. These best practices apply to ENA Express, as well. For more information, see the ENA Linux Driver Best Practices and Performance Optimization Guide on the GitHub website.
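
To confirm the driver version requirements listed above, you can query the ENA driver on the instance, for example:

    # Report the driver name and version for the interface (assumes eth0).
    ethtool -i eth0

    # Alternatively, inspect the kernel module metadata directly.
    modinfo ena | grep '^version'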

Note

Amazon EC2 refers to the relationship between an instance and a network interface that's attached to it as an attachment. ENA Express settings apply to the attachment. If the network interface is detached from the instance, the attachment no longer exists, and the ENA Express settings that applied to it are no longer in force. The same is true when an instance is terminated, even if the network interface remains.

After you've enabled ENA Express for the network interface attachments on both the sending instance and the receiving instance, you can use ENA Express metrics to help ensure that your instances take full advantage of the performance improvements that SRD technology provides. For more information about ENA Express metrics, see Metrics for ENA Express.
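
On a Linux instance with driver version 2.8 or later, you can read these metrics from the driver statistics. The following is a minimal sketch, assuming the interface is eth0 and that your driver exposes the ena_srd counters described in Metrics for ENA Express:

    # List SRD-related counters, such as eligible versus actual SRD packets.
    ethtool -S eth0 | grep ena_srd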

Supported instance types for ENA Express

The following tables list the instance types that support ENA Express.

General purpose
Instance type Architecture
m6a.12xlarge x86_64
m6a.16xlarge x86_64
m6a.24xlarge x86_64
m6a.32xlarge x86_64
m6a.48xlarge x86_64
m6a.metal x86_64
m6i.8xlarge x86_64
m6i.12xlarge x86_64
m6i.16xlarge x86_64
m6i.24xlarge x86_64
m6i.32xlarge x86_64
m6i.metal x86_64
m6id.8xlarge x86_64
m6id.12xlarge x86_64
m6id.16xlarge x86_64
m6id.24xlarge x86_64
m6id.32xlarge x86_64
m6id.metal x86_64
m7g.12xlarge arm64
m7g.16xlarge arm64
m7g.metal arm64
m7gd.12xlarge arm64
m7gd.16xlarge arm64
m7gd.metal arm64
m7i.12xlarge x86_64
m7i.16xlarge x86_64
m7i.24xlarge x86_64
m7i.48xlarge x86_64
m7i.metal-24xl x86_64
m7i.metal-48xl x86_64
Compute optimized
Instance type Architecture
c6a.12xlarge x86_64
c6a.16xlarge x86_64
c6a.24xlarge x86_64
c6a.32xlarge x86_64
c6a.48xlarge x86_64
c6a.metal x86_64
c6gn.16xlarge arm64
c6i.8xlarge x86_64
c6i.12xlarge x86_64
c6i.16xlarge x86_64
c6i.24xlarge x86_64
c6i.32xlarge x86_64
c6i.metal x86_64
c6id.8xlarge x86_64
c6id.12xlarge x86_64
c6id.16xlarge x86_64
c6id.24xlarge x86_64
c6id.32xlarge x86_64
c6id.metal x86_64
c7g.12xlarge arm64
c7g.16xlarge arm64
c7g.metal arm64
c7gd.12xlarge arm64
c7gd.16xlarge arm64
c7gd.metal arm64
c7i.12xlarge x86_64
c7i.16xlarge x86_64
c7i.24xlarge x86_64
c7i.48xlarge x86_64
c7i.metal-24xl x86_64
c7i.metal-48xl x86_64
Memory optimized
Instance type Architecture
r6a.12xlarge x86_64
r6a.16xlarge x86_64
r6a.24xlarge x86_64
r6a.32xlarge x86_64
r6a.48xlarge x86_64
r6a.metal x86_64
r6i.8xlarge x86_64
r6i.12xlarge x86_64
r6i.16xlarge x86_64
r6i.24xlarge x86_64
r6i.32xlarge x86_64
r6i.metal x86_64
r6id.8xlarge x86_64
r6id.12xlarge x86_64
r6id.16xlarge x86_64
r6id.24xlarge x86_64
r6id.32xlarge x86_64
r6id.metal x86_64
r7g.12xlarge arm64
r7g.16xlarge arm64
r7g.metal arm64
r7gd.12xlarge arm64
r7gd.16xlarge arm64
r7gd.metal arm64
r7i.12xlarge x86_64
r7i.16xlarge x86_64
r7i.24xlarge x86_64
r7i.48xlarge x86_64
r7i.metal-24xl x86_64
r7i.metal-48xl x86_64
r8g.12xlarge arm64
r8g.16xlarge arm64
r8g.24xlarge arm64
r8g.48xlarge arm64
r8g.metal-24xl arm64
r8g.metal-48xl arm64
u7i-12tb.224xlarge x86_64
u7in-16tb.224xlarge x86_64
u7in-24tb.224xlarge x86_64
u7in-32tb.224xlarge x86_64
x2idn.16xlarge x86_64
x2idn.24xlarge x86_64
x2idn.32xlarge x86_64
x2idn.metal x86_64
x2iedn.8xlarge x86_64
x2iedn.16xlarge x86_64
x2iedn.24xlarge x86_64
x2iedn.32xlarge x86_64
x2iedn.metal x86_64
Accelerated computing
Instance type Architecture
g6.48xlarge x86_64
Storage optimized
Instance type Architecture
i4g.4xlarge arm64
i4g.8xlarge arm64
i4g.16xlarge arm64
i4i.8xlarge x86_64
i4i.12xlarge x86_64
i4i.16xlarge x86_64
i4i.24xlarge x86_64
i4i.32xlarge x86_64
i4i.metal x86_64
im4gn.4xlarge arm64
im4gn.8xlarge arm64
im4gn.16xlarge arm64

Prerequisites for Linux instances

To ensure that ENA Express can operate effectively, update the settings for your Linux instance as follows.

  • If your instance uses jumbo frames, run the following command to set your maximum transmission unit (MTU) to 8900.

    [ec2-user ~]$ sudo ip link set dev eth0 mtu 8900
  • Increase the receiver (Rx) ring size as follows, where device is the name of your network interface (for example, eth0):

    [ec2-user ~]$ sudo ethtool -G device rx 8192
  • To maximize ENA Express bandwidth, configure your TCP queue limits as follows:

    1. Set the TCP small queue limit to 1 MB or higher. This increases the amount of data that's queued for transmission on a socket.

      sudo sh -c 'echo 1048576 > /proc/sys/net/ipv4/tcp_limit_output_bytes'
    2. Disable byte queue limits on the eth device if they're enabled for your Linux distribution. This increases data queued for transmission for the device queue.

      sudo sh -c 'for txq in /sys/class/net/eth0/queues/tx-*; do echo max > ${txq}/byte_queue_limits/limit_min; done'
      Note

      The ENA driver for the Amazon Linux distribution disables byte queue limits by default.
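
Taken together, the prerequisites in this section can be applied with a short script such as the following sketch. It assumes that the ENA interface is eth0 and that jumbo frames are in use; these settings don't persist across reboots, so reapply or automate them as needed.

    #!/bin/bash
    # Sketch: apply the ENA Express prerequisites described above in one pass.
    # Assumes the ENA interface is eth0 (adjust IFACE for your instance).
    IFACE=eth0

    # Set the MTU to 8900 so that jumbo-frame traffic remains eligible for SRD.
    sudo ip link set dev "$IFACE" mtu 8900

    # Increase the Rx ring size.
    sudo ethtool -G "$IFACE" rx 8192

    # Raise the TCP small queue limit to 1 MB.
    sudo sh -c 'echo 1048576 > /proc/sys/net/ipv4/tcp_limit_output_bytes'

    # Disable byte queue limits on each Tx queue of the device.
    sudo sh -c "for txq in /sys/class/net/$IFACE/queues/tx-*; do echo max > \${txq}/byte_queue_limits/limit_min; done"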

Tune performance for ENA Express settings on Linux instances

To check your Linux instance configuration for optimal ENA Express performance, you can run the following script from the Amazon GitHub repository:

https://github.com/amzn/amzn-ec2-ena-utilities/blob/main/ena-express/check-ena-express-settings.sh

The script runs a series of tests and suggests both recommended and required configuration changes.
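
As a sketch, you might download and run the script on the instance as follows; check the script's header in the repository for any arguments or options it expects.

    # Fetch the check script from the repository and run it with root privileges.
    curl -O https://raw.githubusercontent.com/amzn/amzn-ec2-ena-utilities/main/ena-express/check-ena-express-settings.sh
    chmod +x check-ena-express-settings.sh
    sudo ./check-ena-express-settings.sh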