Amazon Elastic Inference
Developer Guide


Amazon Elastic Inference Basics

When you configure an Amazon EC2 instance to launch with an Elastic Inference accelerator, AWS finds available accelerator capacity. It then establishes a network connection between your instance and the accelerator.

The following Elastic Inference accelerator types are available. You can attach any Elastic Inference accelerator type to any Amazon EC2 instance type.

Accelerator Type   FP32 Throughput (TFLOPS)   FP16 Throughput (TFLOPS)   Memory (GB)
eia2.medium        1                          8                          2
eia2.large         2                          16                         4
eia2.xlarge        4                          32                         8
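
For example, you can request an accelerator at launch time with the AWS SDK for Python (Boto3) by passing an accelerator type from the table above. The following is a minimal sketch, not a complete recipe: the AMI, subnet, and security group IDs are placeholders, and it assumes the instance role and AWS PrivateLink prerequisites described below are already in place.

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Launch a c5.large instance with one eia2.medium accelerator attached.
# ImageId, SubnetId, and SecurityGroupIds are placeholder values.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",        # for example, an AWS Deep Learning AMI
    InstanceType="c5.large",
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",
    SecurityGroupIds=["sg-0123456789abcdef0"],
    ElasticInferenceAccelerators=[{"Type": "eia2.medium", "Count": 1}],
)
print(response["Instances"][0]["InstanceId"])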

An Elastic Inference accelerator is not part of the hardware that makes up your instance. Instead, the accelerator is attached through the network using an AWS PrivateLink endpoint service. The endpoint service routes traffic from your instance to the Elastic Inference accelerator configured with your instance.

Before you launch an instance with an Elastic Inference accelerator, you must create an AWS PrivateLink endpoint service. You need only one endpoint service in each Availability Zone to connect instances with Elastic Inference accelerators. For more information, see VPC Endpoint Services (AWS PrivateLink).
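
As a sketch, you can create the interface VPC endpoint for the Elastic Inference runtime service with the AWS SDK for Python (Boto3). The VPC, subnet, and security group IDs below are placeholders, and the service name shown assumes the us-west-2 Region.

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Create an interface VPC endpoint for the Elastic Inference runtime service.
# VpcId, SubnetIds, and SecurityGroupIds are placeholder values.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-west-2.elastic-inference.runtime",
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,
)
print(response["VpcEndpoint"]["VpcEndpointId"])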


     
Figure: An Elastic Inference accelerator attached to an EC2 instance.

You can use Amazon Elastic Inference enabled TensorFlow, TensorFlow Serving, or Apache MXNet libraries to load models and make inference calls. The modified versions of these frameworks automatically detect the presence of Elastic Inference accelerators. They then optimally distribute the model operations between the Elastic Inference accelerator and the CPU of the instance. The AWS Deep Learning AMIs include the latest releases of Amazon Elastic Inference enabled TensorFlow Serving and MXNet. If you are using custom AMIs or container images, you can download and install the required TensorFlow and Apache MXNet libraries from Amazon S3.
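
For example, the Elastic Inference enabled Apache MXNet build exposes the accelerator as the mx.eia() device context, which you use in place of mx.cpu() or mx.gpu() when binding a model. The following is a minimal sketch that assumes an EI-enabled MXNet installation and a pre-trained checkpoint named resnet-152 in the working directory.

import mxnet as mx
import numpy as np

# mx.eia() is the device context provided by the Elastic Inference
# enabled MXNet build; supported operators run on the accelerator.
ctx = mx.eia()

# Load a pre-trained checkpoint (resnet-152-symbol.json / resnet-152-0000.params).
sym, arg_params, aux_params = mx.model.load_checkpoint("resnet-152", 0)

mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
mod.bind(for_training=False, data_shapes=[("data", (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params, allow_missing=True)

# Run a single inference call on dummy input data.
data = mx.nd.array(np.random.uniform(size=(1, 3, 224, 224)))
mod.forward(mx.io.DataBatch([data]), is_train=False)
print(mod.get_outputs()[0].shape)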

Note

An Elastic Inference accelerator is not visible or accessible through the management console of your instance.