What Is Amazon Elastic Inference?
Amazon Elastic Inference (Elastic Inference) is a resource you can attach to your Amazon Elastic Compute Cloud (Amazon EC2) CPU instances, Amazon Deep Learning Containers, and SageMaker instances. Elastic Inference helps you accelerate your deep learning (DL) inference workloads. Elastic Inference accelerators come in multiple sizes and help you build intelligent capabilities into your applications.
Elastic Inference distributes model operations defined by TensorFlow, Apache MXNet (MXNet), and PyTorch between low-cost DL inference accelerators and the CPU of the instance. Elastic Inference also supports the Open Neural Network Exchange (ONNX) format through MXNet.
Prerequisites
To run Amazon Elastic Inference, you need an Amazon Web Services account and should be familiar with launching Amazon EC2 instances, Amazon Deep Learning Containers, or SageMaker instances. To launch an Amazon EC2 instance, complete the steps in Setting up with Amazon EC2. Amazon S3 resources are required for installing packages via pip. For more information about setting up Amazon S3 resources, see the Amazon Simple Storage Service User Guide.
Pricing for Amazon Elastic Inference
You are charged for each second that an Elastic Inference accelerator is attached to an instance in the running state. You are not charged for an accelerator attached to an instance that is in the pending, stopping, stopped, shutting-down, or terminated state. You are also not charged when an Elastic Inference accelerator is in the unknown or impaired state.
You do not incur AWS PrivateLink charges for VPC endpoints to the Elastic Inference service when you have accelerators provisioned in the subnet.
For more information about pricing by Region for Elastic Inference, see Elastic Inference Pricing.
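As a rough illustration of the per-second billing rules in this section, the following Python sketch totals the billable seconds across a series of state intervals and turns them into an estimated charge. It is a minimal sketch, not an AWS API: the Interval class, the function names, the "ok" label for a healthy accelerator, and the hourly rate passed in are all hypothetical. Actual rates vary by Region and accelerator size and are listed on the Elastic Inference Pricing page.

```python
from dataclasses import dataclass

# Instance states in which an attached accelerator is charged (from the rules above).
BILLABLE_INSTANCE_STATES = {"running"}
# Accelerator states in which no charge applies (from the rules above).
NON_BILLABLE_ACCELERATOR_STATES = {"unknown", "impaired"}


@dataclass
class Interval:
    """A span of time spent in one instance state (hypothetical helper type)."""
    seconds: int
    instance_state: str
    accelerator_state: str = "ok"  # "ok" is an assumed label for a healthy accelerator


def billable_seconds(intervals):
    """Sum the seconds during which the accelerator would be charged."""
    return sum(
        i.seconds
        for i in intervals
        if i.instance_state in BILLABLE_INSTANCE_STATES
        and i.accelerator_state not in NON_BILLABLE_ACCELERATOR_STATES
    )


def estimated_charge(intervals, hourly_rate_usd):
    """Convert billable seconds to a charge using a caller-supplied hourly rate."""
    return billable_seconds(intervals) * hourly_rate_usd / 3600


history = [
    Interval(120, "pending"),              # not charged: instance is pending
    Interval(3600, "running"),             # charged: instance is running
    Interval(600, "running", "impaired"),  # not charged: accelerator is impaired
    Interval(300, "stopped"),              # not charged: instance is stopped
]
print(billable_seconds(history))
print(estimated_charge(history, hourly_rate_usd=0.5))  # 0.5 is a placeholder rate
```

Only the one hour spent in the running state with a healthy accelerator counts toward the total in this example; the pending, impaired, and stopped intervals contribute nothing.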
Elastic Inference Uses
The following topics describe how to use Elastic Inference:

For Elastic Inference-enabled TensorFlow and TensorFlow 2 with Python, see Using TensorFlow Models with Elastic Inference

For Elastic Inference-enabled MXNet with Python, Java, and Scala, see Using MXNet Models with Elastic Inference

For Elastic Inference-enabled PyTorch with Python, see Using PyTorch Models with Elastic Inference

For Elastic Inference with SageMaker, see MXNet Elastic Inference with SageMaker

For Amazon Deep Learning Containers with Elastic Inference on Amazon EC2, Amazon ECS, and SageMaker, see Using Amazon Deep Learning Containers With Elastic Inference

For security information on Elastic Inference, see Security in Amazon Elastic Inference

To troubleshoot your Elastic Inference workflow, see Troubleshooting
Next Up
Amazon Elastic Inference Basics