Using the DLAMI with AWS Neuron
A typical workflow with the AWS Neuron SDK is to compile a previously trained machine learning model on a compilation server and then distribute the compiled artifacts to Inf1 instances for execution. The AWS Deep Learning AMI (DLAMI) comes preinstalled with everything you need to compile models and run inference on an Inf1 instance powered by AWS Inferentia.
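As a rough sketch of that workflow, the commands below show the compile-then-distribute pattern. The conda environment name, script name, artifact name, and instance address are all illustrative placeholders, not fixed values from the DLAMI; check your AMI's release notes for the exact environment names it ships with.

```shell
# --- On the compilation server (placeholder names throughout) ---
# Activate a Neuron conda environment provided by the DLAMI
# (environment name is an example; list available ones with `conda env list`)
source activate aws_neuron_pytorch

# Run your own compilation script, which uses the Neuron SDK to
# produce a compiled model artifact (e.g. model_neuron.pt)
python compile_model.py

# --- Distribute the artifact to an Inf1 instance for inference ---
# Replace the user, key, and address with your own instance details
scp -i my-key.pem model_neuron.pt ec2-user@inf1-instance-address:~/
```

Because compilation is CPU-intensive but does not require Inferentia hardware, it is common to compile on a separate, larger instance and reserve the Inf1 instances for inference only.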
The following sections describe how to use the DLAMI with Inferentia.