Supported Instance Types and Frameworks
Amazon SageMaker Neo supports popular deep learning frameworks for both compilation and deployment. You can deploy your model to cloud instances or AWS Inferentia instance types.
The following describes frameworks SageMaker Neo supports and the target cloud instances you can compile and deploy to. For information on how to deploy your compiled model to a cloud or Inferentia instance, see Deploy a Model with Cloud Instances.
Cloud Instances
SageMaker Neo supports the following deep learning frameworks for CPU and GPU cloud instances:
Framework | Framework Version | Model Version | Models | Model Formats (packaged in *.tar.gz) | Toolkits |
---|---|---|---|---|---|
MXNet | 1.8.0 | Supports 1.8.0 or earlier | Image Classification, Object Detection, Semantic Segmentation, Pose Estimation, Activity Recognition | One symbol file (.json) and one parameter file (.params) | GluonCV v0.8.0 |
ONNX | 1.7.0 | Supports 1.7.0 or earlier | Image Classification, SVM | One model file (.onnx) | |
Keras | 2.2.4 | Supports 2.2.4 or earlier | Image Classification | One model definition file (.h5) | |
PyTorch | 1.4, 1.5, 1.6, 1.7, 1.8, 1.12, 1.13, or 2.0 | Supports 1.4, 1.5, 1.6, 1.7, 1.8, 1.12, 1.13, and 2.0 | Image Classification; versions 1.13 and 2.0 also support Object Detection, Vision Transformer, and HuggingFace | One model definition file (.pt or .pth) with input dtype of float32 | |
TensorFlow | 1.15.3 or 2.9 | Supports 1.15.3 and 2.9 | Image Classification | For saved models, one .pb or one .pbtxt file and a variables directory that contains variables; for frozen models, only one .pb or .pbtxt file | |
XGBoost | 1.3.3 | Supports 1.3.3 or earlier | Decision Trees | One XGBoost model file (.model) where the number of nodes in a tree is less than 2^31 | |
Note
“Model Version” is the version of the framework used to train and export the model.
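As the "Model Formats" column indicates, the framework artifacts must be bundled into a *.tar.gz archive, with the files at the top level, before Neo can compile them. A minimal sketch in Python, using placeholder files standing in for a real MXNet symbol/parameter pair (the file names here are illustrative, not required by Neo):

```python
import tarfile
from pathlib import Path

# Placeholder artifacts standing in for a trained MXNet model's
# symbol file (.json) and parameter file (.params).
Path("model-symbol.json").write_text("{}")
Path("model-0000.params").write_bytes(b"")

# Neo expects the files at the top level of the archive,
# not nested inside a directory.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model-symbol.json")
    tar.add("model-0000.params")

with tarfile.open("model.tar.gz") as tar:
    print(sorted(tar.getnames()))
```

The resulting model.tar.gz is what you upload to Amazon S3 as the compilation job's input.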
Instance Types
You can deploy your SageMaker Neo-compiled model to one of the cloud instances listed below:
Instance | Compute Type |
---|---|
ml_c4 | Standard |
ml_c5 | Standard |
ml_m4 | Standard |
ml_m5 | Standard |
ml_p2 | Accelerated computing |
ml_p3 | Accelerated computing |
ml_g4dn | Accelerated computing |
For information on the available vCPU, memory, and price per hour for each instance type, see Amazon SageMaker Pricing.
Note
When compiling for ml_* instances using the PyTorch framework, use the Compiler options field in Output Configuration to provide the correct data type (dtype) of the model's input. The default is set to "float32".
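To make the note concrete, here is a hedged sketch of a compilation job request that passes the dtype through CompilerOptions. The job name, role ARN, S3 locations, and input shape are placeholders; only the overall request structure and the JSON-string encoding of CompilerOptions reflect the CreateCompilationJob API:

```python
import json

# Hypothetical values: the job name, role ARN, S3 URIs, and
# input shape below are placeholders for illustration only.
compilation_job = {
    "CompilationJobName": "pytorch-neo-example",
    "RoleArn": "arn:aws:iam::111122223333:role/NeoRole",
    "InputConfig": {
        "S3Uri": "s3://amzn-s3-demo-bucket/model.tar.gz",
        # DataInputConfig maps input names to shapes, as a JSON string.
        "DataInputConfig": json.dumps({"input0": [1, 3, 224, 224]}),
        "Framework": "PYTORCH",
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://amzn-s3-demo-bucket/compiled/",
        "TargetDevice": "ml_c5",
        # CompilerOptions is passed as a JSON-formatted string;
        # "float32" is also the default if dtype is omitted.
        "CompilerOptions": json.dumps({"dtype": "float32"}),
    },
}

# The request would then be passed to boto3, e.g.:
# boto3.client("sagemaker").create_compilation_job(**compilation_job)
```

In the SageMaker console, the Compiler options field corresponds to the CompilerOptions string shown above.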
AWS Inferentia
SageMaker Neo supports the following deep learning frameworks for Inf1:
Framework | Framework Version | Model Version | Models | Model Formats (packaged in *.tar.gz) | Toolkits |
---|---|---|---|---|---|
MXNet | 1.5 or 1.8 | Supports 1.8, 1.5, and earlier | Image Classification, Object Detection, Semantic Segmentation, Pose Estimation, Activity Recognition | One symbol file (.json) and one parameter file (.params) | GluonCV v0.8.0 |
PyTorch | 1.7, 1.8, or 1.9 | Supports 1.9 and earlier | Image Classification | One model definition file (.pt or .pth) with input dtype of float32 | |
TensorFlow | 1.15 or 2.5 | Supports 2.5, 1.15, and earlier | Image Classification | For saved models, one .pb or one .pbtxt file and a variables directory that contains variables; for frozen models, only one .pb or .pbtxt file | |
Note
“Model Version” is the version of the framework used to train and export the model.
You can deploy your SageMaker Neo-compiled model to AWS Inferentia-based Amazon EC2 Inf1 instances. AWS Inferentia is Amazon's first custom silicon chip designed to accelerate deep learning. Currently, you can use the ml_inf1 instance to deploy your compiled models.
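Note that ml_inf1 is the compilation target, while the hosting side names an ml.inf1 instance type in the endpoint configuration. A hedged sketch of the hosting request, with a placeholder endpoint config name and model name:

```python
# Hypothetical names; only the request structure is the point here.
endpoint_config = {
    "EndpointConfigName": "neo-inf1-example",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            # Placeholder: the model created from the Neo-compiled artifact.
            "ModelName": "my-neo-compiled-model",
            # Hosting uses an ml.inf1.* instance type, in contrast to the
            # ml_inf1 TargetDevice used at compilation time.
            "InstanceType": "ml.inf1.xlarge",
            "InitialInstanceCount": 1,
        }
    ],
}

# The request would then be passed to boto3, e.g.:
# boto3.client("sagemaker").create_endpoint_config(**endpoint_config)
```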
AWS Inferentia2 and AWS Trainium
Currently, you can deploy your SageMaker Neo-compiled model to AWS Inferentia2-based Amazon EC2 Inf2 instances (in US East (Ohio) Region), and to AWS Trainium-based Amazon EC2 Trn1 instances (in US East (N. Virginia) Region). For more information about supported models on these instances, see Model Architecture Fit Guidelines.