Deploy a Model Compiled with Neo with Hosting Services - Amazon SageMaker

To deploy a Neo-compiled model to an HTTPS endpoint, you must create and configure the endpoint for the model using Amazon SageMaker hosting services. Currently, developers can use the Amazon SageMaker APIs to deploy models onto ml.c5, ml.c4, ml.m5, ml.m4, ml.p3, ml.p2, and ml.inf1 instances.

For ml.inf1 instances, models must be compiled specifically for that instance type. Models compiled for other instance types are not guaranteed to work on ml.inf1 instances. For more information about compiling your model, see Use Neo to Compile a Model.

When you deploy a compiled model, the instance type you deploy to must match the target instance type that you specified during compilation. Deployment creates a SageMaker endpoint that you can use to perform inference. You can deploy a Neo-compiled model in any of the following ways: the Amazon SageMaker SDK for Python, the SDK for Python (Boto3), the AWS Command Line Interface, or the SageMaker console.
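The SageMaker SDK for Python path can be sketched as follows. This is a minimal sketch, not a complete recipe: the function name and all argument values are placeholders you would replace with your own S3 artifact URI, Neo inference image URI, and IAM role ARN. It assumes the `sagemaker` package is installed.

```python
# Sketch: deploy a Neo-compiled model with the SageMaker SDK for Python.
# All names and ARNs passed in are placeholders -- substitute your own.

def deploy_compiled_model(
    model_data: str,   # S3 URI of the Neo-compiled model artifact (model.tar.gz)
    image_uri: str,    # Neo inference container image URI for your framework and Region
    role: str,         # IAM role ARN with SageMaker permissions
    instance_type: str = "ml.c5.xlarge",  # must match the compilation target
):
    """Create a Model object and deploy it to a real-time HTTPS endpoint."""
    # Imported inside the function so the sketch stays self-contained.
    from sagemaker.model import Model

    model = Model(image_uri=image_uri, model_data=model_data, role=role)
    # deploy() creates the endpoint configuration and the endpoint in one call,
    # then returns a Predictor you can use for inference requests.
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```

Note that `instance_type` defaults to `ml.c5.xlarge` here only for illustration; whatever value you pass must match the target you compiled for.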

Note

If you deploy a model using the AWS CLI, the console, or Boto3, see Neo Inference Container Images to select the inference image URI for your primary container.
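The low-level Boto3 path consists of three calls: `create_model`, `create_endpoint_config`, and `create_endpoint`. A sketch of the request payloads, with all names, URIs, and ARNs as placeholders, might look like this:

```python
# Sketch of the Boto3 deployment flow for a Neo-compiled model:
# create_model -> create_endpoint_config -> create_endpoint.
# All names, URIs, and ARNs are placeholders -- substitute your own.

def build_deployment_requests(
    model_name: str,
    image_uri: str,       # Neo inference container image URI (primary container)
    model_data_url: str,  # S3 URI of the compiled model artifact
    role_arn: str,        # IAM role ARN with SageMaker permissions
    instance_type: str = "ml.c5.xlarge",  # must match the compilation target
):
    """Return the three request payloads used to stand up the endpoint."""
    create_model = {
        "ModelName": model_name,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
    }
    create_endpoint_config = {
        "EndpointConfigName": f"{model_name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
        }],
    }
    create_endpoint = {
        "EndpointName": f"{model_name}-endpoint",
        "EndpointConfigName": f"{model_name}-config",
    }
    return create_model, create_endpoint_config, create_endpoint


def deploy(requests):
    """Apply the payloads with Boto3 (requires AWS credentials and boto3)."""
    import boto3
    sm = boto3.client("sagemaker")
    model_req, config_req, endpoint_req = requests
    sm.create_model(**model_req)
    sm.create_endpoint_config(**config_req)
    sm.create_endpoint(**endpoint_req)  # endpoint creation is asynchronous
```

Separating payload construction from the API calls keeps the sketch testable offline; in practice you would poll `describe_endpoint` until the endpoint status is `InService` before sending inference requests.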