Amazon SageMaker
Developer Guide


Deploy a Model Compiled with Neo (Console)

You can create a Neo endpoint in the Amazon SageMaker console. Open the Amazon SageMaker console at https://console.aws.amazon.com/sagemaker/.

Choose Models, and then choose Create model in the Inference group. On the Create model page, complete the Model name and IAM role fields and, if needed, the VPC field.


[Figure: Create Neo Model for Inference]

To add information about the container used to deploy your model, choose Add container, and then choose Next. Complete the Container input options, Location of inference code image, and Location of model artifacts fields, and optionally the Container host name and Environmental variables fields.


[Figure: Create Neo Model for Inference]

To deploy Neo-compiled models, choose the following:

  • Container input options: Provide model artifacts and inference image

  • Location of inference code image: Choose one of the following images, depending on the AWS Region and the kind of application:

    • Amazon SageMaker Image Classification

      • 301217895009.dkr.ecr.us-west-2.amazonaws.com/image-classification-neo:latest

      • 785573368785.dkr.ecr.us-east-1.amazonaws.com/image-classification-neo:latest

      • 007439368137.dkr.ecr.us-east-2.amazonaws.com/image-classification-neo:latest

      • 802834080501.dkr.ecr.eu-west-1.amazonaws.com/image-classification-neo:latest

    • Amazon SageMaker XGBoost

      • 301217895009.dkr.ecr.us-west-2.amazonaws.com/xgboost-neo:latest

      • 785573368785.dkr.ecr.us-east-1.amazonaws.com/xgboost-neo:latest

      • 007439368137.dkr.ecr.us-east-2.amazonaws.com/xgboost-neo:latest

      • 802834080501.dkr.ecr.eu-west-1.amazonaws.com/xgboost-neo:latest

    • TensorFlow: The TensorFlow version used must be in the TensorFlow SageMaker Estimators list.

      • 301217895009.dkr.ecr.us-west-2.amazonaws.com/sagemaker-neo-tensorflow:[tensorflow-version]-[cpu/gpu]-py3

      • 785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-neo-tensorflow:[tensorflow-version]-[cpu/gpu]-py3

      • 007439368137.dkr.ecr.us-east-2.amazonaws.com/sagemaker-neo-tensorflow:[tensorflow-version]-[cpu/gpu]-py3

      • 802834080501.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-neo-tensorflow:[tensorflow-version]-[cpu/gpu]-py3

    • MXNet: The MXNet version used must be in the MXNet SageMaker Estimators list.

      • 301217895009.dkr.ecr.us-west-2.amazonaws.com/sagemaker-neo-mxnet:[mxnet-version]-[cpu/gpu]-py3

      • 785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-neo-mxnet:[mxnet-version]-[cpu/gpu]-py3

      • 007439368137.dkr.ecr.us-east-2.amazonaws.com/sagemaker-neo-mxnet:[mxnet-version]-[cpu/gpu]-py3

      • 802834080501.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-neo-mxnet:[mxnet-version]-[cpu/gpu]-py3

    • PyTorch: The PyTorch version used must be in the PyTorch SageMaker Estimators list.

      • 301217895009.dkr.ecr.us-west-2.amazonaws.com/sagemaker-neo-pytorch:[pytorch-version]-[cpu/gpu]-py3

      • 785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-neo-pytorch:[pytorch-version]-[cpu/gpu]-py3

      • 007439368137.dkr.ecr.us-east-2.amazonaws.com/sagemaker-neo-pytorch:[pytorch-version]-[cpu/gpu]-py3

      • 802834080501.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-neo-pytorch:[pytorch-version]-[cpu/gpu]-py3

  • Location of model artifacts: The full Amazon S3 path of the compiled model artifact generated by the Neo compilation API.

  • Environmental variables:

    • Omit this field for SageMaker Image Classification and SageMaker XGBoost.

    • For TensorFlow, PyTorch, and MXNet, specify the environment variable SAGEMAKER_SUBMIT_DIRECTORY as the full S3 path that contains the training script.
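The per-Region image URIs listed above all follow one pattern (account ID, Region, repository, tag). As an aside, a small hypothetical Python helper (not an AWS API) can assemble them, which makes it harder to paste the wrong account ID for a Region:

```python
# Hypothetical helper (not an AWS API): builds Neo inference image URIs
# from the per-Region account IDs and repositories listed above.

# Account IDs that own the Neo inference images, per Region.
NEO_IMAGE_ACCOUNTS = {
    "us-west-2": "301217895009",
    "us-east-1": "785573368785",
    "us-east-2": "007439368137",
    "eu-west-1": "802834080501",
}

# Repository name (and tag pattern) per framework.
NEO_REPOSITORIES = {
    "image-classification": "image-classification-neo:latest",
    "xgboost": "xgboost-neo:latest",
    "tensorflow": "sagemaker-neo-tensorflow:{version}-{device}-py3",
    "mxnet": "sagemaker-neo-mxnet:{version}-{device}-py3",
    "pytorch": "sagemaker-neo-pytorch:{version}-{device}-py3",
}

def neo_image_uri(framework, region, version=None, device="cpu"):
    """Return the ECR image URI for a Neo serving container."""
    account = NEO_IMAGE_ACCOUNTS[region]
    repo = NEO_REPOSITORIES[framework]
    if "{version}" in repo:
        if version is None:
            raise ValueError("version is required for framework containers")
        repo = repo.format(version=version, device=device)
    return "{}.dkr.ecr.{}.amazonaws.com/{}".format(account, region, repo)

print(neo_image_uri("xgboost", "us-west-2"))
# 301217895009.dkr.ecr.us-west-2.amazonaws.com/xgboost-neo:latest
```

For the framework containers, substitute your actual framework version and cpu/gpu choice for the bracketed tag placeholders shown in the lists above.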

The training script must be packaged as a *.tar.gz file, with the script at the root level of the archive. The script must also define two functions for Neo serving containers:

  • neo_preprocess(payload, content_type): Function that takes the payload and Content-Type of each incoming request and returns a NumPy array.

  • neo_postprocess(result): Function that takes the prediction results produced by the Deep Learning Runtime (DLR) and returns the response body.

Neither of these functions uses any functionality of MXNet, PyTorch, or TensorFlow. See the Amazon SageMaker Neo sample notebooks for examples that use these functions.
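For illustration, here is a minimal sketch of the two handlers, assuming JSON request bodies; the content types handled and the serialization logic are assumptions, and the real implementations depend on your model's input and output formats:

```python
# Minimal, hypothetical Neo serving handlers. The supported content type
# (application/json) and the response format are assumptions for this sketch.
import json

import numpy as np

def neo_preprocess(payload, content_type):
    """Turn the raw request body into a NumPy array for the DLR runtime."""
    if content_type == "application/json":
        # e.g. payload = "[[1.0, 2.0]]" -> array of shape (1, 2)
        return np.asarray(json.loads(payload), dtype="float32")
    raise ValueError("unsupported Content-Type: {}".format(content_type))

def neo_postprocess(result):
    """Serialize DLR's prediction results into a JSON response body."""
    return json.dumps([np.asarray(r).tolist() for r in result])
```

The script containing these functions is what you package at the root of the *.tar.gz archive referenced by SAGEMAKER_SUBMIT_DIRECTORY.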

Confirm that the container information is accurate, and then choose Create model. This takes you to the model landing page, where you choose Create endpoint.


[Figure: Create Model Landing Page]

On the Create and configure endpoint page, specify the Endpoint name. For Attach endpoint configuration, choose Create a new endpoint configuration.


[Figure: Neo console Create and configure endpoint UI]

On the New endpoint configuration page, specify the Endpoint configuration name.


[Figure: Neo console New endpoint configuration UI]

Then choose Edit next to the name of the model and specify the correct Instance type on the Edit Production Variant page. The Instance type value must match the target instance type specified in your compilation job.


[Figure: Neo console Edit Production Variant UI]

When you're done, choose Save, choose Create endpoint configuration on the New endpoint configuration page, and then choose Create endpoint.
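The console steps above correspond to a CreateEndpointConfig request followed by CreateEndpoint. As a hedged sketch, the request body the console builds for a single production variant looks roughly like this; the model name, configuration name, and instance type below are placeholders:

```python
# Sketch of the endpoint-configuration request equivalent to the console
# steps above. Field names follow the SageMaker CreateEndpointConfig API;
# all concrete values here are placeholders.

def build_endpoint_config(config_name, model_name, instance_type):
    """Return a CreateEndpointConfig request body with one production variant.

    instance_type must match the target of the Neo compilation job
    (e.g. a model compiled for an ml_c5 target should be deployed
    on an ml.c5 instance).
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 1.0,
            }
        ],
    }

# With boto3, the same two console steps could be performed as:
#   sm = boto3.client("sagemaker")
#   sm.create_endpoint_config(
#       **build_endpoint_config("my-neo-config", "my-neo-model", "ml.c5.xlarge"))
#   sm.create_endpoint(EndpointName="my-neo-endpoint",
#                      EndpointConfigName="my-neo-config")
```

Either way, the key constraint is the same as in the console flow: the instance type in the production variant must match the one used for compilation.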