Deploy a Model

After you build and train your model, you can deploy it to get predictions in one of two ways:

  • To set up a persistent endpoint and get one prediction at a time, use Amazon SageMaker hosting services.

  • To get predictions for an entire dataset, use Amazon SageMaker batch transform.
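As a sketch of the hosting services path, the following uses the SageMaker Python SDK to create a persistent endpoint from a trained TensorFlow model artifact. The S3 path, IAM role ARN, and account ID are hypothetical placeholders; your framework, version, and instance type may differ.

import sagemaker
from sagemaker.tensorflow import TensorFlowModel

# Wrap an existing trained model artifact (placeholder S3 path and IAM role).
model = TensorFlowModel(
    model_data="s3://my-bucket/my-model/model.tar.gz",
    role="arn:aws:iam::111122223333:role/MySageMakerRole",
    framework_version="2.8",
    sagemaker_session=sagemaker.Session(),
)

# Hosting services: create a persistent HTTPS endpoint for real-time predictions.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Get one prediction at a time from the endpoint.
result = predictor.predict({"instances": [[1.0, 2.0, 3.0]]})
print(result)

# Delete the endpoint when you are done to stop incurring charges.
predictor.delete_endpoint()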

Prerequisites

These topics assume that you have built and trained a machine learning model and are ready to deploy it. If you are new to Amazon SageMaker and have not completed these prerequisite tasks, work through the steps in the Get Started tutorial to familiarize yourself with an example of how Amazon SageMaker manages the data science process and how it handles model deployment. For more information about building a model, see Build a Model. For information about training a model, see Train a Model.

What do you want to do?

Amazon SageMaker provides features to manage resources and optimize inference performance when deploying machine learning models. For guidance on using inference pipelines, compiling and deploying models with Neo, running batch transforms, accelerating inference with Elastic Inference, and automatically scaling models, see the following topics.

  • To manage data processing and real-time predictions or to process batch transforms in a pipeline, see Deploy an Inference Pipeline (first sketch after this list).

  • To train TensorFlow, Apache MXNet, PyTorch, ONNX, and XGBoost models once and optimize them to deploy on ARM, Intel, and Nvidia processors, see Amazon SageMaker Neo.

  • To preprocess entire datasets quickly or to get inferences from a trained model for large datasets when you don't need a persistent endpoint, see Batch Transform (second sketch after this list).

  • To increase throughput and decrease latency when getting real-time inferences from deep learning models that are deployed as Amazon SageMaker hosted models, see Amazon SageMaker Elastic Inference (EI).

  • To dynamically adjust the number of instances provisioned in response to changes in your workload, see Automatically Scale Amazon SageMaker Models (third sketch after this list).
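The first item above, inference pipelines, chains multiple containers behind a single endpoint so each request flows through the stages in order. The following is a minimal sketch with the SageMaker Python SDK; the container image URIs, model artifact paths, and role ARN are hypothetical placeholders.

from sagemaker.model import Model
from sagemaker.pipeline import PipelineModel

role = "arn:aws:iam::111122223333:role/MySageMakerRole"  # placeholder IAM role

# Two stages: a preprocessing container and a model container
# (all image and artifact URIs are placeholders).
preprocess_model = Model(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/preprocess:latest",
    model_data="s3://my-bucket/preprocess/model.tar.gz",
    role=role,
)
xgb_model = Model(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
    model_data="s3://my-bucket/xgb/model.tar.gz",
    role=role,
)

# Deploy both containers behind one endpoint; each request is processed
# by the stages in order.
pipeline_model = PipelineModel(
    name="my-inference-pipeline",
    role=role,
    models=[preprocess_model, xgb_model],
)
pipeline_model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")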
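For batch transform, the second sketch runs a trained model over an entire dataset in S3 and writes results back to S3, with no persistent endpoint. It assumes a Model object like the one created in the earlier hosting sketch; the bucket paths are placeholders.

# "model" is a SageMaker Model object, as in the hosting sketch above.
transformer = model.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/batch-output/",  # placeholder output location
)

transformer.transform(
    data="s3://my-bucket/batch-input/",  # placeholder input dataset
    content_type="text/csv",
    split_type="Line",                   # treat each line as one record
)
transformer.wait()  # block until the transform job completes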
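For automatic scaling, hosted endpoints register with Application Auto Scaling. The third sketch attaches a target-tracking policy on the built-in invocations-per-instance metric using boto3; the endpoint and variant names are placeholders, and the target value is illustrative.

import boto3

autoscaling = boto3.client("application-autoscaling")
resource_id = "endpoint/my-endpoint/variant/AllTraffic"  # placeholder names

# Register the variant's desired instance count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Scale toward an average of 1000 invocations per instance per minute (illustrative).
autoscaling.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1000.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)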

Manage Model Deployments

For guidance on managing model deployments, including monitoring, troubleshooting, and best practices, and for information about the storage associated with inference hosting instances, see the topics in this section.

Deploy Your Own Inference Code

For developers who need more advanced guidance on how to run their own inference code, see the topics in this section.
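As context for these topics: a container that serves your own inference code on Amazon SageMaker hosting must respond to GET /ping health checks and POST /invocations inference requests on port 8080. The following is a minimal sketch of that contract using Flask; the prediction logic is a placeholder.

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/ping", methods=["GET"])
def ping():
    # Return 200 once the container is healthy and the model is loaded.
    return "", 200

@app.route("/invocations", methods=["POST"])
def invocations():
    payload = request.get_json(force=True)
    # Placeholder logic; a real container would call the loaded model here.
    prediction = sum(payload.get("features", []))
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    # SageMaker hosting sends requests to port 8080 inside the container.
    app.run(host="0.0.0.0", port=8080)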
