Amazon SageMaker Autopilot model deployment and prediction
This Amazon SageMaker Autopilot guide covers deploying a trained model, setting up real-time inference, and running batch inference jobs.
After you train your SageMaker Autopilot models, you can deploy them to get predictions in one of two ways:
- Use Real-time inferencing to set up an endpoint and obtain predictions interactively (see the deployment sketch after this list).
- Use Batch inferencing to make predictions in parallel on batches of observations across an entire dataset (see the batch transform sketch after this list).
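The following is a minimal real-time deployment sketch using boto3. It assumes an Autopilot job named `my-autopilot-job` has already completed, and that the job name, model name, endpoint names, IAM role ARN, Region, and sample CSV row are placeholders you replace with your own values.

```python
import boto3

region = "us-east-1"  # assumption: replace with your Region
sm = boto3.client("sagemaker", region_name=region)

# Look up the best candidate produced by the completed Autopilot job.
job = sm.describe_auto_ml_job(AutoMLJobName="my-autopilot-job")
best_candidate = job["BestCandidate"]

# Create a model from the candidate's inference container definitions.
sm.create_model(
    ModelName="my-autopilot-model",
    ExecutionRoleArn="arn:aws:iam::111122223333:role/AutopilotExecutionRole",  # assumption
    Containers=best_candidate["InferenceContainers"],
)

# Configure and create a real-time endpoint, then wait for it to be in service.
sm.create_endpoint_config(
    EndpointConfigName="my-autopilot-endpoint-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-autopilot-model",
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
    }],
)
sm.create_endpoint(
    EndpointName="my-autopilot-endpoint",
    EndpointConfigName="my-autopilot-endpoint-config",
)
sm.get_waiter("endpoint_in_service").wait(EndpointName="my-autopilot-endpoint")

# Send a single CSV row to the endpoint for an interactive prediction.
runtime = boto3.client("sagemaker-runtime", region_name=region)
response = runtime.invoke_endpoint(
    EndpointName="my-autopilot-endpoint",
    ContentType="text/csv",
    Body="34,married,engineer,120000",  # assumption: example feature row
)
print(response["Body"].read().decode("utf-8"))
```

For batch inferencing, a sketch along the following lines creates a batch transform job against the same model. The S3 input and output locations are placeholder names, not real buckets.

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

# Run predictions in parallel over a dataset stored as CSV files in S3.
sm.create_transform_job(
    TransformJobName="my-autopilot-batch-job",
    ModelName="my-autopilot-model",  # created in the deployment sketch above
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://amzn-s3-demo-bucket/batch-input/",  # assumption
            }
        },
        "ContentType": "text/csv",
        "SplitType": "Line",      # send one CSV row per request
    },
    TransformOutput={
        "S3OutputPath": "s3://amzn-s3-demo-bucket/batch-output/",  # assumption
        "AssembleWith": "Line",   # reassemble predictions line by line
    },
    TransformResources={
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
    },
)
sm.get_waiter("transform_job_completed_or_stopped").wait(
    TransformJobName="my-autopilot-batch-job"
)
```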
To avoid incurring unnecessary charges: after the endpoints and resources created by model deployment are no longer needed, you can delete them. For information about the pricing of instances by Region, see Amazon SageMaker Pricing.
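As a cleanup sketch, the following boto3 calls remove the example endpoint, endpoint configuration, and model created in the deployment sketch above; the resource names are assumptions matching that sketch.

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

# Deleting the endpoint stops the billable instances backing it.
sm.delete_endpoint(EndpointName="my-autopilot-endpoint")
sm.delete_endpoint_config(EndpointConfigName="my-autopilot-endpoint-config")
sm.delete_model(ModelName="my-autopilot-model")
```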