Automate model development with Amazon SageMaker Autopilot

Amazon SageMaker Autopilot is a feature set that automates the key tasks of an automated machine learning (AutoML) process. It explores your data, selects the algorithms relevant to your problem type, and prepares the data for model training and tuning. When appropriate, Autopilot automatically applies a cross-validation resampling procedure to all candidate algorithms to test their ability to predict data they have not been trained on. It then ranks all of the optimized models it tested by their performance and identifies the best-performing model, which you can deploy in a fraction of the time normally required.
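To make the cross-validation and ranking steps concrete, here is a minimal sketch in plain Python. It is not Autopilot's implementation: the candidate "algorithms" (`fit_mean`, `fit_median`), the scoring function `neg_mae`, and the toy data are all hypothetical stand-ins chosen for illustration.

```python
from statistics import mean, median


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(fit, score, rows, targets, k=5):
    """Return the mean validation score of `fit` across k held-out folds."""
    scores = []
    for held_out in k_fold_indices(len(rows), k):
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        # Train only on rows outside the held-out fold.
        model = fit([rows[i] for i in train_idx], [targets[i] for i in train_idx])
        # Score only on the held-out rows the model has not seen.
        scores.append(
            score(model, [rows[i] for i in held_out], [targets[i] for i in held_out])
        )
    return mean(scores)


# Hypothetical candidates: each "model" is a single constant prediction.
def fit_mean(rows, targets):
    return mean(targets)


def fit_median(rows, targets):
    return median(targets)


def neg_mae(model, rows, targets):
    """Negative mean absolute error, so that a higher score is better."""
    return -mean(abs(model - t) for t in targets)


rows = [[x] for x in range(10)]
targets = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0, 1.1, 0.8, 1.0, 1.2]
candidates = {"mean": fit_mean, "median": fit_median}

# Rank candidates from best to worst cross-validated score.
ranking = sorted(
    ((name, cross_validate(fit, neg_mae, rows, targets, k=5))
     for name, fit in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, cv_score in ranking:
    print(f"{name}: {cv_score:.3f}")
```

Each candidate is scored only on rows it was not trained on, and the candidates are then ordered by that score, which mirrors the test-then-rank flow described above.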

Autopilot also helps explain how models make predictions, using a feature attribution approach developed for Amazon SageMaker Clarify. Autopilot automatically generates a report that indicates the importance of each feature for the predictions made by the best candidate. This explainability functionality can make machine learning models more understandable to AWS customers. The generated model governance report can be used to inform risk and compliance teams and external regulators.
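As an illustration of what a feature attribution is, the sketch below computes exact Shapley values for a toy model. The SageMaker Clarify documentation describes its attributions as based on Shapley values, but this is not Clarify's implementation, and exact enumeration is only practical for a handful of features. The function `shapley_values`, the toy model, and the all-zeros baseline are assumptions made for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution of predict(instance) relative to predict(baseline).

    Features outside a coalition take their value from `baseline`.
    The cost grows exponentially with the number of features.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of every coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                # Marginal contribution of feature i when added to the coalition.
                total += weight * (value(coalition | {i}) - value(coalition))
        attributions.append(total)
    return attributions


# Hypothetical model with three features.
def toy_model(x):
    return 2 * x[0] + 3 * x[1] - x[2]


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
for index, contribution in enumerate(phi):
    print(f"feature {index}: {contribution:+.3f}")
```

For a linear model such as this one, each attribution equals the feature's coefficient times the difference between the instance and baseline values, and the attributions sum to the difference between the two predictions.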

You get full visibility into how the data was wrangled and how the models were selected, trained, and tuned for each candidate tested. This visibility comes from notebooks that Autopilot generates for each trial, which contain the code used to explore the data and find the best candidate. The notebooks also serve as educational tools for learning about and conducting your own ML experiments. By examining the data exploration and candidate definition notebooks that Autopilot exposes, you can learn about the impact of various inputs and the trade-offs made in experiments. You can also experiment further on the higher-performing candidates by modifying the notebooks and rerunning them.

The following graphic outlines the principal tasks of an AutoML process managed by Autopilot.


Figure: Overview of the AutoML process used by Amazon SageMaker Autopilot.

You can use Autopilot in different ways: on autopilot (hence the name) or with varying degrees of human guidance; without code through Amazon SageMaker Studio, or with code using one of the AWS SDKs. Autopilot currently supports regression, binary classification, and multiclass classification. It supports only tabular data stored in comma-separated values (CSV) files.
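For the SDK route, the following is a minimal sketch using the AWS SDK for Python (boto3) and its `create_auto_ml_job` operation. The helper names `build_automl_job_request` and `start_automl_job`, the job name, bucket paths, target column, and role ARN are all hypothetical placeholders, and the defaults (F1 objective, 10 candidates) are arbitrary choices for the example. Check the current API reference before relying on any field.

```python
SUPPORTED_PROBLEM_TYPES = {"Regression", "BinaryClassification", "MulticlassClassification"}


def build_automl_job_request(job_name, train_s3_uri, target_column, output_s3_uri,
                             role_arn, problem_type="BinaryClassification",
                             metric_name="F1", max_candidates=10):
    """Assemble keyword arguments for the SageMaker CreateAutoMLJob operation."""
    if problem_type not in SUPPORTED_PROBLEM_TYPES:
        raise ValueError(f"Unsupported problem type: {problem_type}")
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [{
            # CSV training data under an S3 prefix.
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": train_s3_uri}
            },
            "TargetAttributeName": target_column,
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ProblemType": problem_type,
        "AutoMLJobObjective": {"MetricName": metric_name},
        "AutoMLJobConfig": {"CompletionCriteria": {"MaxCandidates": max_candidates}},
        "RoleArn": role_arn,
    }


def start_automl_job(request):
    """Submit the job. Requires boto3, AWS credentials, and a configured region."""
    import boto3  # imported here so the request builder works without boto3 installed

    client = boto3.client("sagemaker")
    return client.create_auto_ml_job(**request)["AutoMLJobArn"]


# All values below are placeholders, not real resources.
request = build_automl_job_request(
    job_name="example-autopilot-job",
    train_s3_uri="s3://example-bucket/train/",
    target_column="label",
    output_s3_uri="s3://example-bucket/output/",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)
print(sorted(request))
# job_arn = start_automl_job(request)  # uncomment to submit against a real account
```

Keeping the request builder separate from the network call makes the configuration easy to inspect or test before any billable job is started.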

With Amazon SageMaker, you pay only for what you use. Building, training, and deploying ML models are billed by the second, with no minimum fees and no upfront commitments. For more information about the cost of using SageMaker, see Amazon SageMaker Pricing.