Tutorial for building models with Notebook Instances
This Get Started tutorial walks you through how to create a SageMaker notebook instance, open a Jupyter notebook with a kernel preconfigured with a Conda environment for machine learning, and start a SageMaker session to run an end-to-end ML cycle. You'll learn how to save a dataset to the default Amazon S3 bucket automatically paired with the SageMaker session, submit a training job for an ML model to Amazon EC2, and deploy the trained model for prediction through real-time hosting or batch inference on Amazon EC2.
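As a minimal sketch of what starting a session and saving a dataset to its default bucket looks like in the SageMaker Python SDK: the local file name train.csv and the demo-xgboost key prefix below are illustrative placeholders, not names used by the tutorial itself.

```python
import sagemaker

# Start a SageMaker session. It is automatically paired with a default
# S3 bucket named sagemaker-<region>-<account-id>.
session = sagemaker.Session()
bucket = session.default_bucket()

# Save a local dataset to the default bucket. The file name and key
# prefix are illustrative placeholders.
s3_uri = session.upload_data(
    path="train.csv", bucket=bucket, key_prefix="demo-xgboost"
)
print(s3_uri)  # s3://sagemaker-<region>-<account-id>/demo-xgboost/train.csv
```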
This tutorial shows a complete ML flow of training the XGBoost model from the SageMaker built-in model pool. You use the US Adult Census dataset:
- SageMaker XGBoost – The XGBoost model is adapted to the SageMaker environment and preconfigured as a Docker container (a minimal usage sketch follows this list). SageMaker provides a suite of built-in algorithms that are ready to use with SageMaker features. To learn more about which ML algorithms are adapted to SageMaker, see Choose an Algorithm and Use Amazon SageMaker Built-in Algorithms. For the SageMaker built-in algorithm API operations, see First-Party Algorithms in the Amazon SageMaker Python SDK.
- Adult Census dataset – The dataset from the 1994 Census Bureau database by Ronny Kohavi and Barry Becker (Data Mining and Visualization, Silicon Graphics). The SageMaker XGBoost model is trained on this dataset to predict whether an individual makes more than $50,000 a year.
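To make the flow concrete, here is a rough sketch of training and deploying the built-in XGBoost model with the SageMaker Python SDK. The S3 paths, framework version, hyperparameter values, and instance types are illustrative assumptions, not settings prescribed by this tutorial.

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
region = session.boto_region_name
role = sagemaker.get_execution_role()  # IAM role of the notebook instance

# Retrieve the URI of the prebuilt Docker container for built-in XGBoost.
container = image_uris.retrieve("xgboost", region, version="1.5-1")

# Configure the training job. Output path and instance type are placeholders.
xgb = sagemaker.estimator.Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path=f"s3://{session.default_bucket()}/demo-xgboost/output",
    sagemaker_session=session,
)
# Binary classification: does an individual make over $50,000 a year?
xgb.set_hyperparameters(objective="binary:logistic", num_round=100)

# Submit the training job; SageMaker provisions the EC2 instances for you.
train_input = TrainingInput(
    f"s3://{session.default_bucket()}/demo-xgboost/train.csv",
    content_type="csv",
)
xgb.fit({"train": train_input})

# Deploy the trained model to a real-time hosted endpoint for predictions.
predictor = xgb.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```

For batch inference instead of a hosted endpoint, the estimator's transformer() method creates a batch transform job that processes an entire S3 dataset offline.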