Amazon SageMaker
Developer Guide

Explore and Preprocess Data

Before using a dataset to train a model, data scientists typically explore and preprocess it. For example, in one of the exercises in this guide, you use the MNIST dataset, a widely available dataset of handwritten digits, for model training. Before you begin training, you transform the data into a format that is more efficient for training. For more information, see Step 3.2.3: Transform the Training Dataset and Upload It to S3.
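As a rough illustration of what "explore and preprocess" can mean in practice, the following minimal sketch counts labels in a tiny made-up dataset, scales pixel values to the range 0 to 1, and packs the examples into a compact binary buffer. It uses only the Python standard library. The function names and the binary layout are hypothetical, chosen for illustration; this is not the format or the code used in the exercise referenced above.

```python
import io
import struct
from collections import Counter


def normalize(pixels):
    """Scale 8-bit pixel intensities (0-255) into the range [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]


def pack_examples(examples):
    """Serialize (label, pixels) pairs into a little-endian binary buffer.

    Hypothetical layout, for illustration only:
      header: example count (uint32), pixels per example (uint32)
      body:   per example, label (uint8) followed by float32 pixels
    """
    if not examples:
        raise ValueError("need at least one example")
    width = len(examples[0][1])
    buf = io.BytesIO()
    buf.write(struct.pack("<II", len(examples), width))
    for label, pixels in examples:
        if len(pixels) != width:
            raise ValueError("all examples must have the same number of pixels")
        buf.write(struct.pack("<B", label))
        buf.write(struct.pack(f"<{width}f", *normalize(pixels)))
    buf.seek(0)  # rewind so the buffer can be read or uploaded from the start
    return buf


# Two made-up 4-pixel "images" standing in for real 28x28 MNIST digits.
toy = [(7, [0, 255, 128, 64]), (2, [255, 255, 0, 0])]

# Explore: how many examples does each label have?
label_counts = Counter(label for label, _ in toy)

# Preprocess: normalize and pack into a single binary payload.
payload = pack_examples(toy).getvalue()
```

In a notebook, a buffer like this could then be uploaded to storage for a training job to read. The upload step is omitted here because it depends on your account and bucket configuration.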

To preprocess data, use a Jupyter notebook on an Amazon SageMaker notebook instance. You can also use the notebook instance to write code that creates model training jobs, deploys models to Amazon SageMaker hosting, and tests or validates your models. For more information, see Using Notebook Instances.

How It Works: Next Topic

Training a Model with Amazon SageMaker