Amazon Machine Learning
Developer Guide (Version Latest)

The Amazon Machine Learning Process

The following table describes how to use the Amazon ML console to perform the ML process outlined in this document.

ML Process

Amazon ML Task

Analyze your data

To analyze your data in Amazon ML, create a datasource and review the data insights page.

Split data into training and evaluation datasources

Amazon ML can split the datasource to use 70% of the data for model training and 30% for evaluating your model's predictive performance.

When you use the Create ML Model wizard with the default settings, Amazon ML splits the data for you.

If you use the Create ML Model wizard with the custom settings and choose to evaluate the ML model, the wizard offers an option to let Amazon ML split the data for you and run the evaluation on 30% of the data.
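The 70/30 split described above can be sketched in plain Python. This is only an illustration of a sequential split over a list of rows. The function name `split_rows` and its `train_percent` parameter are hypothetical, and the sketch makes no claim about how Amazon ML performs the split internally.

```python
def split_rows(rows, train_percent=70):
    """Split rows sequentially into (training, evaluation) lists."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(rows) * train_percent // 100
    return rows[:cut], rows[cut:]


rows = list(range(10))  # stand-in for ten observations
train, evaluation = split_rows(rows)
```

A sequential split like this one only gives a fair evaluation if the rows are not ordered by the target or by time, which is one reason shuffling matters.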

Shuffle your training data

When you use the Create ML Model wizard with the default settings, Amazon ML shuffles your data for you. You can also shuffle your data before importing it into Amazon ML.
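If you prefer to shuffle the data yourself before importing it, a minimal sketch for CSV input might look like the following. It assumes the file fits in memory and that the first line is a header row. The function name `shuffle_csv` and the sample data are hypothetical.

```python
import csv
import io
import random


def shuffle_csv(text, seed=None, has_header=True):
    """Return CSV text with its data rows in random order."""
    rows = list(csv.reader(io.StringIO(text)))
    # Keep the header (if any) in place and shuffle only the data rows.
    header, body = (rows[:1], rows[1:]) if has_header else ([], rows)
    random.Random(seed).shuffle(body)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(header + body)
    return out.getvalue()


sample = "id,label\n1,0\n2,1\n3,0\n4,1\n"
shuffled = shuffle_csv(sample, seed=42)
```

Passing a fixed `seed` makes the shuffle reproducible, which helps when you need to regenerate the same training file later.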

Process features

The process of putting together training data in an optimal format for learning and generalization is known as feature transformation. When you use the Create ML Model wizard with default settings, Amazon ML suggests feature processing settings for your data.

To specify feature processing settings, use the Create ML Model wizard's Custom option and provide a feature processing recipe.
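As a rough illustration, a recipe is a JSON document with `groups`, `assignments`, and `outputs` sections. The sketch below builds one in Python. The variable names `age` and `title` are hypothetical, and the transformations shown reflect my reading of the recipe format, so check them against the recipe reference before using them.

```python
import json

# Hypothetical schema: a numeric variable "age" and a text variable "title".
recipe = {
    "groups": {},
    "assignments": {
        # Bin the numeric variable into 10 quantile bins.
        "binned_age": "quantile_bin(age,10)",
    },
    "outputs": [
        "ALL_CATEGORICAL",
        "ALL_BINARY",
        "binned_age",
        # Lowercase the text, then emit n-grams of up to 2 words.
        "ngram(lowercase(title),2)",
    ],
}

recipe_json = json.dumps(recipe, indent=2)
```

The resulting `recipe_json` string is the kind of text you would paste into the wizard's Custom option.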

Train the model

When you create a model with the Create ML Model wizard, Amazon ML trains the model for you.

Select model parameters

In Amazon ML, you can tune four parameters that affect your model's predictive performance: model size, number of passes, type of shuffling, and regularization. You can set these parameters when you use the Create ML Model wizard to create an ML model and choose the Custom option.

Evaluate the model performance

Use the Create Evaluation wizard to assess your model's predictive performance.

Feature selection

The Amazon ML learning algorithm can drop features that don't contribute much to the learning process. To indicate that you want to drop those features, choose the L1 regularization parameter when you create the ML model.
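To see why L1 regularization tends to drop features, consider the soft-thresholding step that commonly appears in L1-regularized optimization: every weight is pulled toward zero by a fixed amount, and weights smaller than that amount become exactly zero. The sketch below is a generic illustration, not a description of Amazon ML's internal solver. The feature names, weights, and regularization amount are made up.

```python
def soft_threshold(weight, amount):
    """Shrink a weight toward zero by `amount`; small weights become 0.0."""
    if weight > amount:
        return weight - amount
    if weight < -amount:
        return weight + amount
    return 0.0


# Hypothetical learned weights for four features.
weights = {"age": 0.80, "zip_code": 0.03, "clicks": -0.45, "row_id": -0.01}
shrunk = {name: soft_threshold(w, 0.05) for name, w in weights.items()}

# Features whose weight is exactly zero no longer affect predictions.
kept = [name for name, w in shrunk.items() if w != 0.0]
```

Here the two weakly weighted features end up with a weight of zero, so only `age` and `clicks` remain in `kept`.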

Set a score threshold for prediction accuracy

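Review the model's predictive performance in the evaluation report at different score thresholds, and then set the score threshold based on your business application. For a binary classification model, the score threshold is the cut-off applied to each prediction score: observations that score higher than the threshold are predicted as 1, and observations that score lower are predicted as 0. Raising the threshold reduces false positives at the cost of more false negatives; lowering it does the opposite.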
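The effect of the score threshold on false positives and false negatives can be checked with a few lines of Python. The scores and labels below are made up, the function name `confusion_counts` is hypothetical, and treating a score exactly equal to the threshold as a negative prediction is an assumption of this sketch.

```python
def confusion_counts(scores, labels, threshold):
    """Count true/false positives and negatives at a given score threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, label in zip(scores, labels):
        predicted = 1 if score > threshold else 0
        if predicted == 1:
            counts["tp" if label == 1 else "fp"] += 1
        else:
            counts["tn" if label == 0 else "fn"] += 1
    return counts


# Hypothetical prediction scores and true labels for six observations.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

low = confusion_counts(scores, labels, 0.2)   # 2 false positives, 0 false negatives
mid = confusion_counts(scores, labels, 0.5)   # 1 false positive, 1 false negative
high = confusion_counts(scores, labels, 0.7)  # 0 false positives, 1 false negative
```

Which of these trade-offs is acceptable depends on the relative cost of each kind of error in your application.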

Use the model

Use your model to get predictions for a batch of observations by using the Create Batch Prediction wizard.

Alternatively, get predictions for individual observations on demand: enable the ML model for real-time predictions, and then call the Predict API.
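A real-time request through the Predict API might be wrapped as follows. This sketch assumes `client` is a boto3 Amazon ML client (for example, `boto3.client("machinelearning")`) and that a real-time endpoint already exists for the model. The helper name `predict_one`, the model ID, the endpoint URL, and the attribute names are all hypothetical. A small stand-in client is included so the example runs without AWS access.

```python
def predict_one(client, model_id, endpoint_url, observation):
    """Request a single real-time prediction and return the Prediction dict."""
    # The Predict API expects every attribute value as a string.
    record = {name: str(value) for name, value in observation.items()}
    response = client.predict(
        MLModelId=model_id,
        Record=record,
        PredictEndpoint=endpoint_url,
    )
    return response["Prediction"]


class FakeClient:
    """Stand-in for the real client so the sketch runs without AWS access."""

    def __init__(self):
        self.last_call = None

    def predict(self, **kwargs):
        self.last_call = kwargs
        return {"Prediction": {"predictedLabel": "1"}}  # made-up response


fake = FakeClient()
prediction = predict_one(
    fake,
    model_id="ml-EXAMPLE",                    # hypothetical model ID
    endpoint_url="https://example.invalid/",  # hypothetical endpoint URL
    observation={"age": 42, "plan": "basic"},
)
```

With a real client, replace `fake` with the boto3 client and use the model ID and endpoint URL shown for your ML model in the console.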