Amazon SageMaker
Developer Guide

Algorithms Provided by Amazon SageMaker: Common Parameters

The following table lists parameters for each of the algorithms provided by Amazon SageMaker.

| Algorithm Name | Channel Name | Training Image and Inference Image Registry Path | Training Input Mode | File Type | Instance Class |
|---|---|---|---|---|---|
| k-means | train and (optionally) test | <ecr_path>/kmeans:<tag> | File or Pipe | recordIO-protobuf or CSV | CPU or GPU (single GPU device on one or more instances) |
| PCA | train and (optionally) test | <ecr_path>/pca:<tag> | File or Pipe | recordIO-protobuf or CSV | GPU or CPU |
| LDA | train and (optionally) test | <ecr_path>/lda:<tag> | File or Pipe | recordIO-protobuf or CSV | CPU (single instance only) |
| Factorization Machines | train and (optionally) test | <ecr_path>/factorization-machines:<tag> | File or Pipe | recordIO-protobuf | CPU (GPU for dense data) |
| Linear Learner | train and (optionally) validation, test, or both | <ecr_path>/linear-learner:<tag> | File or Pipe | recordIO-protobuf or CSV | CPU or GPU |
| Neural Topic Model | train and (optionally) validation, test, or both | <ecr_path>/ntm:<tag> | File or Pipe | recordIO-protobuf or CSV | GPU or CPU |
| Random Cut Forest | train and (optionally) test | <ecr_path>/randomcutforest:<tag> | File or Pipe | recordIO-protobuf or CSV | CPU |
| Seq2Seq Modeling | train, validation, and vocab | <ecr_path>/seq2seq:<tag> | File | recordIO-protobuf | GPU (single instance only) |
| XGBoost | train and (optionally) validation | <ecr_path>/xgboost:<tag> | File | CSV or LibSVM | CPU |
| Object Detection | train and validation, (optionally) train_annotation and validation_annotation | <ecr_path>/object-detection:<tag> | File | recordIO or image files (.jpg or .png) | GPU |
| Image Classification | train and validation, (optionally) train_lst and validation_lst | <ecr_path>/image-classification:<tag> | File | recordIO or image files (.jpg or .png) | GPU |
| DeepAR Forecasting | train and (optionally) test | <ecr_path>/forecasting-deepar:<tag> | File | JSON Lines or Parquet | GPU or CPU |
| BlazingText | train | <ecr_path>/blazingtext:<tag> | File | Text file (one sentence per line with space-separated tokens) | GPU (single instance only) or CPU |
| k-nearest-neighbor (k-NN) | train and (optionally) test | <ecr_path>/knn:<tag> | File or Pipe | recordIO-protobuf or CSV | CPU or GPU (single GPU device on one or more instances) |
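
As a rough illustration of how the Training Input Mode column might be used in a script, the following Python sketch encodes that column and checks a requested mode before a training job is configured. The names `SUPPORTED_INPUT_MODES` and `check_input_mode` are hypothetical helpers, not part of any SageMaker API; the dictionary keys are the image names that follow <ecr_path>/ in the registry path column.

```python
# Hypothetical lookup derived from the "Training Input Mode" column of the table.
# Keys are the image names that follow <ecr_path>/ in the registry path.
SUPPORTED_INPUT_MODES = {
    "kmeans": {"File", "Pipe"},
    "pca": {"File", "Pipe"},
    "lda": {"File", "Pipe"},
    "factorization-machines": {"File", "Pipe"},
    "linear-learner": {"File", "Pipe"},
    "ntm": {"File", "Pipe"},
    "randomcutforest": {"File", "Pipe"},
    "knn": {"File", "Pipe"},
    "seq2seq": {"File"},
    "xgboost": {"File"},
    "object-detection": {"File"},
    "image-classification": {"File"},
    "forecasting-deepar": {"File"},
    "blazingtext": {"File"},
}


def check_input_mode(algorithm: str, mode: str) -> bool:
    """Return True if the table lists `mode` as supported for `algorithm`."""
    if algorithm not in SUPPORTED_INPUT_MODES:
        raise KeyError(f"unknown algorithm image name: {algorithm}")
    return mode in SUPPORTED_INPUT_MODES[algorithm]
```

For example, `check_input_mode("kmeans", "Pipe")` succeeds, while `check_input_mode("xgboost", "Pipe")` reports that only File mode is listed for XGBoost.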

For the Training Image and Inference Image Registry Path column, use the :1 version tag to ensure that you are using a stable version of the algorithm. You can reliably host a model trained using an image with the :1 tag on an inference image that has the :1 tag. Using the :latest tag in the registry path provides you with the most up-to-date version of the algorithm, but might cause problems with backward compatibility. Avoid using the :latest tag for production purposes.

For the Training Image and Inference Image Registry Path column, use one of the following values for <ecr_path>, depending on the algorithm and the AWS Region.

| Algorithm Name | AWS Region | Training Image and Inference Image Registry Path |
|---|---|---|
| k-means, PCA, Factorization Machines, Linear Learner, Neural Topic Model, k-nearest-neighbor, and Random Cut Forest | us-west-2 | 174872318107.dkr.ecr.us-west-2.amazonaws.com |
| | us-east-1 | 382416733822.dkr.ecr.us-east-1.amazonaws.com |
| | us-east-2 | 404615174143.dkr.ecr.us-east-2.amazonaws.com |
| | ap-northeast-1 | 351501993468.dkr.ecr.ap-northeast-1.amazonaws.com |
| | ap-northeast-2 | 835164637446.dkr.ecr.ap-northeast-2.amazonaws.com |
| | ap-southeast-2 | 712309505854.dkr.ecr.ap-southeast-2.amazonaws.com |
| | eu-central-1 | 664544806723.dkr.ecr.eu-central-1.amazonaws.com |
| | eu-west-1 | 438346466558.dkr.ecr.eu-west-1.amazonaws.com |
| LDA | us-west-2 | 266724342769.dkr.ecr.us-west-2.amazonaws.com |
| | us-east-1 | 766337827248.dkr.ecr.us-east-1.amazonaws.com |
| | us-east-2 | 999911452149.dkr.ecr.us-east-2.amazonaws.com |
| | ap-northeast-1 | 258307448986.dkr.ecr.ap-northeast-1.amazonaws.com |
| | ap-northeast-2 | 293181348795.dkr.ecr.ap-northeast-2.amazonaws.com |
| | ap-southeast-2 | 297031611018.dkr.ecr.ap-southeast-2.amazonaws.com |
| | eu-central-1 | 353608530281.dkr.ecr.eu-central-1.amazonaws.com |
| | eu-west-1 | 999678624901.dkr.ecr.eu-west-1.amazonaws.com |
| XGBoost, Image Classification, Seq2Seq, BlazingText, and Object Detection | us-west-2 | 433757028032.dkr.ecr.us-west-2.amazonaws.com |
| | us-east-1 | 811284229777.dkr.ecr.us-east-1.amazonaws.com |
| | us-east-2 | 825641698319.dkr.ecr.us-east-2.amazonaws.com |
| | ap-northeast-1 | 501404015308.dkr.ecr.ap-northeast-1.amazonaws.com |
| | ap-northeast-2 | 306986355934.dkr.ecr.ap-northeast-2.amazonaws.com |
| | ap-southeast-2 | 544295431143.dkr.ecr.ap-southeast-2.amazonaws.com |
| | eu-central-1 | 813361260812.dkr.ecr.eu-central-1.amazonaws.com |
| | eu-west-1 | 685385470294.dkr.ecr.eu-west-1.amazonaws.com |
| DeepAR Forecasting | us-west-2 | 156387875391.dkr.ecr.us-west-2.amazonaws.com |
| | us-east-1 | 522234722520.dkr.ecr.us-east-1.amazonaws.com |
| | us-east-2 | 566113047672.dkr.ecr.us-east-2.amazonaws.com |
| | ap-northeast-1 | 633353088612.dkr.ecr.ap-northeast-1.amazonaws.com |
| | ap-northeast-2 | 204372634319.dkr.ecr.ap-northeast-2.amazonaws.com |
| | ap-southeast-2 | 514117268639.dkr.ecr.ap-southeast-2.amazonaws.com |
| | eu-central-1 | 495149712605.dkr.ecr.eu-central-1.amazonaws.com |
| | eu-west-1 | 224300973850.dkr.ecr.eu-west-1.amazonaws.com |
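
To make the substitution of <ecr_path> and <tag> concrete, the following Python sketch assembles a full registry path from an algorithm image name, a Region, and a tag. It is only an illustration: `REGISTRY_HOSTS` and `image_uri` are hypothetical names, the mapping holds just a few entries copied from the table above, and the tag defaults to 1 in line with the recommendation to avoid :latest in production.

```python
# A small, hand-copied subset of the registry table above; extend as needed.
# Keys are (image name, AWS Region) pairs; values are the <ecr_path> hosts.
REGISTRY_HOSTS = {
    ("kmeans", "us-west-2"): "174872318107.dkr.ecr.us-west-2.amazonaws.com",
    ("kmeans", "us-east-1"): "382416733822.dkr.ecr.us-east-1.amazonaws.com",
    ("lda", "eu-west-1"): "999678624901.dkr.ecr.eu-west-1.amazonaws.com",
    ("xgboost", "us-west-2"): "433757028032.dkr.ecr.us-west-2.amazonaws.com",
    ("forecasting-deepar", "us-east-2"): "566113047672.dkr.ecr.us-east-2.amazonaws.com",
}


def image_uri(algorithm: str, region: str, tag: str = "1") -> str:
    """Build <ecr_path>/<algorithm>:<tag> for an entry in REGISTRY_HOSTS."""
    host = REGISTRY_HOSTS[(algorithm, region)]  # KeyError if not in the subset
    return f"{host}/{algorithm}:{tag}"


print(image_uri("kmeans", "us-west-2"))
```

Pinning the default to the :1 tag keeps training and inference images compatible; passing `tag="latest"` remains possible but is best kept to experimentation.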

Use the paths and training input mode as follows:

  • To create a training job (with a request to the CreateTrainingJob API), specify the Docker Registry path and the training input mode for the training image. You create a training job to train a model using a specific dataset.

  • To create a model (with a CreateModel request), specify the Docker Registry path for the inference image. Amazon SageMaker launches machine learning compute instances that are based on the endpoint configuration and deploys the model, which includes the artifacts (the result of model training).
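
The two requests described above can be sketched in Python as plain dictionaries shaped for the boto3 SageMaker client. This is a minimal sketch under stated assumptions, not a complete recipe: the helper names, the job and model names, the S3 locations, the role ARN, and the instance settings are all hypothetical placeholders, and only a single train channel is configured.

```python
def training_job_request(job_name, image, input_mode, role_arn,
                         train_s3_uri, output_s3_uri):
    """Build keyword arguments for a CreateTrainingJob request."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image,           # registry path of the training image
            "TrainingInputMode": input_mode,  # "File" or "Pipe", per the table
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Placeholder resource settings; choose values that suit the algorithm.
        "ResourceConfig": {"InstanceType": "ml.m4.xlarge",
                           "InstanceCount": 1,
                           "VolumeSizeInGB": 10},
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def model_request(model_name, image, model_data_url, role_arn):
    """Build keyword arguments for a CreateModel request."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image,                  # registry path of the inference image
            "ModelDataUrl": model_data_url,  # artifacts produced by training
        },
        "ExecutionRoleArn": role_arn,
    }


# With AWS credentials configured, the requests could be sent like this:
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_training_job(**training_job_request(...))
#   sm.create_model(**model_request(...))
```

Keeping the request construction separate from the client call makes it easy to inspect the registry path and input mode before anything is submitted.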