Amazon SageMaker
Developer Guide

Create an Algorithm Resource

To create an algorithm resource that you can use to run training jobs in Amazon SageMaker and publish on AWS Marketplace, specify the following information:

  • The Docker containers that contain the training and, optionally, inference code.

  • The configuration of the input data that your algorithm expects for training.

  • The hyperparameters that your algorithm supports.

  • Metrics that your algorithm sends to Amazon CloudWatch during training jobs.

  • The instance types that your algorithm supports for training and inference, and whether it supports distributed training across multiple instances.

  • Validation profiles, which are training jobs that Amazon SageMaker uses to test your algorithm's training code and batch transform jobs that Amazon SageMaker runs to test your algorithm's inference code.

    To ensure that buyers and sellers can be confident that products work in Amazon SageMaker, we require that you validate your algorithms before listing them on AWS Marketplace. You can list products in the AWS Marketplace only if validation succeeds. To validate your algorithms, Amazon SageMaker uses your validation profile and sample data to run the following validation tasks:

    1. Create a training job in your account to verify that your training image works with Amazon SageMaker.

    2. If you included inference code in your algorithm, create a model in your account using the algorithm's inference image and the model artifacts produced by the training job.

    3. If you included inference code in your algorithm, create a transform job in your account using the model to verify that your inference image works with Amazon SageMaker.

    When you list your product on AWS Marketplace, the inputs and outputs of this validation process persist as part of your product and are made available to your buyers. This helps buyers understand and evaluate the product before they buy it. For example, buyers can inspect the input data that you used, the outputs generated, and the logs and metrics emitted by your code. The more comprehensive your validation specification, the easier it is for customers to evaluate your product.

    Note

    In your validation profile, provide only data that you want to expose publicly.

    Validation can take up to a few hours. To see the status of the jobs in your account, open the Training jobs and Transform jobs pages in the Amazon SageMaker console. If validation fails, you can access the scan and validation reports from the Amazon SageMaker console. If any issues are found, you must create the algorithm again.

    Note

    To publish your algorithm on AWS Marketplace, at least one validation profile is required.
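
Besides the console pages mentioned above, the validation status can also be read programmatically. The following is a minimal sketch, assuming the response of the DescribeAlgorithm API carries an AlgorithmStatus field and an AlgorithmStatusDetails object with ImageScanStatuses and ValidationStatuses lists. The helper name summarize_validation and every value in sample_description are hypothetical placeholders, not real output.

```python
def summarize_validation(description):
    """Turn a DescribeAlgorithm-style response dict into readable status lines."""
    details = description.get("AlgorithmStatusDetails", {})
    lines = ["Algorithm status: " + description.get("AlgorithmStatus", "Unknown")]
    for group in ("ImageScanStatuses", "ValidationStatuses"):
        for item in details.get(group, []):
            reason = item.get("FailureReason")
            suffix = " (" + reason + ")" if reason else ""
            lines.append(group + ": " + item["Name"] + " -> " + item["Status"] + suffix)
    return lines


# Hypothetical response, hand-written for illustration only.
sample_description = {
    "AlgorithmStatus": "Failed",
    "AlgorithmStatusDetails": {
        "ImageScanStatuses": [
            {"Name": "example-scan", "Status": "Completed"},
        ],
        "ValidationStatuses": [
            {
                "Name": "example-training-job",
                "Status": "Failed",
                "FailureReason": "hypothetical failure reason",
            },
        ],
    },
}

for line in summarize_validation(sample_description):
    print(line)
```

In practice the dictionary would come from a call such as boto3.client("sagemaker").describe_algorithm(AlgorithmName=...), which requires AWS credentials and is therefore left out of the sketch.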

You can create an algorithm by using either the Amazon SageMaker console or the Amazon SageMaker API.

Create an Algorithm Resource (Console)

To create an algorithm resource (console)

  1. Open the Amazon SageMaker console at https://console.aws.amazon.com/sagemaker/.

  2. Choose Algorithms, then choose Create algorithm.

  3. On the Training specifications page, provide the following information:

    1. For Algorithm name, type a name for your algorithm. The algorithm name must be unique in your account and in the AWS Region. The name must have 1 to 64 characters. Valid characters are a-z, A-Z, 0-9, and - (hyphen).

    2. Type a description for your algorithm. This description appears in the Amazon SageMaker console and in the AWS Marketplace.

    3. For Training image, type the path in Amazon ECR where your training container is stored.

    4. For Support distributed training, choose Yes if your algorithm supports training on multiple instances. Otherwise, choose No.

    5. For Support instance types for training, choose the instance types that your algorithm supports.

    6. For Channel specification, specify up to 8 channels of input data for your algorithm. For example, you might specify 3 input channels named train, validation, and test. For each channel, specify the following information:

      1. For Channel name, type a name for the channel. The name must have 1 to 64 characters. Valid characters are a-z, A-Z, 0-9, and - (hyphen).

      2. To require the channel for your algorithm, choose Channel required.

      3. Type a description for the channel.

      4. For Supported input modes, choose Pipe mode if your algorithm supports streaming the input data, and File mode if your algorithm supports downloading the input data as a file. You can choose both.

      5. For Supported content types, type the MIME type that your algorithm expects for input data.

      6. For Supported compression type, choose Gzip if your algorithm supports Gzip compression. Otherwise, choose None.

      7. Choose Add channel to add another data input channel, or choose Next if you are done adding channels.

  4. On the Tuning specifications page, provide the following information:

    1. For Hyperparameter specification, specify the hyperparameters that your algorithm supports by editing the JSON object. For each hyperparameter that your algorithm supports, construct a JSON block similar to the following:

      {
        "DefaultValue": "5",
        "Description": "The first hyperparameter",
        "IsRequired": true,
        "IsTunable": false,
        "Name": "intRange",
        "Range": {
          "IntegerParameterRangeSpecification": {
            "MaxValue": "10",
            "MinValue": "1"
          }
        },
        "Type": "Integer"
      }

      In the JSON, supply the following:

      1. For DefaultValue, specify a default value for the hyperparameter, if there is one.

      2. For Description, specify a description for the hyperparameter.

      3. For IsRequired, specify whether the hyperparameter is required.

      4. For IsTunable, specify true if this hyperparameter can be tuned when a user runs a hyperparameter tuning job that uses this algorithm. For information, see Automatic Model Tuning.

      5. For Name, specify a name for the hyperparameter.

      6. For Range, specify one of the following:

        • IntegerParameterRangeSpecification - the values of the hyperparameter are integers. Specify minimum and maximum values for the hyperparameter.

        • ContinuousParameterRangeSpecification - the values of the hyperparameter are floating-point values. Specify minimum and maximum values for the hyperparameter.

        • CategoricalParameterRangeSpecification - the values of the hyperparameter are categorical values. Specify a list of all of the possible values.

      7. For Type, specify Integer, Continuous, or Categorical. The value must correspond to the type of Range that you specified.

    2. For Metric definitions, specify any training metrics that you want your algorithm to emit. Amazon SageMaker uses the regular expression that you specify to find the metrics by parsing the logs from your training container during training. Users can view these metrics when they run training jobs with your algorithm, and they can monitor and plot the metrics in Amazon CloudWatch. For information, see Monitor and Analyze Training Jobs Using Metrics. For each metric, provide the following information:

      1. For Metric name, type a name for the metric.

      2. For Regex, type the regular expression that Amazon SageMaker uses to parse training logs so that it can find the metric value.

      3. For Objective metric support, choose Yes if this metric can be used as the objective metric for a hyperparameter tuning job. For information, see Automatic Model Tuning.

      4. Choose Add metric to add another metric, or choose Next if you are done adding metrics.

  5. On the Inference specifications page, provide the following information if your algorithm supports inference:

    1. For Container definition, type the path in Amazon ECR where your inference container is stored.

    2. For Container DNS host name, type the name of a DNS host for your image.

    3. For Supported instance types for real-time inference, choose the instance types that your algorithm supports for models deployed as hosted endpoints in Amazon SageMaker. For information, see Deploy a Model on Amazon SageMaker Hosting Services.

    4. For Supported instance types for batch transform jobs, choose the instance types that your algorithm supports for batch transform jobs. For information, see Get Inferences for an Entire Dataset with Batch Transform.

    5. For Supported content types, type the type of input data that your algorithm expects for inference requests.

    6. For Supported response MIME types, type the MIME types that your algorithm supports for inference responses.

    7. Choose Next.

  6. On the Validation specifications page, provide the following information:

    1. For Publish this algorithm on AWS Marketplace, choose Yes to publish the algorithm on AWS Marketplace.

    2. For Validate this algorithm, choose Yes if you want Amazon SageMaker to run training jobs and/or batch transform jobs that you specify to test the training and/or inference code of your algorithm.

      Note

      To publish your algorithm on AWS Marketplace, your algorithm must be validated.

    3. For IAM role, choose an IAM role that has the required permissions to run training jobs and batch transform jobs in Amazon SageMaker, or choose Create a new role to allow Amazon SageMaker to create a role that has the AmazonSageMakerFullAccess managed policy attached. For information, see Amazon SageMaker Roles.

    4. For Validation profile, specify the following:

      • A name for the validation profile.

      • A Training job definition. This is a JSON block that describes a training job. This is in the same format as the TrainingJobDefinition input parameter of the CreateAlgorithm API.

      • A Transform job definition. This is a JSON block that describes a batch transform job. This is in the same format as the TransformJobDefinition input parameter of the CreateAlgorithm API.

    5. Choose Create algorithm.
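
Several of the rules in the procedure above can be checked locally before you enter values in the console. The following is a minimal sketch, not part of any AWS SDK: all function names are hypothetical, and the checks mirror only the rules stated on this page (the service itself may enforce stricter ones). It also assumes that the metric value is taken from the first capture group of the regular expression.

```python
import re

# Naming rule as stated on this page: 1 to 64 characters from a-z, A-Z, 0-9, and hyphen.
NAME_RULE = re.compile(r"[a-zA-Z0-9-]{1,64}")
MAX_CHANNELS = 8

# The Range key that corresponds to each hyperparameter Type.
RANGE_KEY_FOR_TYPE = {
    "Integer": "IntegerParameterRangeSpecification",
    "Continuous": "ContinuousParameterRangeSpecification",
    "Categorical": "CategoricalParameterRangeSpecification",
}


def is_valid_name(name):
    """Check an algorithm or channel name against the stated naming rule."""
    return NAME_RULE.fullmatch(name) is not None


def check_channels(channels):
    """Return a list of problems found in a list of channel dicts."""
    problems = []
    if len(channels) > MAX_CHANNELS:
        problems.append(
            "%d channels specified; the limit is %d" % (len(channels), MAX_CHANNELS)
        )
    for channel in channels:
        if not is_valid_name(channel["Name"]):
            problems.append("invalid channel name: %r" % channel["Name"])
    return problems


def check_hyperparameter(spec):
    """Check that Type is known and, if Range is present, that it matches Type."""
    problems = []
    expected_key = RANGE_KEY_FOR_TYPE.get(spec.get("Type"))
    if expected_key is None:
        problems.append("unknown Type: %r" % spec.get("Type"))
    elif "Range" in spec and expected_key not in spec["Range"]:
        problems.append("Type %s requires Range key %s" % (spec["Type"], expected_key))
    return problems


def extract_metric(regex, log_line):
    """Return the first capture group as a float, or None if the line has no match."""
    match = re.search(regex, log_line)
    return float(match.group(1)) if match else None


# Example: the third channel name is rejected because it contains an underscore.
print(check_channels([{"Name": "train"}, {"Name": "validation"}, {"Name": "test_set"}]))

# Example: a hypothetical log line and a regex that captures the metric value.
print(extract_metric(r"validation-accuracy=([0-9.]+)", "epoch 3 validation-accuracy=0.91"))
```

Testing a metric regex against a few real lines from your training logs before creating the algorithm helps avoid a failed validation run, because a created algorithm cannot be edited and must be created again.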

Create an Algorithm Resource (API)

To create an algorithm resource by using the Amazon SageMaker API, call the CreateAlgorithm API.
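
As an illustration, the following sketch builds a CreateAlgorithm request with the AWS SDK for Python (Boto3). It assumes the request field names of the CreateAlgorithm API (AlgorithmName, TrainingSpecification, and so on). The helper names, the image path, the instance type, the channel, and the metric are hypothetical placeholders. The inference and validation specifications are omitted for brevity, so an algorithm created this way is not validated and cannot be published on AWS Marketplace.

```python
def build_create_algorithm_request(name, training_image):
    """Assemble keyword arguments for CreateAlgorithm (training specification only)."""
    return {
        "AlgorithmName": name,
        "AlgorithmDescription": "Example algorithm (placeholder description)",
        "TrainingSpecification": {
            "TrainingImage": training_image,
            "SupportsDistributedTraining": False,
            "SupportedTrainingInstanceTypes": ["ml.m5.large"],  # placeholder choice
            "TrainingChannels": [
                {
                    "Name": "train",
                    "Description": "Training data",
                    "IsRequired": True,
                    "SupportedContentTypes": ["text/csv"],
                    "SupportedCompressionTypes": ["None"],
                    "SupportedInputModes": ["File"],
                }
            ],
            # Same hyperparameter as the JSON example in the console procedure.
            "SupportedHyperParameters": [
                {
                    "Name": "intRange",
                    "Description": "The first hyperparameter",
                    "Type": "Integer",
                    "Range": {
                        "IntegerParameterRangeSpecification": {
                            "MinValue": "1",
                            "MaxValue": "10",
                        }
                    },
                    "IsTunable": False,
                    "IsRequired": True,
                    "DefaultValue": "5",
                }
            ],
            "MetricDefinitions": [
                {
                    "Name": "validation-accuracy",  # hypothetical metric
                    "Regex": "validation-accuracy=([0-9.]+)",
                }
            ],
        },
        "CertifyForMarketplace": False,
    }


def create_algorithm(request):
    """Send the request. Requires boto3, AWS credentials, and a configured Region."""
    import boto3  # imported here so the request can be built without boto3 installed

    client = boto3.client("sagemaker")
    return client.create_algorithm(**request)


request = build_create_algorithm_request(
    "example-algorithm",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-algorithm:latest",  # placeholder
)
print(sorted(request))
```

Building the request separately from sending it lets you inspect or test the structure without making an API call. To publish on AWS Marketplace, you would also supply a ValidationSpecification with an IAM role and at least one validation profile.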