Amazon SageMaker
Developer Guide

Linear Learner Hyperparameters

Parameter Name Description
feature_dim

Number of features in the input data.

Required

Valid values: positive integer

num_classes

The number of classes for the response variable. The classes are assumed to be labeled 0, ..., num_classes - 1.

Required when predictor_type is multiclass_classifier; otherwise ignored.

Valid values: integers from 3 to 1,000,000

predictor_type

Specifies the type of target variable as a binary classification, multiclass classification, or regression.

Required

Valid values: binary_classifier, multiclass_classifier, or regressor

accuracy_top_k

The value of k when computing the Top K Accuracy metric for multiclass classification. An example is scored as correct if the model assigns one of the top k scores to the true label.

Optional

Valid values: positive integer

Default value: 3
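
The scoring rule described for accuracy_top_k can be illustrated with a minimal sketch. The function name and the toy scores below are hypothetical and are not part of the algorithm's interface; the sketch only shows how an example counts as correct when its true label is among the k highest scores.

```python
def top_k_accuracy(scores, labels, k=3):
    """Fraction of examples whose true label is among the k highest scores."""
    correct = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            correct += 1
    return correct / len(labels)


# Two toy examples over four classes (labels are 0, ..., num_classes - 1).
scores = [
    [0.10, 0.50, 0.30, 0.05],  # true label 2 has the second-highest score
    [0.60, 0.25, 0.10, 0.05],  # true label 3 has the lowest score
]
labels = [2, 3]

print(top_k_accuracy(scores, labels, k=2))
```

With k=2 only the first example is counted as correct; raising k to 4 makes every example correct, which is why the metric is only informative for k smaller than num_classes.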

balance_multiclass_weights

Specifies whether to use class weights which give each class equal importance in the loss function. Only used when predictor_type is multiclass_classifier.

Optional

Valid values: true, false

Default value: false

beta_1

Exponential decay rate for first moment estimates. Applies only when the optimizer value is adam.

Optional

Valid values: auto or float between 0 and 1.0

Default value: auto

beta_2

Exponential decay rate for second moment estimates. Applies only when the optimizer value is adam.

Optional

Valid values: auto or float between 0 and 1.0

Default value: auto

bias_lr_mult

Allows a different learning rate for the bias term. The actual learning rate for the bias is learning_rate * bias_lr_mult.

Optional

Valid values: auto or positive float

Default value: auto

bias_wd_mult

Allows different regularization for the bias term. The actual L2 regularization weight for the bias is wd * bias_wd_mult. By default there is no regularization on the bias term.

Optional

Valid values: auto or non-negative float

Default value: auto

binary_classifier_model_selection_criteria

Selects the model evaluation criteria for the validation dataset (or for the training dataset if a validation dataset is not present) when predictor_type is set to binary_classifier. The following criteria are available:

  • accuracy: model with highest accuracy.

  • f_beta: model with highest F-beta score, where beta is set by the f_beta hyperparameter. The default is F1.

  • precision_at_target_recall: model with highest precision at a given recall target.

  • recall_at_target_precision: model with highest recall at a given precision target.

  • loss_function: model with lowest value of the loss function used in training.

Optional

Valid values: accuracy, f_beta, precision_at_target_recall, recall_at_target_precision, or loss_function

Default value: accuracy

early_stopping_patience

The number of epochs to wait before ending training if no improvement is made in the relevant metric. The metric is the binary_classifier_model_selection_criteria if provided, otherwise the metric is the same as loss. The metric is evaluated on the validation data. If no validation data is provided, the metric is always the same as loss and is evaluated on the training data. To disable early stopping, set early_stopping_patience to a value larger than epochs.

Optional

Valid values: positive integer

Default value: 3

early_stopping_tolerance

Relative tolerance to measure an improvement in loss. If the ratio of the improvement in loss divided by the previous best loss is smaller than this value, early stopping considers the improvement to be zero.

Optional

Valid values: positive float

Default value: 0.001
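
The interaction between early_stopping_patience and early_stopping_tolerance can be sketched as follows. This is an illustration of the rule as described here, not the algorithm's actual implementation: the function name is hypothetical, the monitored metric is assumed to be a positive loss where lower is better, and the exact bookkeeping inside the service is an assumption.

```python
def early_stopping_epoch(losses, patience=3, tolerance=0.001):
    """Return the 1-based epoch at which training would stop, or None."""
    best = losses[0]
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses[1:], start=2):
        # Improvement relative to the previous best loss.
        relative_improvement = (best - loss) / best
        if relative_improvement < tolerance:
            # Improvements below the tolerance are treated as zero.
            epochs_without_improvement += 1
        else:
            best = loss
            epochs_without_improvement = 0
        if epochs_without_improvement >= patience:
            return epoch
    return None


# Loss stalls after epoch 2, so training stops after 3 stalled epochs.
print(early_stopping_epoch([1.0, 0.8, 0.7999, 0.7998, 0.7997, 0.5]))
```

Under these assumptions the later drop to 0.5 is never reached, which shows the trade-off in choosing a small patience value.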

epochs

Maximum number of passes over the training data.

Optional

Valid values: positive integer

Default value: 15

f_beta

The value of beta to use when calculating F score metrics for binary or multiclass classification. Also used if binary_classifier_model_selection_criteria is f_beta.

Optional

Valid values: positive float

Default value: 1.0
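
For reference, the standard F-beta score combines precision and recall as shown in the sketch below. The function name is hypothetical; the formula is the usual textbook definition, in which beta greater than 1 weights recall more heavily and beta less than 1 weights precision more heavily.

```python
def f_beta_score(precision, recall, beta=1.0):
    """Standard F-beta score; beta=1.0 gives the F1 score."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both are zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


print(f_beta_score(0.5, 1.0, beta=1.0))  # F1
print(f_beta_score(0.5, 1.0, beta=2.0))  # F2, favors recall
```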

huber_delta

Parameter for Huber loss. During training and metric evaluation, compute L2 loss for errors smaller than delta and L1 loss for errors larger than delta.

Optional

Valid values: positive float

Default value: 1.0
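
A minimal sketch of the Huber loss in its common textbook form, assuming the quadratic branch is scaled by 1/2 and the linear branch is offset so the two pieces meet at delta. The exact scaling used by the algorithm is not stated here, so treat the constants as an assumption.

```python
def huber_loss(error, huber_delta=1.0):
    """Quadratic (L2-like) for small errors, linear (L1-like) for large ones."""
    abs_error = abs(error)
    if abs_error <= huber_delta:
        return 0.5 * error * error
    return huber_delta * (abs_error - 0.5 * huber_delta)


print(huber_loss(0.5))  # quadratic region
print(huber_loss(3.0))  # linear region
```

A larger huber_delta makes the loss behave like squared loss over a wider range of errors; a smaller value makes it more robust to outliers.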

init_bias

Initial weight for bias term.

Optional

Valid values: float

Default value: 0

init_method

Sets the initial distribution function used for model weights.

  • uniform: uniform distribution over the interval (-init_scale, +init_scale)

  • normal: normal distribution with mean 0 and standard deviation init_sigma

Optional

Valid values: uniform or normal

Default value: uniform

init_scale

Scales an initial uniform distribution for model weights. Only applies when init_method is set to uniform.

Optional

Valid values: positive float

Default value: 0.07

init_sigma

The initial standard deviation for the normal distribution. Applies only when init_method is set to normal.

Optional

Valid values: positive float

Default value: 0.01
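
The relationship among init_method, init_scale, and init_sigma can be sketched with the Python standard library. The helper name and the seed argument are hypothetical and only illustrate which parameter applies to which distribution.

```python
import random


def init_weights(feature_dim, init_method="uniform",
                 init_scale=0.07, init_sigma=0.01, seed=0):
    """Draw initial weights for feature_dim features (illustrative only)."""
    rng = random.Random(seed)
    if init_method == "uniform":
        # init_scale bounds the interval; init_sigma is unused.
        return [rng.uniform(-init_scale, init_scale) for _ in range(feature_dim)]
    if init_method == "normal":
        # init_sigma is the standard deviation; init_scale is unused.
        return [rng.gauss(0.0, init_sigma) for _ in range(feature_dim)]
    raise ValueError("init_method must be 'uniform' or 'normal'")


print(init_weights(5, init_method="uniform"))
print(init_weights(5, init_method="normal"))
```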

l1

The L1 regularization parameter. Set the value to 0 if you do not want to use L1 regularization.

Optional

Valid values: auto or non-negative float

Default value: auto

learning_rate

Step size used by the optimizer for parameter updates.

Optional

Valid values: auto or positive float

Default value: auto, whose value depends on the optimizer chosen.

loss

Specifies the loss function to use.

The loss functions available and their default values depend on the value of predictor_type:

  • If predictor_type is regressor, the available options are auto, squared_loss, absolute_loss, eps_insensitive_squared_loss, eps_insensitive_absolute_loss, quantile_loss, and huber_loss. The default value for auto is squared_loss.

  • If predictor_type is binary_classifier, the available options are auto, logistic, and hinge_loss. The default value for auto is logistic.

  • If predictor_type is multiclass_classifier, the available options are auto and softmax_loss. The default value for auto is softmax_loss.

Valid values: auto, logistic, squared_loss, absolute_loss, hinge_loss, eps_insensitive_squared_loss, eps_insensitive_absolute_loss, quantile_loss, huber_loss, or softmax_loss

Optional

Default value: auto
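
The dependency between loss and predictor_type can be captured in a small lookup table. The sketch below is a hypothetical client-side check, not part of any SDK; it encodes only the options and auto defaults listed in this entry.

```python
# Losses accepted for each predictor_type, as listed in the loss entry.
LOSSES_BY_PREDICTOR_TYPE = {
    "regressor": {
        "squared_loss", "absolute_loss", "eps_insensitive_squared_loss",
        "eps_insensitive_absolute_loss", "quantile_loss", "huber_loss",
    },
    "binary_classifier": {"logistic", "hinge_loss"},
    "multiclass_classifier": {"softmax_loss"},
}

# What "auto" resolves to for each predictor_type.
AUTO_LOSS = {
    "regressor": "squared_loss",
    "binary_classifier": "logistic",
    "multiclass_classifier": "softmax_loss",
}


def resolve_loss(predictor_type, loss="auto"):
    """Return the effective loss, or raise ValueError for an invalid pairing."""
    if predictor_type not in LOSSES_BY_PREDICTOR_TYPE:
        raise ValueError(f"unknown predictor_type: {predictor_type}")
    if loss == "auto":
        return AUTO_LOSS[predictor_type]
    if loss not in LOSSES_BY_PREDICTOR_TYPE[predictor_type]:
        raise ValueError(f"{loss} is not available for {predictor_type}")
    return loss


print(resolve_loss("regressor"))
print(resolve_loss("binary_classifier", "hinge_loss"))
```

Running such a check before submitting a training job can surface an invalid pairing, such as hinge_loss with multiclass_classifier, earlier than a failed job would.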

loss_insensitivity

Parameter for epsilon insensitive loss type. During training and metric evaluation, any error smaller than this is considered to be zero.

Optional

Valid values: positive float

Default value: 0.01
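
A minimal sketch of the two epsilon-insensitive losses in their usual form, where errors inside the insensitivity band contribute nothing. The function names are hypothetical, and the exact scaling used by the algorithm is an assumption.

```python
def eps_insensitive_absolute(error, loss_insensitivity=0.01):
    """Absolute error beyond the insensitivity band; zero inside it."""
    return max(0.0, abs(error) - loss_insensitivity)


def eps_insensitive_squared(error, loss_insensitivity=0.01):
    """Square of the absolute error beyond the insensitivity band."""
    return eps_insensitive_absolute(error, loss_insensitivity) ** 2


print(eps_insensitive_absolute(0.005))  # inside the band
print(eps_insensitive_absolute(0.51))   # outside the band
print(eps_insensitive_squared(0.51))
```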

lr_scheduler_factor

For every lr_scheduler_step, the learning rate decreases by this quantity. Applies only when the use_lr_scheduler is set to true.

Optional

Valid values: auto or positive float between 0 and 1

Default value: auto

lr_scheduler_minimum_lr

The learning rate never decreases to a value lower than lr_scheduler_minimum_lr. Applies only when the use_lr_scheduler is set to true.

Optional

Valid values: auto or positive float

Default value: auto

lr_scheduler_step

The number of steps between decreases of the learning rate. Applies only when the use_lr_scheduler is set to true.

Optional

Valid values: auto or positive integer

Default value: auto
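
One plausible reading of the three scheduler hyperparameters is a step decay, sketched below. This is an assumption rather than a description of the actual implementation: it treats lr_scheduler_factor as a multiplier applied once every lr_scheduler_step steps, with lr_scheduler_minimum_lr as a floor. The function name is hypothetical.

```python
def scheduled_learning_rate(step, learning_rate, lr_scheduler_step,
                            lr_scheduler_factor, lr_scheduler_minimum_lr):
    """Step-decay schedule (assumed form): multiply by the factor every
    lr_scheduler_step steps, never dropping below the minimum."""
    num_decays = step // lr_scheduler_step
    decayed = learning_rate * (lr_scheduler_factor ** num_decays)
    return max(decayed, lr_scheduler_minimum_lr)


for step in (0, 100, 250, 1000):
    print(step, scheduled_learning_rate(step, 0.1, 100, 0.5, 0.01))
```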

margin

Margin for hinge_loss.

Optional

Valid values: positive float

Default value: 1.0
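
The role of margin in hinge loss can be shown with the standard formulation, assuming labels are encoded as -1 and +1 and that score is the raw model output. Both the encoding and the function name are assumptions made for this sketch.

```python
def hinge_loss(score, label, margin=1.0):
    """Standard hinge loss; label must be -1 or +1."""
    return max(0.0, margin - label * score)


print(hinge_loss(2.0, +1))   # correct and beyond the margin: no loss
print(hinge_loss(0.25, +1))  # correct but inside the margin
print(hinge_loss(0.5, -1))   # wrong side of the decision boundary
```

A larger margin penalizes correct predictions that are not confident enough, which pushes scores further from the decision boundary.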

mini_batch_size

Number of observations per mini batch for the data iterator.

Optional

Valid values: positive integer

Default value: 1000

momentum

Momentum parameter of the sgd optimizer.

Optional

Valid values: auto or float between 0 and 1.0

Default value: auto

normalize_data

Normalizes the features before training to have a std_dev of 1.

Optional

Valid values: auto, true, or false

Default value: true

normalize_label

Normalizes the label. By default (auto), the label is normalized for regression and is not normalized for classification. If normalize_label is set to true for classification, the setting is ignored.

Optional

Valid values: auto, true, or false

Default value: auto

num_calibration_samples

Number of observations to use from the validation dataset for model calibration (finding the best threshold).

Optional

Valid values: auto or positive integer

Default value: auto

num_models

Number of models to train in parallel. For the default auto, the algorithm decides the number of parallel models to train. One model is trained according to the given training parameters (regularization, optimizer, loss), and the remaining models are trained with nearby parameter values.

Optional

Valid values: auto or positive integer

Default value: auto

num_point_for_scaler

Number of data points to use for calculating normalization or unbiasing of terms.

Optional

Valid values: positive integer

Default value: 10,000

optimizer

The optimization algorithm to use.

Optional

Valid values:

  • auto: The default value.

  • sgd: Stochastic gradient descent

  • adam: Adaptive moment estimation

  • rmsprop: A gradient-based optimization technique due to Geoffrey Hinton that uses a moving average of squared gradients to normalize the gradient.

Default value: auto. The default setting for auto is adam.

positive_example_weight_mult

Weight assigned to positive examples when training a binary classifier. The weight of negative examples is fixed at 1. If balanced, then a weight is selected so that errors in classifying negative vs. positive examples have equal impact on the training loss. If auto, the algorithm attempts to select the weight that optimizes performance.

Optional

Valid values: balanced, auto, or a positive float

Default value: 1.0
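
One simple way to obtain the effect described for balanced is to weight positives by the ratio of negative to positive examples, so that both classes contribute the same total weight to the loss. This is an assumption about how such a weight could be chosen, not a statement of what the algorithm computes; the function name is hypothetical.

```python
def balanced_positive_weight(labels):
    """Weight for positive examples so both classes carry equal total weight,
    given that negative examples have weight 1. Labels must be 0 or 1."""
    num_positive = sum(1 for y in labels if y == 1)
    num_negative = sum(1 for y in labels if y == 0)
    if num_positive == 0:
        raise ValueError("at least one positive example is required")
    return num_negative / num_positive


# Three negatives for every positive: positives get weight 3.0.
print(balanced_positive_weight([0, 0, 0, 1, 0, 0, 0, 1]))
```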

quantile

Quantile for quantile loss. For quantile q, the model attempts to produce predictions such that true_label < prediction with probability q.

Optional

Valid values: float between 0 and 1

Default value: 0.5
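
The quantile (pinball) loss in its standard form is sketched below. It penalizes under-prediction with weight q and over-prediction with weight 1 - q, which is what drives predictions toward the q-th quantile. The function name is hypothetical, and the exact scaling used by the algorithm is an assumption.

```python
def quantile_loss(true_label, prediction, quantile=0.5):
    """Standard pinball loss for a single example."""
    diff = true_label - prediction
    if diff >= 0:
        return quantile * diff        # prediction too low
    return (quantile - 1.0) * diff    # prediction too high


# With quantile=0.9, under-predicting costs nine times as much
# as over-predicting by the same amount.
print(quantile_loss(10.0, 8.0, quantile=0.9))
print(quantile_loss(8.0, 10.0, quantile=0.9))
```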

target_precision

Target precision. If binary_classifier_model_selection_criteria is recall_at_target_precision, then precision is held at this value while recall is maximized.

Optional

Valid values: float between 0 and 1.0

Default value: 0.8

target_recall

Target recall. If binary_classifier_model_selection_criteria is precision_at_target_recall, then recall is held at this value while precision is maximized.

Optional

Valid values: float between 0 and 1.0

Default value: 0.8
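
The recall_at_target_precision criterion can be illustrated by sweeping the decision threshold over the observed scores. The sketch below assumes that any threshold whose precision is at least target_precision qualifies, which may differ from how the algorithm holds precision at the target; the function name and data are hypothetical. The precision_at_target_recall criterion is symmetric, with the roles of precision and recall swapped.

```python
def recall_at_target_precision(scores, labels, target_precision=0.8):
    """Highest recall over thresholds whose precision meets the target.
    Labels must be 0 or 1 and include at least one positive."""
    total_positives = sum(labels)
    best_recall = 0.0
    for threshold in sorted(set(scores)):
        # Predict positive when the score reaches the threshold.
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        precision = tp / (tp + fp)
        recall = tp / total_positives
        if precision >= target_precision:
            best_recall = max(best_recall, recall)
    return best_recall


scores = [0.9, 0.8, 0.7, 0.6, 0.4]
labels = [1, 1, 0, 1, 0]
print(recall_at_target_precision(scores, labels, target_precision=0.8))
print(recall_at_target_precision(scores, labels, target_precision=0.75))
```

Lowering the precision target in this toy data admits a lower threshold and therefore a higher recall, which is the trade-off these two hyperparameters control.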

unbias_data

Unbiases the features before training so the mean is 0. By default data is unbiased if use_bias is set to true.

Optional

Valid values: auto, true, or false

Default value: auto
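
The combined effect of unbias_data and normalize_data on a single feature column can be sketched as follows. The helper is hypothetical, and it assumes the population standard deviation computed over all values; the algorithm instead estimates these statistics from a sample of num_point_for_scaler points.

```python
import statistics


def scale_feature(values, unbias_data=True, normalize_data=True):
    """Shift a feature to mean 0 and/or scale it to standard deviation 1."""
    mean = statistics.fmean(values) if unbias_data else 0.0
    std_dev = statistics.pstdev(values) if normalize_data else 1.0
    if std_dev == 0:
        std_dev = 1.0  # constant feature: leave the scale unchanged
    return [(v - mean) / std_dev for v in values]


print(scale_feature([2.0, 4.0, 6.0]))
print(scale_feature([2.0, 4.0, 6.0], normalize_data=False))
```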

unbias_label

Unbiases labels before training so the mean is 0. Applies only to regression, and only if use_bias is set to true.

Optional

Valid values: auto, true, or false

Default value: auto

use_bias

Specifies whether the model should include a bias term, which is the intercept term in the linear equation.

Optional

Valid values: true or false

Default value: true

use_lr_scheduler

If true, uses a scheduler for the learning rate.

Optional

Valid values: true or false

Default value: true

wd

The weight decay parameter, also known as the L2 regularization parameter. Set the value to 0 if you do not want to use L2 regularization.

Optional

Valid values: auto or non-negative float

Default value: auto
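
As a closing illustration, the hyperparameters in this table are typically supplied to a training job as name and value pairs. The dictionary below is a hypothetical example with arbitrary values taken from the documented valid ranges, and check_required is an illustrative client-side helper that checks only the required entries; neither is part of any SDK.

```python
# Example values only; not tuned recommendations.
hyperparameters = {
    "feature_dim": 20,
    "predictor_type": "binary_classifier",
    "binary_classifier_model_selection_criteria": "recall_at_target_precision",
    "target_precision": 0.9,
    "epochs": 15,
    "mini_batch_size": 1000,
    "optimizer": "adam",
    "learning_rate": "auto",
    "l1": 0,
    "wd": "auto",
}

VALID_PREDICTOR_TYPES = {"binary_classifier", "multiclass_classifier", "regressor"}


def check_required(params):
    """Return a list of problems with the required hyperparameters."""
    problems = []
    feature_dim = params.get("feature_dim")
    if not (isinstance(feature_dim, int) and feature_dim > 0):
        problems.append("feature_dim must be a positive integer")
    if params.get("predictor_type") not in VALID_PREDICTOR_TYPES:
        problems.append("predictor_type is missing or invalid")
    if params.get("predictor_type") == "multiclass_classifier":
        # num_classes is required only for multiclass classification.
        num_classes = params.get("num_classes")
        if not (isinstance(num_classes, int) and 3 <= num_classes <= 1_000_000):
            problems.append("num_classes must be an integer from 3 to 1,000,000")
    return problems


print(check_required(hyperparameters))
```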