LossNotDecreasing Rule - Amazon SageMaker

This rule detects when the loss is not decreasing in value at an adequate rate. These losses must be scalars.

This rule can be applied either to one of the supported deep learning frameworks (TensorFlow, MXNet, and PyTorch) or to the XGBoost algorithm. You must specify either the collection_names or tensor_regex parameter. If both parameters are specified, the rule inspects the union of tensors from both sets.

For an example of how to configure and deploy a built-in rule, see How to Use Built-in Rules for Model Analysis.
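As a sketch of what such a configuration might look like with the SageMaker Python SDK (assuming the sagemaker package is installed; the parameter values here are illustrative), parameters such as collection_names and num_steps are passed as strings through rule_parameters:

```python
# Sketch: configuring the built-in LossNotDecreasing rule with the
# SageMaker Python SDK (requires the `sagemaker` package).
from sagemaker.debugger import Rule, rule_configs

loss_rule = Rule.sagemaker(
    rule_configs.loss_not_decreasing(),
    rule_parameters={
        "collection_names": "losses",  # inspect tensors in the "losses" collection
        "num_steps": "10",             # evaluate every 10 steps
        "diff_percent": "0.1",         # require at least a 0.1% decrease
    },
)

# The rule is then attached to an estimator, for example:
# estimator = TensorFlow(..., rules=[loss_rule])
```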

Parameter Descriptions for the LossNotDecreasing Rule
Parameter Name Description

base_trial

The base trial run using this rule. The rule inspects the tensors gathered from this trial.


Valid values: String


collection_names

The list of collection names whose tensors the rule inspects.


Valid values: List of strings or a comma-separated string

Default value: None


tensor_regex

A list of regex patterns used to restrict the comparison to specific scalar-valued tensors. The rule inspects only the tensors that match the regex patterns in the list. If no patterns are passed, the rule compares all tensors gathered in the trials by default. Only scalar-valued tensors can be matched.


Valid values: List of strings or a comma-separated string

Default value: None
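To illustrate how regex-based tensor selection works, here is a small stand-alone sketch; the tensor names and patterns are made up for illustration and are not part of the SageMaker API:

```python
import re

# Hypothetical regex patterns and tensor names, for illustration only.
patterns = ["loss", "cross_entropy"]
tensors = ["loss", "train/cross_entropy_loss", "gradients/dense1_weights"]

# A tensor is selected if any pattern matches part of its name.
matched = [t for t in tensors if any(re.search(p, t) for p in patterns)]
print(matched)  # ['loss', 'train/cross_entropy_loss']
```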


use_losses_collection

If set to True, the rule looks for losses in the collection named "losses" when that collection is present.


Valid values: Boolean

Default value: True


num_steps

The minimum number of steps after which the rule checks whether the loss has decreased. Rule evaluation happens every num_steps. The rule compares the loss at the current step with the loss at a step that is at least num_steps behind it. For example, suppose that the loss is saved every 3 steps but num_steps is set to 10. At step 21, the loss for step 21 is compared with the loss for step 9. The next step at which the loss is checked is 33, because 10 steps after 21 is 31, and the loss is not saved at steps 31 and 32.


Valid values: Integer

Default value: 10
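The step-selection behavior in the example above can be sketched in plain Python. This mirrors the worked example in the description, not the actual Debugger implementation:

```python
# Illustrative sketch of the num_steps step-selection logic; not the
# actual Debugger implementation.
def comparison_step(current_step, saved_steps, num_steps):
    """Most recent saved step at least `num_steps` behind `current_step`."""
    candidates = [s for s in saved_steps if s <= current_step - num_steps]
    return max(candidates) if candidates else None

def next_evaluation_step(current_step, saved_steps, num_steps):
    """First saved step at least `num_steps` after `current_step`."""
    later = [s for s in saved_steps if s >= current_step + num_steps]
    return min(later) if later else None

saved = list(range(0, 40, 3))  # loss saved every 3 steps: 0, 3, ..., 39
print(comparison_step(21, saved, 10))       # 9: latest saved step at or before 11
print(next_evaluation_step(21, saved, 10))  # 33: first saved step at or after 31
```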


diff_percent

The minimum percentage by which the loss should decrease between evaluations that are num_steps apart.


Valid values: 0.0 < float < 100

Default value: None. If this parameter is not set, the rule only checks that the loss is decreasing between evaluations.
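A minimal sketch of the percentage check this parameter implies, assuming the drop is measured relative to the earlier loss value (this is illustrative, not the actual rule implementation):

```python
def decreased_enough(earlier_loss, current_loss, diff_percent):
    """Illustrative check: has the loss dropped by at least
    diff_percent percent of the earlier value? Not the actual
    Debugger rule implementation."""
    drop_percent = (earlier_loss - current_loss) / earlier_loss * 100.0
    return drop_percent >= diff_percent

print(decreased_enough(1.0, 0.90, 5.0))  # True: a 10% drop exceeds 5%
print(decreased_enough(1.0, 0.99, 5.0))  # False: a 1% drop does not
```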


mode

The name of the Debugger mode whose tensor values the rule queries for rule checking. If this parameter is not passed, the rule by default checks mode.EVAL, then mode.TRAIN, and then mode.GLOBAL, in that order.


Valid values: String (EVAL, TRAIN, or GLOBAL)

Default value: None