
Semantic Segmentation Hyperparameters

The following tables list the hyperparameters supported by the Amazon SageMaker semantic segmentation algorithm for network architecture, data inputs, and training. You specify Semantic Segmentation for training in the AlgorithmName of the CreateTrainingJob request.
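In a CreateTrainingJob request, hyperparameters travel as a string-to-string map. As an illustrative sketch (the class and sample counts below are made-up values, not defaults from this page), the map might look like:

```python
# Illustrative HyperParameters map for a CreateTrainingJob request.
# All values are passed as strings; num_classes and num_training_samples
# are required, the rest fall back to the defaults listed below.
hyperparameters = {
    "backbone": "resnet-50",
    "algorithm": "fcn",
    "num_classes": "21",             # required; example value
    "num_training_samples": "1464",  # required; example value
    "epochs": "10",
    "learning_rate": "0.001",
    "optimizer": "sgd",
    "lr_scheduler": "poly",
}
```

With the SageMaker Python SDK, the same keys are typically passed to an estimator's `set_hyperparameters` call rather than assembled by hand.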

Network Architecture Hyperparameters

backbone

The backbone to use for the algorithm's encoder component.

Optional

Valid values: resnet-50, resnet-101

Default value: resnet-50

use_pretrained_model

Whether a pretrained model is to be used for the backbone.

Optional

Valid values: True, False

Default value: True

algorithm

The algorithm to use for semantic segmentation.

Optional

Valid values: fcn, psp, deeplab

Default value: fcn

Data Hyperparameters

num_classes

The number of classes to segment.

Required

Valid values: 2 ≤ positive integer ≤ 254

num_training_samples

The number of samples in the training data. The algorithm uses this value to set up the learning rate scheduler.

Required

Valid values: positive integer

base_size

Defines how images are rescaled before cropping. Images are rescaled such that the long side length is set to base_size multiplied by a random number from 0.5 to 2.0, and the short side length is computed to preserve the aspect ratio.

Optional

Valid values: positive integer > 16

Default value: 520
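One plausible reading of the base_size rule can be sketched as follows; the function name and the use of rounding are assumptions, not the algorithm's actual code:

```python
import random

def rescaled_dims(width, height, base_size, rng=random):
    """Illustrative base_size rescaling: the long side becomes base_size
    times a random factor in [0.5, 2.0]; the short side is scaled to
    preserve the aspect ratio."""
    factor = rng.uniform(0.5, 2.0)
    long_side = round(base_size * factor)
    if width >= height:
        return long_side, round(long_side * height / width)
    return round(long_side * width / height), long_side
```

For example, with the default base_size of 520 and a random factor of 1.0, a 1000x500 image would be rescaled to 520x260.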

crop_size

The image size for input during training. We randomly rescale the input image based on base_size, and then take a random square crop with side length equal to crop_size. The crop_size is automatically rounded up to the next multiple of 8.

Optional

Valid values: positive integer > 16

Default value: 240
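The round-up-to-a-multiple-of-8 rule can be sketched in one line (an illustrative helper, not the SDK's code):

```python
def effective_crop_size(crop_size):
    """Round crop_size up to the next multiple of 8, as the docs say
    happens automatically."""
    return ((crop_size + 7) // 8) * 8
```

So the default of 240 is used as-is, while a value of 241 would become 248.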

Training Hyperparameters

early_stopping

Whether to use early stopping logic during training.

Optional

Valid values: True, False

Default value: False

early_stopping_min_epochs

The minimum number of epochs that must be run.

Optional

Valid values: integer

Default value: 5

early_stopping_patience

The number of consecutive epochs without the required improvement (as defined by early_stopping_tolerance) before the algorithm enforces an early stop.

Optional

Valid values: integer

Default value: 4

early_stopping_tolerance

If the relative improvement of the training job's score, the mean intersection over union (mIOU), is smaller than this value, early stopping considers the epoch not improved. Used only when early_stopping = True.

Optional

Valid values: 0 ≤ float ≤ 1

Default value: 0.0
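The tolerance check can be sketched as follows; this is one plausible reading of the rule ("smaller than this value" means not improved), not the algorithm's exact code:

```python
def epoch_improved(best_miou, new_miou, tolerance):
    """An epoch counts as improved only when the relative mIOU gain over
    the best score so far is at least early_stopping_tolerance
    (illustrative)."""
    return (new_miou - best_miou) / best_miou >= tolerance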
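The tolerance check can be sketched as follows; this is one plausible reading of the rule ("smaller than this value" means not improved), not the algorithm's exact code:

```python
def epoch_improved(best_miou, new_miou, tolerance):
    """An epoch counts as improved only when the relative mIOU gain over
    the best score so far is at least early_stopping_tolerance
    (illustrative)."""
    return (new_miou - best_miou) / best_miou >= tolerance
```

For example, with a tolerance of 0.02, going from an mIOU of 0.50 to 0.505 is a 1% relative gain and would not count as improvement.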

epochs

The number of epochs to train.

Optional

Valid values: positive integer

Default value: 10

gamma1

The decay factor for the moving average of the squared gradient for rmsprop. Used only for rmsprop.

Optional

Valid values: 0 ≤ float ≤ 1

Default value: 0.9

gamma2

The momentum factor for rmsprop.

Optional

Valid values: 0 ≤ float ≤ 1

Default value: 0.9

learning_rate

The initial learning rate.

Optional

Valid values: 0 < float ≤ 1

Default value: 0.001

lr_scheduler

The shape of the learning rate schedule that controls its decrease over time.

Optional

Valid values:

  • step: A stepwise decay, where the learning rate is reduced (multiplied) by the lr_scheduler_factor after epochs specified by lr_scheduler_step.

  • poly: A smooth decay using a polynomial function.

  • cosine: A smooth decay using a cosine function.

Default value: poly
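The three shapes can be sketched as curves; the poly exponent and the exact cosine form below are common conventions assumed for illustration, and `steps`/`factor` stand in for lr_scheduler_step and lr_scheduler_factor:

```python
import math

def scheduled_lr(base_lr, epoch, total_epochs, shape,
                 factor=0.1, steps=(10, 20), power=0.9):
    """Illustrative learning-rate curves for the three schedule shapes."""
    if shape == "step":
        # Multiply by factor once for each listed epoch already reached.
        return base_lr * factor ** sum(1 for s in steps if epoch >= s)
    if shape == "poly":
        return base_lr * (1.0 - epoch / total_epochs) ** power
    if shape == "cosine":
        return base_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    raise ValueError(f"unknown lr_scheduler shape: {shape}")
```

Both poly and cosine decay smoothly toward zero by the final epoch, while step drops in discrete jumps.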

lr_scheduler_factor

If lr_scheduler is set to step, the ratio by which to reduce (multiply) the learning_rate after each of the epochs specified by lr_scheduler_step. Otherwise, ignored.

Optional

Valid values: 0 ≤ float ≤ 1

Default value: 0.1

lr_scheduler_step

A comma-delimited list of the epochs after which the learning_rate is reduced (multiplied) by lr_scheduler_factor. For example, if the value is set to "10, 20", then the learning_rate is reduced by lr_scheduler_factor after the 10th epoch and again after the 20th epoch.

Conditionally Required if lr_scheduler is set to step. Otherwise, ignored.

Valid values: string

Default value: (No default, as the value is required when used.)
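Working the "10, 20" example through: with the default learning_rate of 0.001 and lr_scheduler_factor of 0.1, the rate is 0.001 for epochs 0-9, 0.0001 for epochs 10-19, and 0.00001 from epoch 20 on. A sketch of the parsing and lookup (illustrative, not the algorithm's code):

```python
def lr_after_epoch(base_lr, factor, lr_scheduler_step, epoch):
    """Parse the comma-delimited epoch list and multiply by factor once
    per boundary already reached (illustrative step-schedule helper)."""
    steps = [int(s) for s in lr_scheduler_step.split(",")]
    return base_lr * factor ** sum(1 for s in steps if epoch >= s)
```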

mini_batch_size

The batch size for training. Using a large mini_batch_size usually results in faster training, but it might cause you to run out of memory. Memory usage is affected by the values of the mini_batch_size and crop_size parameters, and the backbone architecture.

Optional

Valid values: positive integer

Default value: 16

momentum

The momentum for the sgd optimizer. When you use other optimizers, the semantic segmentation algorithm ignores this parameter.

Optional

Valid values: 0 < float ≤ 1

Default value: 0.9

optimizer

The type of optimizer to use.

Optional

Valid values: adam, adagrad, nag, rmsprop, sgd

Default value: sgd

syncbn

If set to True, the batch normalization mean and variance are computed over all the samples processed across the GPUs.

Optional

Valid values: True, False

Default value: False

validation_mini_batch_size

The batch size for validation. A large validation_mini_batch_size usually results in faster evaluation, but it might cause you to run out of memory. Memory usage is affected by the values of the validation_mini_batch_size and crop_size parameters, and the backbone architecture.

  • To score the validation on the entire image without cropping the images, set this parameter to 1. Use this option if you want to measure performance on the entire image as a whole.

    Note

    Setting the validation_mini_batch_size parameter to 1 causes the algorithm to create a new network model for every image. This might slow validation and training.

  • To crop images to the size specified in the crop_size parameter, even during evaluation, set this parameter to a value greater than 1.

Optional

Valid values: positive integer

Default value: 16

weight_decay

The weight decay coefficient for the sgd optimizer. When you use other optimizers, the algorithm ignores this parameter.

Optional

Valid values: 0 < float < 1

Default value: 0.0001
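To see how momentum and weight_decay interact, here is a textbook SGD update with both knobs (a sketch under common conventions, not SageMaker's internals):

```python
def sgd_step(weight, grad, velocity, lr=0.001, momentum=0.9,
             weight_decay=0.0001):
    """One SGD update: weight_decay adds an L2 penalty term proportional
    to the weight, and momentum blends in the previous update direction."""
    g = grad + weight_decay * weight
    v = momentum * velocity - lr * g
    return weight + v, v
```

With momentum 0 and weight_decay 0, this reduces to plain gradient descent: a weight of 1.0 with gradient 0.5 and lr 0.1 moves to 0.95.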