Semantic Segmentation Hyperparameters

The following tables list the hyperparameters supported by the Amazon SageMaker semantic segmentation algorithm for network architecture, data inputs, and training. You specify Semantic Segmentation for training in the AlgorithmName of the CreateTrainingJob request.

Network Architecture Hyperparameters

Parameter Name	Description
`backbone`	The backbone to use for the algorithm's encoder component. Optional. Valid values: `resnet-50`, `resnet-101`. Default value: `resnet-50`.
`use_pretrained_model`	Whether a pretrained model is to be used for the backbone. Optional. Valid values: `True`, `False`. Default value: `True`.
`algorithm`	The algorithm to use for semantic segmentation. Optional. Valid values: `fcn`: Fully-Convolutional Network (FCN) algorithm; `psp`: Pyramid Scene Parsing (PSP) algorithm; `deeplab`: DeepLab V3 algorithm. Default value: `fcn`.
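
The following is a minimal sketch of how the network architecture hyperparameters in the table above might be assembled and checked before a training job is submitted. The helper name `network_hyperparameters` and the two dictionaries are hypothetical and are not part of any SageMaker SDK. The sketch assumes that values are passed as strings, because the `HyperParameters` field of a CreateTrainingJob request is a string-to-string map.

```python
# Defaults and valid values copied from the table above.
# The names below are illustrative, not part of the SageMaker API.
NETWORK_DEFAULTS = {
    "backbone": "resnet-50",
    "use_pretrained_model": "True",
    "algorithm": "fcn",
}

VALID_VALUES = {
    "backbone": {"resnet-50", "resnet-101"},
    "use_pretrained_model": {"True", "False"},
    "algorithm": {"fcn", "psp", "deeplab"},
}


def network_hyperparameters(**overrides):
    """Return a string-to-string map of network architecture hyperparameters.

    Starts from the documented defaults and applies the overrides,
    rejecting unknown names and values outside the documented sets.
    """
    params = dict(NETWORK_DEFAULTS)
    for name, value in overrides.items():
        if name not in VALID_VALUES:
            raise ValueError(f"unknown network hyperparameter: {name}")
        value = str(value)  # for example, False becomes "False"
        if value not in VALID_VALUES[name]:
            allowed = sorted(VALID_VALUES[name])
            raise ValueError(f"{name} must be one of {allowed}, got {value!r}")
        params[name] = value
    return params
```

For example, `network_hyperparameters(algorithm="deeplab", backbone="resnet-101")` returns a map that keeps the default for `use_pretrained_model` and overrides the other two values.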

Data Hyperparameters

Parameter Name	Description
`num_classes`	The number of classes to segment. Required. Valid values: 2 ≤ positive integer ≤ 254.
`num_training_samples`	The number of samples in the training data. The algorithm uses this value to set up the learning rate scheduler. Required. Valid values: positive integer.
`base_size`	Defines how images are rescaled before cropping. Images are rescaled such that the length of the long side is set to `base_size` multiplied by a random number from 0.5 to 2.0, and the short side is computed to preserve the aspect ratio. Optional. Valid values: positive integer > 16. Default value: 520.
`crop_size`	The image size for input during training. The input image is randomly rescaled based on `base_size`, and then a random square crop with side length equal to `crop_size` is taken. The `crop_size` is automatically rounded up to a multiple of 8. Optional. Valid values: positive integer > 16. Default value: 240.
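
The arithmetic behind `base_size` and `crop_size` can be sketched as follows. This is an illustration of the description in the table, not the algorithm's actual implementation: the function names are hypothetical, and the way the short side is rounded to a whole number of pixels is an assumption.

```python
import random


def effective_crop_size(crop_size=240, multiple=8):
    """Round crop_size up to the next multiple of 8, as the table describes."""
    return ((crop_size + multiple - 1) // multiple) * multiple


def rescaled_size(width, height, base_size=520, scale=None, rng=random):
    """Return the (width, height) of an image after random rescaling.

    The long side becomes base_size * scale, where scale is drawn
    uniformly from 0.5 to 2.0 unless given explicitly. The short side
    is computed to preserve the aspect ratio (rounding is an assumption).
    """
    if scale is None:
        scale = rng.uniform(0.5, 2.0)
    long_side = int(base_size * scale)
    if width >= height:
        new_width = long_side
        new_height = max(1, round(height * long_side / width))
    else:
        new_height = long_side
        new_width = max(1, round(width * long_side / height))
    return new_width, new_height
```

With the default `base_size` of 520, the long side of a rescaled image falls between 260 and 1040 pixels, so a 240-pixel crop can cover anywhere from most of a small rescaled image to a small region of a large one.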

Training Hyperparameters

Parameter Name	Description
`early_stopping`	Whether to use early stopping logic during training. Optional. Valid values: `True`, `False`. Default value: `False`.
`early_stopping_min_epochs`	The minimum number of epochs that must be run. Optional. Valid values: integer. Default value: 5.
`early_stopping_patience`	The number of epochs that meet the tolerance for lower performance before the algorithm enforces an early stop. Optional. Valid values: integer. Default value: 4.
`early_stopping_tolerance`	If the relative improvement of the score of the training job, the mIOU, is smaller than this value, early stopping considers the epoch as not improved. This is used only when `early_stopping` = `True`. Optional. Valid values: 0 ≤ float ≤ 1. Default value: 0.0.
`epochs`	The number of epochs with which to train. Optional. Valid values: positive integer. Default value: 10.
`gamma1`	The decay factor for the moving average of the squared gradient for `rmsprop`. Used only for `rmsprop`. Optional. Valid values: 0 ≤ float ≤ 1. Default value: 0.9.
`gamma2`	The momentum factor for `rmsprop`. Optional. Valid values: 0 ≤ float ≤ 1. Default value: 0.9.
`learning_rate`	The initial learning rate. Optional. Valid values: 0 < float ≤ 1. Default value: 0.001.
`lr_scheduler`	The shape of the learning rate schedule that controls its decrease over time. Optional. Valid values: `step`: A stepwise decay, where the learning rate is reduced (multiplied) by the `lr_scheduler_factor` after the epochs specified by `lr_scheduler_step`; `poly`: A smooth decay using a polynomial function; `cosine`: A smooth decay using a cosine function. Default value: `poly`.
`lr_scheduler_factor`	If `lr_scheduler` is set to `step`, the factor by which the `learning_rate` is multiplied (reduced) after each of the epochs specified by `lr_scheduler_step`. Otherwise, ignored. Optional. Valid values: 0 ≤ float ≤ 1. Default value: 0.1.
`lr_scheduler_step`	A comma-delimited list of the epochs after which the `learning_rate` is reduced (multiplied) by the `lr_scheduler_factor`. For example, if the value is set to `"10, 20"`, then the `learning_rate` is reduced by `lr_scheduler_factor` after the 10th epoch and again by this factor after the 20th epoch. Conditionally required if `lr_scheduler` is set to `step`. Otherwise, ignored. Valid values: string. Default value: none, because the value is required when used.
`mini_batch_size`	The batch size for training. Using a large `mini_batch_size` usually results in faster training, but it might cause you to run out of memory. Memory usage is affected by the values of the `mini_batch_size` and `image_shape` parameters, and the backbone architecture. Optional. Valid values: positive integer. Default value: 16.
`momentum`	The momentum for the `sgd` optimizer. When you use other optimizers, the semantic segmentation algorithm ignores this parameter. Optional. Valid values: 0 < float ≤ 1. Default value: 0.9.
`optimizer`	The type of optimizer. Optional. Valid values: `adam`: Adaptive momentum estimation; `adagrad`: Adaptive gradient descent; `nag`: Nesterov accelerated gradient; `rmsprop`: Root mean square propagation; `sgd`: Stochastic gradient descent. Default value: `sgd`.
`syncbn`	If set to `True`, the batch normalization mean and variance are computed over all the samples processed across the GPUs. Optional. Valid values: `True`, `False`. Default value: `False`.
`validation_mini_batch_size`	The batch size for validation. A large `mini_batch_size` usually results in faster training, but it might cause you to run out of memory. Memory usage is affected by the values of the `mini_batch_size` and `image_shape` parameters, and the backbone architecture. To score the validation on the entire image without cropping the images, set this parameter to 1. Use this option if you want to measure performance on the entire image as a whole. Note: Setting the `validation_mini_batch_size` parameter to 1 causes the algorithm to create a new network model for every image. This might slow validation and training. To crop images to the size specified in the `crop_size` parameter, even during evaluation, set this parameter to a value greater than 1. Optional. Valid values: positive integer. Default value: 16.
`weight_decay`	The weight decay coefficient for the `sgd` optimizer. When you use other optimizers, the algorithm ignores this parameter. Optional. Valid values: 0 < float < 1. Default value: 0.0001.
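
The `step` schedule described by `lr_scheduler_factor` and `lr_scheduler_step` can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes that epochs are counted from 1 and that a reduction listed for epoch N takes effect from epoch N + 1 onward. The algorithm's internal indexing is not specified in the table, so treat this as an approximation of the documented behavior.

```python
def parse_steps(lr_scheduler_step):
    """Parse a comma-delimited string such as "10, 20" into sorted epochs."""
    return sorted(
        int(part) for part in lr_scheduler_step.split(",") if part.strip()
    )


def step_learning_rate(
    epoch,
    learning_rate=0.001,
    lr_scheduler_factor=0.1,
    lr_scheduler_step="10, 20",
):
    """Return the learning rate in effect during the given 1-based epoch.

    The initial rate is multiplied by lr_scheduler_factor once for every
    listed epoch that has already been completed.
    """
    completed_steps = sum(
        1 for step in parse_steps(lr_scheduler_step) if epoch > step
    )
    return learning_rate * (lr_scheduler_factor ** completed_steps)
```

Under these assumptions and the default `learning_rate` of 0.001, epochs 1 through 10 train at roughly 0.001, epochs 11 through 20 at roughly 0.0001, and later epochs at roughly 0.00001.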
