Object Detection Hyperparameters
In the CreateTrainingJob request, you specify the training algorithm
that you want to use. You can also specify algorithm-specific hyperparameters that are
used to help estimate the parameters of the model from a training dataset. The following
table lists the hyperparameters provided by Amazon SageMaker for training the object detection
algorithm. For more information about how object detection training works, see How Object Detection Works.
Parameter Name | Description |
---|---|
num_classes |
The number of output classes. This parameter defines the dimensions of the network output and is typically set to the number of classes in the dataset. Required Valid values: positive integer |
num_training_samples |
The number of training examples in the input dataset.
Note: If there is a mismatch between this value and the number of
samples in the training set, then the behavior of the
Required Valid values: positive integer |
base_network |
The base network architecture to use. Optional Valid values: 'vgg-16' or 'resnet-50' Default value: 'vgg-16' |
early_stopping |
Optional Valid values: Default value: |
early_stopping_min_epochs |
The minimum number of epochs that must be run before the early
stopping logic can be invoked. It is used only when early_stopping is enabled.
Optional Valid values: positive integer Default value: 10 |
early_stopping_patience |
The number of epochs to wait before ending training if no
improvement, as defined by the early_stopping_tolerance hyperparameter, is made. Optional Valid values: positive integer Default value: 5 |
early_stopping_tolerance |
The tolerance value that the relative improvement in the monitored metric must exceed to avoid early stopping.
Optional Valid values: 0 ≤ float ≤ 1 Default value: 0.0 |
image_shape |
The image size for input images. We rescale the input image to a square image of this size. We recommend using 300 or 512 for better performance. Optional Valid values: positive integer ≥300 Default: 300 |
epochs |
The number of training epochs. Optional Valid values: positive integer Default: 30 |
freeze_layer_pattern |
The regular expression (regex) for freezing layers in the base
network. For example, if we set Optional Valid values: string Default: No layers frozen. |
kv_store |
The weight update synchronization mode used for distributed
training. The weights can be updated either synchronously or
asynchronously across machines. Synchronous updates typically
provide better accuracy than asynchronous updates but can be slower.
See the Distributed Training
Note: This parameter is not applicable to single machine training.
Optional Valid values:
Default: - |
label_width |
The force padding label width used to sync across training and
validation data. For example, if one image in the data contains at
most 10 objects, and each object's annotation is specified with 5
numbers, [class_id, left, top, width, height], then the
Optional Valid values: Positive integer large enough to accommodate the largest annotation information length in the data. Default: 350 |
learning_rate |
The initial learning rate. Optional Valid values: float in (0, 1] Default: 0.001 |
lr_scheduler_factor |
The ratio used to reduce the learning rate. Used in conjunction with the lr_scheduler_step parameter.
Optional Valid values: float in (0, 1) Default: 0.1 |
lr_scheduler_step |
The epochs at which to reduce the learning rate. The learning rate
is reduced by lr_scheduler_factor at each of these epochs. Optional Valid values: string Default: empty string |
mini_batch_size |
The batch size for training. In a single-machine multi-GPU
setting, each GPU handles
Optional Valid values: positive integer Default: 32 |
momentum |
The momentum for Optional Valid values: float in (0, 1] Default: 0.9 |
nms_threshold |
The non-maximum suppression threshold. Optional Valid values: float in (0, 1] Default: 0.45 |
optimizer |
The optimizer type. For details on optimizer values, see MXNet's
API. Optional Valid values: ['sgd', 'adam', 'rmsprop', 'adadelta'] Default: 'sgd' |
overlap_threshold |
The evaluation overlap threshold. Optional Valid values: float in (0, 1] Default: 0.5 |
use_pretrained_model |
Indicates whether to use a pre-trained model for training. If set to 1, then the pre-trained model with corresponding architecture is loaded and used for training. Otherwise, the network is trained from scratch. Optional Valid values: 0 or 1 Default: 1 |
weight_decay |
The weight decay coefficient for Optional Valid values: float in (0, 1) Default: 0.0005 |
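
As an illustration of how the values in the table might be assembled for a training request, the following Python sketch builds a hyperparameter map. The HyperParameters field of CreateTrainingJob is a string-to-string map, so every value is converted to a string. The helper name `to_hyperparameter_strings` and the dataset sizes are hypothetical and chosen only for this example. The remaining values are taken from the valid ranges and defaults listed in the table.

```python
def to_hyperparameter_strings(params):
    """Convert every value to a string (hypothetical helper for this sketch).

    The HyperParameters field of a CreateTrainingJob request maps
    string keys to string values.
    """
    return {name: str(value) for name, value in params.items()}


hyperparameters = to_hyperparameter_strings({
    # Required parameters. The numbers describe a hypothetical dataset.
    "num_classes": 20,
    "num_training_samples": 16551,
    # Optional parameters, using values allowed by the table above.
    "base_network": "resnet-50",
    "image_shape": 512,
    "epochs": 30,
    "learning_rate": 0.001,
    "mini_batch_size": 32,
    "optimizer": "sgd",
    "use_pretrained_model": 1,
})

for name, value in sorted(hyperparameters.items()):
    print(f"{name} = {value!r}")
```

The resulting dictionary could then be passed as the hyperparameter map of a training request. Any parameter left out falls back to the default listed in the table.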
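
The interaction between learning_rate, lr_scheduler_step, and lr_scheduler_factor can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the algorithm's actual implementation. It assumes that lr_scheduler_step is a comma-separated string of epoch numbers, that each reduction multiplies the current learning rate by lr_scheduler_factor, and that a reduction takes effect from the listed epoch onward. The function name `learning_rate_at_epoch` is hypothetical.

```python
def learning_rate_at_epoch(epoch, learning_rate=0.001,
                           lr_scheduler_step="", lr_scheduler_factor=0.1):
    """Return the learning rate in effect at a given epoch (illustrative only).

    Assumptions made for this sketch:
      * lr_scheduler_step is a comma-separated string of epochs, e.g. "10,20".
      * Each listed epoch multiplies the learning rate by lr_scheduler_factor.
      * A reduction applies from the listed epoch onward (epoch >= step).
    """
    steps = [int(s) for s in lr_scheduler_step.split(",") if s.strip()]
    reductions = sum(1 for step in steps if epoch >= step)
    return learning_rate * lr_scheduler_factor ** reductions


# Print the rate before, between, and after the two hypothetical steps.
for epoch in (0, 10, 25):
    rate = learning_rate_at_epoch(epoch, lr_scheduler_step="10,20")
    print(f"epoch {epoch}: learning rate {rate:.6g}")
```

With the default empty string for lr_scheduler_step, the sketch leaves the learning rate unchanged for every epoch.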