Amazon SageMaker
Developer Guide

Object2Vec Hyperparameters

Parameter Name Description
enc0_max_seq_len

The maximum sequence length for the enc0 encoder.

Required

Valid values: 1 ≤ integer ≤ 5000

enc0_vocab_size

The vocabulary size of enc0 tokens.

Required

Valid values: 2 ≤ integer ≤ 3000000

bucket_width

The allowed difference between data sequence lengths when bucketing is enabled. Bucketing is enabled when a non-zero value is specified for this parameter.

Optional

Valid values: 0 ≤ integer ≤ 100

Default value: 0 (no bucketing)
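The guide does not describe how Object2Vec groups sequences internally, so the following is only a minimal sketch of length bucketing in general, assuming that sequences are assigned to a bucket by integer-dividing their length by bucket_width. The function name bucket_by_length is hypothetical.

```python
from collections import defaultdict


def bucket_by_length(sequences, bucket_width):
    """Group sequences so that lengths within a bucket differ by less than bucket_width.

    Assumption: bucket index = len(sequence) // bucket_width. This is an
    illustrative scheme, not the documented Object2Vec implementation.
    """
    if bucket_width == 0:
        # A value of 0 means no bucketing: everything stays in one group.
        return {0: list(sequences)}
    buckets = defaultdict(list)
    for sequence in sequences:
        buckets[len(sequence) // bucket_width].append(sequence)
    return dict(buckets)


sequences = [[1], [1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 5]]
buckets = bucket_by_length(sequences, bucket_width=3)
```

With a width of 3, the sequences of length 1 and 2 share one bucket and those of length 4 and 5 share another, so less padding is needed within each group.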

dropout

The dropout probability for network layers. Dropout is a form of regularization used in neural networks that reduces overfitting by trimming codependent neurons.

Optional

Valid values: 0.0 ≤ float ≤ 1.0

Default value: 0.0

early_stopping_patience

The number of consecutive epochs without improvement allowed before early stopping is applied. Improvement is defined by the early_stopping_tolerance.

Optional

Valid values: 1 ≤ integer ≤ 5

Default value: 3

early_stopping_tolerance

The reduction in the loss function that the algorithm must achieve between consecutive epochs to count as an improvement. Training stops early after early_stopping_patience consecutive epochs without such a reduction.

Optional

Valid values: 0.000001 ≤ float ≤ 0.1

Default value: 0.01
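How early_stopping_patience and early_stopping_tolerance interact can be sketched as follows. This is an illustration only: it assumes that "improvement" means the loss dropped by at least the tolerance compared with the previous epoch, and the function name early_stop_epoch is hypothetical.

```python
def early_stop_epoch(losses, patience=3, tolerance=0.01):
    """Return the 1-based epoch at which training would stop, or None.

    Assumption: an epoch counts as an improvement when the loss falls by
    at least `tolerance` relative to the previous epoch.
    """
    epochs_without_improvement = 0
    for index in range(1, len(losses)):
        reduction = losses[index - 1] - losses[index]
        if reduction >= tolerance:
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return index + 1  # convert 0-based index to 1-based epoch
    return None


# Loss per epoch: a large drop, then three epochs of negligible change.
stalled = [1.0, 0.8, 0.795, 0.793, 0.792, 0.70]
improving = [1.0, 0.9, 0.8, 0.7]
```

With the default values (patience 3, tolerance 0.01), the stalled run stops at epoch 5, while the steadily improving run is never stopped.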

enc_dim

The dimension of the output of the embedding layer.

Optional

Valid values: 4 ≤ integer ≤ 10000

Default value: 4096

enc0_network

Network model for the enc0 encoder.

Optional

Valid values: hcnn, bilstm, or pooled_embedding

  • hcnn: A heterogeneous convolutional neural network

  • bilstm: A bidirectional long short-term memory network (LSTM), in which the signal propagates backward as well as forward in time. This is an appropriate recurrent neural network (RNN) architecture for sequential learning tasks.

  • pooled_embedding: Averages the embeddings of all the tokens in the input.

Default value: hcnn
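As an illustration of the pooled_embedding option, the sketch below averages the embedding vectors of the tokens in one input sequence. The embedding table and its values are made up for the example, and the function is a plain-Python stand-in rather than the algorithm's actual implementation.

```python
def pooled_embedding(token_ids, embedding_table):
    """Average the embedding vectors of all tokens in one input sequence."""
    if not token_ids:
        raise ValueError("token_ids must contain at least one token")
    vectors = [embedding_table[token_id] for token_id in token_ids]
    dimension = len(vectors[0])
    return [
        sum(vector[i] for vector in vectors) / len(vectors)
        for i in range(dimension)
    ]


# Hypothetical table: row index = token ID, row = 2-dimensional embedding.
embedding_table = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
pooled = pooled_embedding([0, 1], embedding_table)
```

The result has the same dimension as a single token embedding, regardless of the sequence length.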

enc0_cnn_filter_width

The filter width of the convolutional neural network (CNN) enc0 encoder.

Conditional

Valid values: 1 ≤ integer ≤ 9

Default value: 3

enc0_freeze_pretrained_embedding

Whether to freeze enc0 pretrained embedding weights.

Conditional

Valid values: True or False

Default value: True

enc0_layers

The number of layers in the enc0 encoder.

Conditional

Valid values: auto or 1 ≤ integer ≤ 4

Default value: auto

enc0_pretrained_embedding_file

The filename of the pretrained enc0 token embedding file in the auxiliary data channel.

Conditional

Valid values: String with alphanumeric characters, underscore, or period. [A-Za-z0-9\.\_]

Default value: "" (empty string)
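The character class above can be checked before a training job is submitted. The sketch below assumes the class applies to every character of the filename and that the empty string (the default) is acceptable; the helper name is hypothetical.

```python
import re

# Alphanumeric characters, underscore, or period, as listed under "Valid values".
FILENAME_PATTERN = re.compile(r"[A-Za-z0-9._]*")


def is_valid_embedding_filename(filename):
    """Return True if every character is alphanumeric, an underscore, or a period."""
    return FILENAME_PATTERN.fullmatch(filename) is not None
```

Under this check, a name such as glove_300d.txt passes, while a name containing a path separator such as embeddings/glove.txt does not.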

enc0_token_embedding_dim

The output dimension of the enc0 token embedding layer.

Conditional

Valid values: 2 ≤ integer ≤ 1000

Default value: 300

enc0_vocab_file

The vocabulary file for mapping pretrained enc0 token embeddings to vocabulary IDs.

Conditional

Valid values: String with alphanumeric characters, underscore, or period. [A-Za-z0-9\.\_]

Default value: "" (empty string)

enc1_network

The network model for the enc1 encoder. If its value is set to enc0, then enc1 uses the same network model as enc0, including the hyperparameter values. Note that although the enc0 and enc1 encoder networks can have symmetric architectures, shared parameter values for these networks are not supported.

Optional

Valid values: enc0, hcnn, bilstm, or pooled_embedding

  • enc0: Network model for the enc0 encoder.

  • hcnn: A heterogeneous convolutional neural network

  • bilstm: A bidirectional LSTM, in which the signal propagates backward as well as forward in time. This is an appropriate recurrent neural network (RNN) architecture for sequential learning tasks.

  • pooled_embedding: Averages the embeddings of all the tokens in the input.

Default value: enc0

enc1_cnn_filter_width

The filter width of the CNN enc1 encoder.

Conditional

Valid values: 1 ≤ integer ≤ 9

Default value: 3

enc1_freeze_pretrained_embedding

Whether to freeze enc1 pretrained embedding weights.

Conditional

Valid values: True or False

Default value: True

enc1_layers

The number of layers in the enc1 encoder.

Conditional

Valid values: auto or 1 ≤ integer ≤ 4

Default value: auto

enc1_max_seq_len

The maximum sequence length for the enc1 encoder.

Conditional

Valid values: 1 ≤ integer ≤ 5000

enc1_pretrained_embedding_file

The filename of the enc1 pretrained token embedding file in the auxiliary data channel.

Conditional

Valid values: String with alphanumeric characters, underscore, or period. [A-Za-z0-9\.\_]

Default value: "" (empty string)

enc1_token_embedding_dim

The output dimension of the enc1 token embedding layer.

Conditional

Valid values: 2 ≤ integer ≤ 1000

Default value: 300

enc1_vocab_file

The vocabulary file for mapping pretrained enc1 token embeddings to vocabulary IDs.

Conditional

Valid values: String with alphanumeric characters, underscore, or period. [A-Za-z0-9\.\_]

Default value: "" (empty string)

enc1_vocab_size

The vocabulary size of enc1 tokens.

Conditional

Valid values: 2 ≤ integer ≤ 3000000

epochs

The number of epochs to run for training.

Optional

Valid values: 1 ≤ integer ≤ 100

Default value: 30

learning_rate

The learning rate for training.

Optional

Valid values: 1.0e-6 ≤ float ≤ 1.0

Default value: 0.0004

mini_batch_size

The size of the mini-batches that the dataset is split into for the optimizer during training.

Optional

Valid values: 1 ≤ integer ≤ 10000

Default value: 32

mlp_activation

The type of activation function for the multilayer perceptron (MLP) layer.

Optional

Valid values: tanh, relu, or linear

  • tanh: Hyperbolic tangent

  • relu: Rectified linear unit (ReLU)

  • linear: Linear function

Default value: linear
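For reference, the three activation choices correspond to the following element-wise functions. This is a plain-Python sketch for illustration; the lookup table and helper name are hypothetical and are not part of the Object2Vec interface.

```python
import math

# Maps each valid mlp_activation value to its element-wise function.
ACTIVATIONS = {
    "tanh": math.tanh,                 # hyperbolic tangent, output in (-1, 1)
    "relu": lambda x: max(0.0, x),     # rectified linear unit
    "linear": lambda x: x,             # identity, no nonlinearity
}


def apply_activation(name, values):
    """Apply the named activation to each value in a list."""
    activation = ACTIVATIONS[name]
    return [activation(value) for value in values]
```

With linear, the values pass through unchanged, so stacked MLP layers remain a linear transformation of the encoder outputs.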

mlp_dim

The dimension of the output from multilayer perceptron (MLP) layers.

Optional

Valid values: 2 ≤ integer ≤ 10000

Default value: 512

mlp_layers

The number of multilayer perceptron (MLP) layers in the network.

Optional

Valid values: 0 ≤ integer ≤ 10

Default value: 2

num_classes

The number of classes for classification training. Ignored for regression problems.

Optional

Valid values: 2 ≤ integer ≤ 30

Default value: 2

optimizer

The optimizer type.

Optional

Valid values: One of adadelta, adagrad, adam, sgd, or rmsprop.

Default value: adam

output_layer

The type of output layer.

Optional

Valid values: softmax or mean_squared_error

  • softmax: The Softmax function used for classification.

  • mean_squared_error: The MSE used for regression.

Default value: softmax
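The two output layer choices correspond to the standard definitions sketched below in plain Python. These are textbook formulas shown for illustration, not the algorithm's internal code.

```python
import math


def softmax(logits):
    """Convert raw scores into class probabilities that sum to 1."""
    largest = max(logits)  # subtract the maximum for numerical stability
    exponentials = [math.exp(logit - largest) for logit in logits]
    total = sum(exponentials)
    return [value / total for value in exponentials]


def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)
```

Use softmax when the labels are discrete classes (together with num_classes) and mean_squared_error when the labels are real-valued scores.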

weight_decay

The weight decay parameter used for optimization.

Optional

Valid values: 0 ≤ float ≤ 10000

Default value: 0
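Before launching a training job, it can help to check a hyperparameter dictionary against the ranges in this table. The sketch below covers only a subset of the numeric parameters, and the RANGES table and validate_hyperparameters helper are hypothetical conveniences, not part of any SageMaker SDK.

```python
# (minimum, maximum, required) for a subset of the parameters in this table.
RANGES = {
    "enc0_max_seq_len": (1, 5000, True),
    "enc0_vocab_size": (2, 3000000, True),
    "dropout": (0.0, 1.0, False),
    "epochs": (1, 100, False),
    "learning_rate": (1.0e-6, 1.0, False),
    "mini_batch_size": (1, 10000, False),
    "mlp_layers": (0, 10, False),
    "num_classes": (2, 30, False),
}


def validate_hyperparameters(hyperparameters):
    """Return a list of problems; an empty list means all checks passed.

    Only range and presence checks are performed; integer versus float
    types are not verified in this sketch.
    """
    problems = []
    for name, (low, high, required) in RANGES.items():
        if name not in hyperparameters:
            if required:
                problems.append(f"{name} is required")
            continue
        value = hyperparameters[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside [{low}, {high}]")
    return problems


example = {
    "enc0_max_seq_len": 100,
    "enc0_vocab_size": 50000,
    "epochs": 20,
    "learning_rate": 0.0004,
    "mini_batch_size": 32,
}
problems = validate_hyperparameters(example)
```

Parameters omitted from the dictionary fall back to the default values listed above, so only the two required parameters must always be supplied.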