Tune an NTM Model

Automatic model tuning, also known as hyperparameter tuning, finds the best version of a model by running many jobs that test a range of hyperparameters on your dataset. You choose the tunable hyperparameters, a range of values for each, and an objective metric from among the metrics that the algorithm computes. Automatic model tuning then searches the chosen ranges to find the combination of values that produces the model that optimizes the objective metric.

Amazon SageMaker AI NTM is an unsupervised learning algorithm that learns latent representations of large collections of discrete data, such as a corpus of documents. Latent representations use inferred variables that are not directly measured to model the observations in a dataset. Automatic model tuning on NTM helps you find the model that minimizes loss over the training or validation data. Training loss measures how well the model fits the training data. Validation loss measures how well the model generalizes to data that it was not trained on. Low training loss indicates that a model is a good fit to the training data. Low validation loss indicates that a model has not overfit the training data and so should be able to model documents on which it has not been trained. Usually, it's preferable for both losses to be small. However, driving training loss too low can cause overfitting, which increases validation loss and reduces the generality of the model.

For more information about model tuning, see Automatic model tuning with SageMaker AI.

Metrics Computed by the NTM Algorithm

The NTM algorithm reports a single metric that is computed during training: `validation:total_loss`. The total loss is the sum of the reconstruction loss and the Kullback-Leibler divergence. When tuning hyperparameter values, choose this metric as the objective.

Metric Name	Description	Optimization Direction
`validation:total_loss`	Total Loss on validation set	Minimize
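
As a rough sketch, the sum that makes up the total loss can be written as follows. The symbols are assumptions introduced here for illustration and follow common variational autoencoder notation; the text above states only that the total loss is the sum of a reconstruction loss and a Kullback-Leibler divergence.

```latex
\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{recon}} + D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big)
```

Here x stands for a document, z for its latent representation, q(z | x) for the distribution over z that the encoder infers, and p(z) for the prior over z. All four names are hypothetical labels, not identifiers used by the algorithm.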

Tunable NTM Hyperparameters

You can tune the following hyperparameters for the NTM algorithm. Setting a small `mini_batch_size` and a small `learning_rate` usually results in lower validation loss, although training might take longer. Low validation loss doesn't necessarily produce topics that humans find more coherent. The effect of the other hyperparameters on training and validation loss can vary from dataset to dataset. To see which values are compatible, see NTM Hyperparameters.

Parameter Name	Parameter Type	Recommended Ranges
`encoder_layers_activation`	CategoricalParameterRanges	['sigmoid', 'tanh', 'relu']
`learning_rate`	ContinuousParameterRanges	MinValue: 1e-4, MaxValue: 0.1
`mini_batch_size`	IntegerParameterRanges	MinValue: 16, MaxValue: 2048
`optimizer`	CategoricalParameterRanges	['sgd', 'adam', 'adadelta']
`rescale_gradient`	ContinuousParameterRanges	MinValue: 0.1, MaxValue: 1.0
`weight_decay`	ContinuousParameterRanges	MinValue: 0.0, MaxValue: 1.0
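
The recommended ranges in the table can be written down as a tuning configuration. The following is a minimal sketch, assuming the `ParameterRanges` request shape of the `CreateHyperParameterTuningJob` API, in which numeric bounds are passed as strings. The `Bayesian` strategy, the resource limits, and the `validate_ranges` helper are illustrative assumptions, not values or functions taken from this page.

```python
# Recommended NTM ranges from the table, grouped by parameter type.
# Numeric bounds are strings, as in the CreateHyperParameterTuningJob request.
parameter_ranges = {
    "CategoricalParameterRanges": [
        {"Name": "encoder_layers_activation", "Values": ["sigmoid", "tanh", "relu"]},
        {"Name": "optimizer", "Values": ["sgd", "adam", "adadelta"]},
    ],
    "ContinuousParameterRanges": [
        {"Name": "learning_rate", "MinValue": "1e-4", "MaxValue": "0.1"},
        {"Name": "rescale_gradient", "MinValue": "0.1", "MaxValue": "1.0"},
        {"Name": "weight_decay", "MinValue": "0.0", "MaxValue": "1.0"},
    ],
    "IntegerParameterRanges": [
        {"Name": "mini_batch_size", "MinValue": "16", "MaxValue": "2048"},
    ],
}

tuning_job_config = {
    # Assumption: Bayesian search; the page does not prescribe a strategy.
    "Strategy": "Bayesian",
    # The only metric NTM reports, minimized as the objective.
    "HyperParameterTuningJobObjective": {
        "Type": "Minimize",
        "MetricName": "validation:total_loss",
    },
    # Assumption: an arbitrary small budget chosen for illustration.
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 20,
        "MaxParallelTrainingJobs": 2,
    },
    "ParameterRanges": parameter_ranges,
}


def validate_ranges(ranges):
    """Hypothetical helper: check numeric bounds and list the tuned names."""
    for key in ("ContinuousParameterRanges", "IntegerParameterRanges"):
        for r in ranges.get(key, []):
            if float(r["MinValue"]) > float(r["MaxValue"]):
                raise ValueError(f"{r['Name']}: MinValue exceeds MaxValue")
    # Collect every parameter name across the three range groups.
    return sorted(r["Name"] for group in ranges.values() for r in group)


tunable_names = validate_ranges(parameter_ranges)
```

A dictionary like `tuning_job_config` would typically be supplied as the `HyperParameterTuningJobConfig` argument when creating a tuning job, together with a training job definition for the NTM container, which is not shown here.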
