DeepAR Hyperparameters
Parameter Name | Description |
---|---|
context_length | The number of time points that the model sees before making the prediction. The value for this parameter should be about the same as the prediction_length. Required. Valid values: positive integer. |
epochs | The maximum number of passes over the training data. The optimal value depends on your data size and learning rate. See also early_stopping_patience. Required. Valid values: positive integer. |
prediction_length | The number of time steps that the model is trained to predict, also called the forecast horizon. The trained model always generates forecasts of this length and can't generate longer ones. Required. Valid values: positive integer. |
time_freq | The granularity of the time series in the dataset. Required. Valid values: an integer followed by M, W, D, H, or min. For example, 5min. |
cardinality | When using the categorical features ( For a fixed array index Optional. Valid values: array of positive integers, empty string, or Default value: |
dropout_rate | The dropout rate to use during training. The model uses zoneout regularization. For each iteration, a random subset of hidden neurons is not updated. Typical values are less than 0.2. Optional. Valid values: float. Default value: 0.1 |
early_stopping_patience | If this parameter is set, training stops when no progress is made within the specified number of epochs. Optional. Valid values: integer. |
embedding_dimension | The size of the embedding vector learned per categorical feature (the same value is used for all categorical features). The DeepAR model can learn group-level time series patterns when a categorical grouping feature is provided. To do this, the model learns an embedding vector of size embedding_dimension. Optional. Valid values: positive integer. Default value: 10 |
learning_rate | The learning rate used in training. Typical values range from 1e-4 to 1e-1. Optional. Valid values: float. Default value: 1e-3 |
likelihood | The model generates a probabilistic forecast and can provide quantiles of the distribution and return samples. Depending on your data, select an appropriate likelihood (noise model) to use for uncertainty estimates. Optional. Valid values: one of gaussian, beta, negative-binomial, student-T, or deterministic-L1. Default value: |
mini_batch_size | The size of mini-batches used during training. Typical values range from 32 to 512. Optional. Valid values: positive integer. Default value: 128 |
num_cells | The number of cells to use in each hidden layer of the RNN. Typical values range from 30 to 100. Optional. Valid values: positive integer. Default value: 40 |
num_dynamic_feat | The number of dynamic features. Optional. Valid values: positive integer, empty string, or Default value: |
num_eval_samples | The number of samples used per time series when calculating test accuracy metrics. This parameter has no influence on training or on the final model; in particular, the model can be queried with a different number of samples. It affects only the accuracy scores reported on the test channel after training. Smaller values result in faster evaluation, but the evaluation scores are then typically worse and more uncertain. When evaluating with higher quantiles, for example 0.95, it may be important to increase the number of evaluation samples. Optional. Valid values: integer. Default value: 100 |
num_layers | The number of hidden layers in the RNN. Typical values range from 1 to 4. Optional. Valid values: positive integer. Default value: 2 |
test_quantiles | Quantiles for which to calculate quantile loss on the test channel. Optional. Valid values: array of floats. Default value: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9] |
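
As a rough illustration of how the table above might translate into a training configuration, the following Python sketch assembles a hyperparameter dictionary and runs a few sanity checks taken from the "Valid values" entries. The helper name `check_deepar_hyperparameters`, the sample values for the required parameters, and the literal reading of the time_freq rule (an integer prefix is always required) are assumptions made for this example; the helper is not part of any SDK.

```python
import re

# Parameters the table marks as Required.
REQUIRED = ("context_length", "epochs", "prediction_length", "time_freq")

# Likelihoods listed under "Valid values" for the likelihood parameter.
LIKELIHOODS = {"gaussian", "beta", "negative-binomial", "student-T", "deterministic-L1"}

# Literal reading of the table: an integer followed by M, W, D, H, or min.
# Assumption: whether a bare unit such as "H" is also accepted is not checked here.
TIME_FREQ = re.compile(r"[1-9][0-9]*(min|M|W|D|H)")

# Parameters the table describes as positive integers.
POSITIVE_INTS = (
    "context_length", "epochs", "prediction_length",
    "embedding_dimension", "mini_batch_size", "num_cells", "num_layers",
)


def check_deepar_hyperparameters(hp):
    """Return a list of problems found; an empty list means no check fired."""
    problems = []
    for name in REQUIRED:
        if name not in hp:
            problems.append(f"missing required parameter: {name}")
    for name in POSITIVE_INTS:
        if name in hp:
            value = hp[name]
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                problems.append(f"{name} must be a positive integer, got {value!r}")
    if "time_freq" in hp and not TIME_FREQ.fullmatch(str(hp["time_freq"])):
        problems.append(f"time_freq has an unexpected format: {hp['time_freq']!r}")
    if "likelihood" in hp and hp["likelihood"] not in LIKELIHOODS:
        problems.append(f"unknown likelihood: {hp['likelihood']!r}")
    return problems


# Sample configuration: the required values are made up for illustration;
# the optional ones repeat the defaults listed in the table.
hyperparameters = {
    "time_freq": "1H",
    "context_length": 72,
    "prediction_length": 72,
    "epochs": 100,
    "dropout_rate": 0.1,
    "embedding_dimension": 10,
    "learning_rate": 1e-3,
    "mini_batch_size": 128,
    "num_cells": 40,
    "num_layers": 2,
}

print(check_deepar_hyperparameters(hyperparameters))  # prints []
```

An empty list only means that none of these checks fired; it does not guarantee that a training job accepts the configuration.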