DeepAR Hyperparameters - Amazon SageMaker

DeepAR Hyperparameters

Parameter Name	Description
`context_length`	The number of time points that the model sees before making a prediction. Set this to about the same value as `prediction_length`. The model also receives lagged inputs from the target, so `context_length` can be much smaller than typical seasonalities. For example, a daily time series can have yearly seasonality. The model automatically includes a lag of one year, so the context length can be shorter than a year. The lag values that the model picks depend on the frequency of the time series. For example, the lag values for daily frequency are the previous week, 2 weeks, 3 weeks, 4 weeks, and year. Required. Valid values: positive integer.
`epochs`	The maximum number of passes over the training data. The optimal value depends on your data size and learning rate. See also `early_stopping_patience`. Typical values range from 10 to 1000. Required. Valid values: positive integer.
`prediction_length`	The number of time steps that the model is trained to predict, also called the forecast horizon. The trained model always generates forecasts of this length and can't generate longer ones. The `prediction_length` is fixed when the model is trained and can't be changed later. Required. Valid values: positive integer.
`time_freq`	The granularity of the time series in the dataset. Use `time_freq` to select appropriate date features and lags. The model supports the following basic frequencies: `M` (monthly), `W` (weekly), `D` (daily), `H` (hourly), and `min` (every minute). It also supports multiples of these basic frequencies. For example, `5min` specifies a frequency of 5 minutes. Required. Valid values: an integer followed by `M`, `W`, `D`, `H`, or `min`. For example, `5min`.
`cardinality`	When using categorical features (`cat`), `cardinality` is an array specifying the number of categories (groups) per categorical feature. Set this to `auto` to infer the cardinality from the data. The `auto` mode also works when no categorical features are used in the dataset. This is the recommended setting for the parameter. Set `cardinality` to `ignore` to force DeepAR to not use categorical features, even if they are present in the data. To perform additional data validation, you can explicitly set this parameter to the actual value. For example, if two categorical features are provided, where the first has 2 possible values and the other has 3, set this to [2, 3]. For more information on how to use categorical features, see the data section on the main documentation page of DeepAR. Optional. Valid values: `auto`, `ignore`, array of positive integers, or empty string. Default value: `auto`.
`dropout_rate`	The dropout rate to use during training. The model uses zoneout regularization: for each iteration, a random subset of hidden neurons is not updated. Typical values are less than 0.2. Optional. Valid values: float. Default value: 0.1.
`early_stopping_patience`	If this parameter is set, training stops when no progress is made within the specified number of epochs. The model that has the lowest loss is returned as the final model. Optional. Valid values: integer.
`embedding_dimension`	The size of the embedding vector learned per categorical feature (the same value is used for all categorical features). The DeepAR model can learn group-level time series patterns when a categorical grouping feature is provided. To do this, the model learns an embedding vector of size `embedding_dimension` for each group, capturing the common properties of all time series in the group. A larger `embedding_dimension` allows the model to capture more complex patterns. However, because increasing the `embedding_dimension` increases the number of parameters in the model, more training data is required to learn these parameters accurately. Typical values for this parameter are between 10 and 100. Optional. Valid values: positive integer. Default value: 10.
`learning_rate`	The learning rate used in training. Typical values range from 1e-4 to 1e-1. Optional. Valid values: float. Default value: 1e-3.
`likelihood`	The model generates a probabilistic forecast, and can provide quantiles of the distribution and return samples. Depending on your data, select an appropriate likelihood (noise model) to use for uncertainty estimates. The following likelihoods can be selected: `gaussian` (use for real-valued data); `beta` (use for real-valued targets between 0 and 1 inclusive); `negative-binomial` (use for count data, that is, non-negative integers); `student-T` (an alternative for real-valued data that works well for bursty data); `deterministic-L1` (a loss function that does not estimate uncertainty and only learns a point forecast). Optional. Valid values: one of `gaussian`, `beta`, `negative-binomial`, `student-T`, or `deterministic-L1`. Default value: `student-T`.
`mini_batch_size`	The size of mini-batches used during training. Typical values range from 32 to 512. Optional. Valid values: positive integer. Default value: 128.
`num_cells`	The number of cells to use in each hidden layer of the RNN. Typical values range from 30 to 100. Optional. Valid values: positive integer. Default value: 40.
`num_dynamic_feat`	The number of `dynamic_feat` provided in the data. Set this to `auto` to infer the number of dynamic features from the data. The `auto` mode also works when no dynamic features are used in the dataset. This is the recommended setting for the parameter. To force DeepAR to not use dynamic features, even if they are present in the data, set `num_dynamic_feat` to `ignore`. To perform additional data validation, you can explicitly set this parameter to the actual integer value. For example, if two dynamic features are provided, set this to 2. Optional. Valid values: `auto`, `ignore`, positive integer, or empty string. Default value: `auto`.
`num_eval_samples`	The number of samples that are used per time series when calculating test accuracy metrics. This parameter has no influence on the training or the final model. In particular, the model can be queried with a different number of samples. It only affects the accuracy scores reported on the test channel after training. Smaller values result in faster evaluation, but the evaluation scores are then typically worse and more uncertain. When evaluating with higher quantiles, for example 0.95, it may be important to increase the number of evaluation samples. Optional. Valid values: integer. Default value: 100.
`num_layers`	The number of hidden layers in the RNN. Typical values range from 1 to 4. Optional. Valid values: positive integer. Default value: 2.
`test_quantiles`	Quantiles for which to calculate quantile loss on the test channel. Optional. Valid values: array of floats. Default value: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9].
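
As an illustration of how the required and optional parameters in the table fit together, the following is a minimal sketch of a hyperparameter dictionary with a few sanity checks. The helper `validate_deepar_hyperparameters` is hypothetical and not part of any SageMaker API. It checks only a handful of the constraints listed in the table, and it assumes that the integer multiplier in `time_freq` may be omitted for a basic frequency such as `D`.

```python
import re

# Names and allowed values are taken from the hyperparameter table.
# The helper itself is a hypothetical illustration, not a SageMaker API.
REQUIRED = ("context_length", "epochs", "prediction_length", "time_freq")
LIKELIHOODS = {"gaussian", "beta", "negative-binomial", "student-T", "deterministic-L1"}

# Assumption: the integer multiplier is optional, so "D" and "5min" both pass.
TIME_FREQ = re.compile(r"(?:[1-9][0-9]*)?(?:M|W|D|H|min)")
POSITIVE_INT = re.compile(r"[1-9][0-9]*")


def validate_deepar_hyperparameters(hp):
    """Return a list of problems found in hp; an empty list means none were detected."""
    problems = []
    for name in REQUIRED:
        if name not in hp:
            problems.append(f"missing required hyperparameter: {name}")
    for name in ("context_length", "epochs", "prediction_length"):
        if name in hp and not POSITIVE_INT.fullmatch(str(hp[name])):
            problems.append(f"{name} must be a positive integer, got {hp[name]!r}")
    if "time_freq" in hp and not TIME_FREQ.fullmatch(str(hp["time_freq"])):
        problems.append(f"unsupported time_freq: {hp['time_freq']!r}")
    # student-T is the documented default when likelihood is not set.
    if hp.get("likelihood", "student-T") not in LIKELIHOODS:
        problems.append(f"unknown likelihood: {hp['likelihood']!r}")
    return problems


# Example values for a daily count series; the specific numbers are illustrative only.
hyperparameters = {
    "time_freq": "D",
    "context_length": "30",       # about the same as prediction_length
    "prediction_length": "30",
    "epochs": "100",
    "early_stopping_patience": "10",
    "likelihood": "negative-binomial",
    "cardinality": "auto",
    "num_dynamic_feat": "auto",
}

print(validate_deepar_hyperparameters(hyperparameters) or "no problems found")
```

Because `prediction_length` is fixed at training time, a check of this kind is cheapest to run before the training job is started.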

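To make the quantile loss behind `test_quantiles` more concrete, the sketch below computes the standard quantile (pinball) loss for a set of quantile forecasts. This is the textbook definition. Whether the scores that DeepAR reports apply additional weighting or normalization is not stated on this page, so treat the function `pinball_loss` and the sample numbers as illustrative assumptions.

```python
def pinball_loss(y_true, y_quantile, tau):
    """Sum of the standard quantile (pinball) loss at quantile level tau.

    y_true     -- observed target values
    y_quantile -- forecast of the tau-quantile for each observation
    tau        -- quantile level, strictly between 0 and 1
    """
    total = 0.0
    for y, q in zip(y_true, y_quantile):
        diff = y - q
        # Under-prediction (y above q) is weighted by tau,
        # over-prediction (y below q) by 1 - tau.
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total


# Illustrative numbers only: three observations and one forecast per observation.
observed = [10, 12, 8]
forecast = [12, 12, 9]

# The same forecast is scored very differently depending on the quantile level.
for tau in (0.1, 0.5, 0.9):
    print(tau, pinball_loss(observed, forecast, tau))
```

A forecast that sits above most observations is penalized lightly when it is scored as a high quantile and heavily when it is scored as a low one, which is why each entry in `test_quantiles` yields its own loss value.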