Amazon Forecast is no longer available to new customers. Existing customers of
Amazon Forecast can continue to use the service as normal.
Replacement Dataset
A replacement dataset is a modified version of the baseline related time series that contains only the values that you want to change in a what-if forecast. The replacement dataset must contain the forecast dimensions, item identifiers, and timestamps from the baseline related time series, as well as at least one changed time series. Amazon Forecast merges this dataset with the baseline related time series to create the transformed dataset that is used for the what-if forecast. The replacement dataset must be in CSV format.
This dataset should not contain duplicate timestamps for the same time series.
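As a rough pre-upload check for the duplicate-timestamp rule above, the following sketch scans a replacement CSV for repeated keys. The function name `find_duplicate_keys`, the presence of a header row, and the sample data are assumptions made for illustration; they are not part of the Amazon Forecast API.

```python
import csv
import io
from collections import Counter


def find_duplicate_keys(csv_text, key_columns=("item_id", "timestamp")):
    """Return the key tuples that occur more than once in the CSV text.

    Assumes the CSV has a header row naming the key columns (an assumption
    for this sketch). Add any forecast dimensions, such as store_id, to
    key_columns so that each time series is identified correctly.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, count in counts.items() if count > 1]


# Hypothetical replacement CSV with an accidental duplicate for item_1.
sample = """item_id,timestamp,price
item_1,2022-08-02,90
item_1,2022-08-02,85
item_1,2022-08-03,90
"""

duplicates = find_duplicate_keys(sample)
print(duplicates)  # [('item_1', '2022-08-02')]
```

An empty list means that no time series repeats a timestamp.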
The following examples show how you can specify a replacement time series and how each specification is interpreted. Assume that you are forecasting daily and that the forecast horizon is 2022-08-01 through 2022-08-03. The following table shows the baseline related time series used in all of the examples.
item_id | timestamp | price | stock_count |
---|---|---|---|
item_1 | 2022-08-01 | 100 | 50 |
item_1 | 2022-08-02 | 100 | 50 |
item_1 | 2022-08-03 | 100 | 50 |
item_2 | 2022-08-01 | 75 | 500 |
item_2 | 2022-08-02 | 75 | 500 |
item_2 | 2022-08-03 | 75 | 500 |
To apply a 10% discount on item_1 for 2022-08-02 and 2022-08-03, it is sufficient to specify the following for the replacement dataset:
item_id | timestamp | price |
---|---|---|
item_1 | 2022-08-02 | 90 |
item_1 | 2022-08-03 | 90 |
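To make the merge concrete, here is a minimal sketch of how a replacement dataset could be overlaid on the baseline related time series. It only illustrates the described semantics; `apply_replacement` is a hypothetical helper, and the service's internal merge logic is not documented here.

```python
DATES = ["2022-08-01", "2022-08-02", "2022-08-03"]

# Baseline related time series from the table above.
baseline = [
    {"item_id": item, "timestamp": day, "price": price, "stock_count": stock}
    for item, price, stock in [("item_1", 100, 50), ("item_2", 75, 500)]
    for day in DATES
]

# Replacement dataset: a 10% discount on item_1 for two days.
replacement = [
    {"item_id": "item_1", "timestamp": "2022-08-02", "price": 90},
    {"item_id": "item_1", "timestamp": "2022-08-03", "price": 90},
]


def apply_replacement(baseline, replacement, key_columns=("item_id", "timestamp")):
    """Overlay replacement values on matching baseline rows (illustrative only)."""
    overrides = {tuple(row[c] for c in key_columns): row for row in replacement}
    transformed = []
    for row in baseline:
        key = tuple(row[c] for c in key_columns)
        merged = dict(row)
        # Columns absent from the replacement row keep their baseline value.
        merged.update(overrides.get(key, {}))
        transformed.append(merged)
    return transformed


transformed = apply_replacement(baseline, replacement)
for row in transformed:
    print(row)
```

In this sketch, only the price of item_1 on 2022-08-02 and 2022-08-03 changes; every other value is carried over from the baseline.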
However, it's also valid to specify unchanged values in the replacement dataset. When used as replacement datasets, each of the following three tables will yield the same results as the previously provided table.
item_id | timestamp | price | stock_count |
---|---|---|---|
item_1 | 2022-08-02 | 90 | 50 |
item_1 | 2022-08-03 | 90 | 50 |

item_id | timestamp | price |
---|---|---|
item_1 | 2022-08-01 | 100 |
item_1 | 2022-08-02 | 90 |
item_1 | 2022-08-03 | 90 |
item_2 | 2022-08-01 | 75 |
item_2 | 2022-08-02 | 75 |
item_2 | 2022-08-03 | 75 |

item_id | timestamp | price | stock_count |
---|---|---|---|
item_1 | 2022-08-01 | 100 | 50 |
item_1 | 2022-08-02 | 90 | 50 |
item_1 | 2022-08-03 | 90 | 50 |
item_2 | 2022-08-01 | 75 | 500 |
item_2 | 2022-08-02 | 75 | 500 |
item_2 | 2022-08-03 | 75 | 500 |
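One way to see why these tables are equivalent is to reduce a replacement dataset to the values that actually differ from the baseline. The sketch below does this for the last table; `minimal_changes` is a hypothetical helper written for this illustration, not a feature of the service.

```python
DATES = ["2022-08-01", "2022-08-02", "2022-08-03"]

# Baseline related time series used in the examples above.
baseline = [
    {"item_id": item, "timestamp": day, "price": price, "stock_count": stock}
    for item, price, stock in [("item_1", 100, 50), ("item_2", 75, 500)]
    for day in DATES
]

# The last table: every baseline row, with item_1 discounted on two days.
full_replacement = [dict(row) for row in baseline]
for row in full_replacement:
    if row["item_id"] == "item_1" and row["timestamp"] in ("2022-08-02", "2022-08-03"):
        row["price"] = 90


def minimal_changes(baseline, replacement, key_columns=("item_id", "timestamp")):
    """Keep only the replacement values that differ from the baseline."""
    base = {tuple(row[c] for c in key_columns): row for row in baseline}
    changes = []
    for row in replacement:
        key = tuple(row[c] for c in key_columns)
        diff = {
            col: value
            for col, value in row.items()
            if col not in key_columns and base[key][col] != value
        }
        if diff:
            changes.append({**dict(zip(key_columns, key)), **diff})
    return changes


changes = minimal_changes(baseline, full_replacement)
print(changes)
```

The result contains only the two discounted item_1 rows, which match the two-row replacement dataset shown first.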
Forecast dimensions
If you include forecast dimensions in your dataset, then you must include them in the replacement dataset. Consider this baseline related time series:
item_id | store_id | timestamp | price | stock_count |
---|---|---|---|---|
item_1 | store_1 | 2022-08-01 | 100 | 50 |
item_1 | store_1 | 2022-08-02 | 100 | 50 |
item_1 | store_1 | 2022-08-03 | 100 | 50 |
item_1 | store_2 | 2022-08-01 | 75 | 500 |
item_1 | store_2 | 2022-08-02 | 75 | 500 |
item_1 | store_2 | 2022-08-03 | 75 | 500 |
For this baseline, the replacement dataset that applies a 10% discount in all stores on 2022-08-02 is the following:
item_id | store_id | timestamp | price |
---|---|---|---|
item_1 | store_1 | 2022-08-02 | 90 |
item_1 | store_2 | 2022-08-02 | 67.5 |
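The following sketch shows one way such a replacement dataset could be derived from the baseline and serialized as CSV. The helper `discounted_rows`, the header row, and the column order are assumptions made for illustration; match them to the schema of your own related time series dataset.

```python
import csv
import io

DATES = ["2022-08-01", "2022-08-02", "2022-08-03"]

# Baseline related time series with the store_id forecast dimension.
baseline = [
    {"item_id": "item_1", "store_id": store, "timestamp": day,
     "price": price, "stock_count": stock}
    for store, price, stock in [("store_1", 100, 50), ("store_2", 75, 500)]
    for day in DATES
]


def discounted_rows(baseline, day, percent_off):
    """Build replacement rows that discount the price on one day in every store."""
    rows = []
    for row in baseline:
        if row["timestamp"] == day:
            rows.append({
                "item_id": row["item_id"],
                "store_id": row["store_id"],  # the forecast dimension must be kept
                "timestamp": row["timestamp"],
                "price": row["price"] * (100 - percent_off) / 100,
            })
    return rows


replacement = discounted_rows(baseline, "2022-08-02", 10)

# Serialize to CSV text; the header row is an assumption for this sketch.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["item_id", "store_id", "timestamp", "price"])
writer.writeheader()
writer.writerows(replacement)
csv_text = buffer.getvalue()
print(csv_text)
```

Each store keeps its own baseline price, so the discounted values differ: 90 for store_1 and 67.5 for store_2.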