Case study: Retail demand forecasting problem for an e-commerce business
To illustrate forecasting concepts in more detail, consider the case of an e-commerce business that sells products online. Optimizing decisions in the supply chain (for example, in-stock management) is critical to the core competitiveness of this business because it helps ensure that the right number of products is held in the appropriate fulfillment locations. This means offering a large selection with shorter shipping times and competitive prices, which leads to higher customer satisfaction. The key input into the supply chain software system is a prediction of demand, that is, a forecast of potential sales for every product in the catalog. This forecast enables important downstream decisions, key among them being:
- Macro-level planning (strategic forecasting): For the business as a whole, what is the projected growth in terms of total sales/revenue? Where should the business be (more) active geographically? How should labor be staffed?
- Demand (or inventory) forecasting: How many units of each product are expected to be sold per location?
- Promotional activity (tactical forecasting): How should promotions be run? Should products be liquidated?
The rest of the case study focuses on the second problem, which is part of the family of operational forecasting problems (Januschowski & Kolassa, 2019). This document is organized around the main concerns of such a problem: data, models (predictors), inferences (forecasts), and productionization.
For this case study, it is important to keep in mind that the forecasting problem is a means to an end. Although the forecasts are crucially important for the business, the downstream supply chain decisions matter even more. In our case study, these decisions are made by automated buying systems that rely on mathematical optimization models from operations research. These systems try to minimize the expected cost for the business.
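As a minimal sketch of what minimizing expected cost could look like, the following Python example weights the cost of each possible demand outcome by its probability and picks the order quantity with the lowest weighted average. The demand scenarios, the per-unit cost values, and the function names are illustrative assumptions, not part of any actual buying system.

```python
# Hypothetical demand outcomes (units) and their probabilities; values are illustrative.
scenarios = [(80, 0.2), (100, 0.5), (140, 0.3)]

# Assumed per-unit costs: holding an unsold unit vs. missing a sale.
HOLDING_COST = 1.0
STOCKOUT_COST = 4.0


def cost(order_qty, demand):
    """Cost of having order_qty units on hand when `demand` units are requested."""
    overstock = max(order_qty - demand, 0)
    understock = max(demand - order_qty, 0)
    return HOLDING_COST * overstock + STOCKOUT_COST * understock


def expected_cost(order_qty, scenarios):
    """Probability-weighted average cost over all demand outcomes."""
    return sum(prob * cost(order_qty, demand) for demand, prob in scenarios)


def best_order_quantity(scenarios, candidates):
    """Candidate order quantity with the lowest expected cost."""
    return min(candidates, key=lambda qty: expected_cost(qty, scenarios))


best_qty = best_order_quantity(scenarios, range(0, 201))
print(best_qty, expected_cost(best_qty, scenarios))
print(100, expected_cost(100, scenarios))  # ordering the single most likely demand
```

Under these assumed costs, ordering only the most likely demand (100 units) has a higher expected cost than the quantity chosen by considering every outcome, because running out of stock is penalized more heavily than holding excess units.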
The key word is expected, meaning that the forecasts ought to cover not only one possible future but all possible futures, with the appropriate weighting according to the probability of a particular outcome. To this end, the key enabler for downstream decision making is a full distribution of the forecast values rather than just having a point forecast. The following figure shows a probabilistic forecast (also referred to as a density forecast). Note that you can derive a single point forecast (the most likely future) easily from this probabilistic forecast, but going from a point forecast to a probabilistic forecast is more difficult.
Given a probabilistic forecast, you can obtain different statistics from it and tailor the results to the decision you want to make. The e-commerce business may have a number of key products that it almost never wants to be out of stock. In this case, stock to a high quantile of the forecast distribution (for example, the 90th percentile), which means demand is expected to be covered, and the product in stock, about 90% of the time. For other products, such as those for which replacements are easily found (for example, pencils), a lower percentile may be more appropriate.
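The following Python sketch illustrates this idea under stated assumptions: the probabilistic forecast is represented as a set of sampled demand values (simulated here, not real data), and the helper `quantile` and the product names are hypothetical. It is one simple way to read a stocking level off a forecast distribution, not a prescribed method.

```python
import math
import random


def quantile(samples, q):
    """Empirical quantile: smallest sample value v such that at least a
    fraction q of the samples are <= v (requires 0 < q <= 1)."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(samples)
    index = max(math.ceil(q * len(ordered)) - 1, 0)
    return ordered[index]


random.seed(0)
# Simulated stand-in for a probabilistic forecast: 1,000 sampled demand values.
demand_samples = [max(0, round(random.gauss(100, 20))) for _ in range(1000)]

# Assumed target quantile per product type (illustrative choices).
target_quantile = {"key_product": 0.9, "pencils": 0.5}

stock_level = {
    product: quantile(demand_samples, q) for product, q in target_quantile.items()
}

# Share of sampled demand outcomes that the key product's stock level covers.
covered = sum(d <= stock_level["key_product"] for d in demand_samples) / len(demand_samples)
print(stock_level, covered)
```

The higher the chosen quantile, the more stock is held and the less often sampled demand exceeds it, which is the trade-off between holding cost and availability described above.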
In Amazon Forecast, you can obtain different quantiles from the probabilistic forecast easily.
In the preceding figure, the black line shows the actual values; the dark green line is the median of the forecast distribution; the dark green shaded area is the prediction interval into which you expect 50% of the actual values to fall; and the light green area is the prediction interval into which you expect 90% of the actual values to fall.
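The median and the two shaded bands can be computed from a sample-based forecast as in the following Python sketch. It assumes central prediction intervals (for example, the 5th to 95th percentile for the 90% band) and uses simulated samples; the function names and the data are illustrative assumptions.

```python
import math
import random


def quantile(samples, q):
    """Empirical quantile: smallest sample value v such that at least a
    fraction q of the samples are <= v (requires 0 < q <= 1)."""
    ordered = sorted(samples)
    index = max(math.ceil(q * len(ordered)) - 1, 0)
    return ordered[index]


def prediction_interval(samples, coverage):
    """Central interval expected to contain a `coverage` share of outcomes."""
    tail = (1 - coverage) / 2
    return quantile(samples, tail), quantile(samples, 1 - tail)


def empirical_coverage(samples, interval):
    """Share of samples that fall inside the interval (bounds included)."""
    low, high = interval
    return sum(low <= s <= high for s in samples) / len(samples)


random.seed(42)
# Simulated forecast: 1,000 sampled demand values for each of 3 future time steps.
sample_paths = [[random.gauss(mean, 15) for _ in range(1000)] for mean in (100, 110, 120)]

bands = []
for samples in sample_paths:
    bands.append({
        "median": quantile(samples, 0.5),                # dark green line
        "interval_50": prediction_interval(samples, 0.5),  # dark green area
        "interval_90": prediction_interval(samples, 0.9),  # light green area
    })

for step, band in enumerate(bands, start=1):
    print(step, band)
```

The 50% interval always lies inside the 90% interval, which is why the dark band in such a plot is nested within the light one.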
The following sections cover the steps involved for solving the forecasting problem for this business, including: