Aleatoric uncertainty
Aleatoric uncertainty refers to the data's inherent randomness that cannot be explained away (aleator is Latin for a dice player). Examples of data with aleatoric uncertainty include noisy telemetry data, low-resolution images, and social media text. You can assume the aleatoric uncertainty $\sigma^2$, the inherent randomness, to be either constant (homoscedastic) or variable (heteroscedastic) as a function of the input explanatory variables $x$.
Homoscedastic aleatoric uncertainty
Homoscedastic aleatoric uncertainty, when $\sigma^2$ is constant, is the simplest case and is commonly encountered in regression under the modeling assumption that $y = f(x) + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$, where $I$ is the identity matrix and $\sigma^2$ is a constant scalar. Assuming constant aleatoric risk (that the noise about a response is independent of the explanatory variables and constant) is highly restrictive and rarely reflective of reality. Many phenomena in nature do not exhibit constant randomness. For example, uncertainty about outcomes in physical systems, such as fluid motion, is usually a function of kinetic energy. Consider the contrast between the turbulent flow of a large waterfall and the laminar flow of a decorative fountain: the stochasticity (randomness) of a water particle's trajectory is a function of the kinetic energy and is therefore not constant. Assuming homoscedasticity when modeling relationships between targets and inputs that carry variable noise, noise that cannot be explained with the observable information, leads to a loss of valuable information. As a consequence, in most cases it is not sufficient to assume homoscedastic uncertainty. Unless the phenomenon is known to be homoscedastic in nature, the inherent noise should be modeled as a function of the explanatory variables $x$, where possible.
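To make the homoscedastic assumption concrete, here is a minimal sketch (assuming PyTorch; the class name HomoscedasticRegressor and the layer sizes are illustrative, not taken from the text) of a regression model that learns a single constant noise variance $\sigma^2$ alongside the mean function $f(x)$ by minimizing the Gaussian negative log-likelihood.

```python
import torch
import torch.nn as nn

class HomoscedasticRegressor(nn.Module):
    """Regression under y = f(x) + eps, eps ~ N(0, sigma^2 I),
    with a single constant (homoscedastic) noise variance."""

    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        # One learnable scalar log-variance shared by every input.
        self.log_var = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return self.f(x)

    def nll(self, x, y):
        # Gaussian negative log-likelihood with a constant sigma^2
        # (up to an additive constant).
        mu = self.forward(x)
        return 0.5 * (self.log_var + (y - mu) ** 2 / self.log_var.exp()).mean()
```

Because log_var is a single scalar, the model can only shrink or inflate its uncertainty globally; it has no way to express that some regions of the input space are noisier than others, which is exactly the limitation discussed above.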
Heteroscedastic aleatoric uncertainty
Heteroscedastic aleatoric uncertainty is when we consider the inherent randomness within the data to be a function of the data itself, $\sigma^2(x)$. To calculate this type of uncertainty, you average a sample set of the predictive variance:

$$\hat{\sigma}^2_{\text{aleatoric}}(x) = \frac{1}{T} \sum_{t=1}^{T} \sigma^2_{\hat{\theta}_t}(x)$$

with $\sigma^2_{\hat{\theta}_t}(x)$ being estimated by a BNN. Learning aleatoric uncertainty during training encourages BNNs to encapsulate the inherent randomness within the data that can't be explained away. If there is no inherent randomness, $\sigma^2(x)$ should tend toward zero.
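Here is a minimal sketch of that procedure (assuming PyTorch, with Monte Carlo dropout standing in for the approximate BNN; HeteroscedasticBNN, aleatoric_uncertainty, and the hyperparameters are illustrative). The network predicts an input-dependent mean and log-variance, and the aleatoric estimate is the average of the predicted variances $\sigma^2_{\hat{\theta}_t}(x)$ over stochastic forward passes.

```python
import torch
import torch.nn as nn

class HeteroscedasticBNN(nn.Module):
    """Predicts an input-dependent mean and log-variance.
    Dropout kept active at test time acts as an approximate BNN."""

    def __init__(self, in_dim, hidden=64, p_drop=0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
        )
        self.mean_head = nn.Linear(hidden, 1)
        self.log_var_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.log_var_head(h)

    def nll(self, x, y):
        # Heteroscedastic Gaussian negative log-likelihood:
        # the model learns sigma^2(x) alongside the mean.
        mu, log_var = self.forward(x)
        return 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()


def aleatoric_uncertainty(model, x, n_samples=50):
    """Average the predicted variance sigma^2(x) over n_samples
    stochastic forward passes (approximate weight samples)."""
    model.train()  # keep dropout active so each pass samples new weights
    with torch.no_grad():
        variances = torch.stack(
            [model(x)[1].exp() for _ in range(n_samples)]
        )
    return variances.mean(dim=0)  # per-input aleatoric variance estimate
```

During training you would minimize nll; at prediction time, aleatoric_uncertainty returns the averaged sampled variances, which should tend toward zero for inputs whose targets carry no inherent noise.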