Decomposing uncertainty
Bayesian neural networks (BNNs) yield a predictive distribution \(p(y \mid x)\) rather than a single point prediction. Sampling from it provides a set of different predictions from which you can estimate a variance \(\mathrm{Var}(y \mid x)\); that is, the total predictive uncertainty. The law of total variance splits this total into two components of uncertainty:

\[
\mathrm{Var}(y \mid x) \;=\; \mathrm{Var}_{\theta}\big[\mathbb{E}(y \mid x, \theta)\big] \;+\; \mathbb{E}_{\theta}\big[\mathrm{Var}(y \mid x, \theta)\big]
\]
The expected value of the target variable \(y\), given input \(x\) and the random parameters \(\theta\) that specify a BNN, is estimated by the BNN with a single forward propagation and denoted \(\mu(x, \theta)\). The variance of the target, given input and random parameters, is output by the BNN too, and denoted \(\sigma^2(x, \theta)\). Thus, the total predictive uncertainty is the sum of these two terms:

\(\mathrm{Var}_{\theta}\big[\mu(x, \theta)\big]\), the variance of the BNN's predicted means — the epistemic uncertainty

\(\mathbb{E}_{\theta}\big[\sigma^2(x, \theta)\big]\), the average of the BNN's predicted variances — the aleatoric uncertainty
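This decomposition can be checked numerically. The following toy sketch (invented for illustration; the three parameter settings and their Gaussian predictive distributions are not from the original text) computes the variance of the means plus the mean of the variances, and cross-checks the result against the variance of directly simulated targets:

```python
import random
import statistics

# Toy "BNN" at one fixed input x: three equally likely parameter settings
# theta_t, each defining a Gaussian predictive distribution for the target y.
means = [1.0, 2.0, 3.0]       # mu(x, theta_t) = E[y | x, theta_t]
variances = [0.5, 0.5, 0.5]   # sigma^2(x, theta_t) = Var[y | x, theta_t]

# Law of total variance: Var(y | x) = Var_theta[mean] + E_theta[variance]
epistemic = statistics.pvariance(means)   # variance of the predicted means
aleatoric = statistics.mean(variances)    # average of the predicted variances
total = epistemic + aleatoric

# Cross-check against a direct Monte Carlo simulation of y.
random.seed(0)
draws = []
for _ in range(100_000):
    t = random.randrange(len(means))      # pick a parameter setting uniformly
    draws.append(random.gauss(means[t], variances[t] ** 0.5))
empirical = statistics.pvariance(draws)

print(epistemic, aleatoric, total, empirical)
```

The empirical variance of the simulated targets agrees with the sum of the two components, up to Monte Carlo error.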
The following formulas demonstrate how to calculate total uncertainty following Kendall and Gal (2017). A BNN takes an input \(x\), generates a random parameter configuration \(\theta_t \sim p(\theta)\), and makes a single forward propagation through the network to output a mean \(\mu(x, \theta_t)\) and a variance \(\sigma^2(x, \theta_t)\); the tilde \(\sim\) denotes a random generation, or simulation. With \(x\) fixed, you can repeat this process \(T\) times to yield a set:

\[
\big\{\big(\mu(x, \theta_t),\, \sigma^2(x, \theta_t)\big)\big\}_{t=1}^{T}, \qquad \theta_t \sim p(\theta)
\]
These \(T\) samples provide the statistics needed to estimate the uncertainties. You do this by estimating the epistemic and aleatoric uncertainties separately and then taking their sum, as shown previously in the first equation in this section:

\[
\mathrm{Var}(y \mid x) \;\approx\; \frac{1}{T}\sum_{t=1}^{T}\big(\mu(x, \theta_t) - \bar{\mu}\big)^2 \;+\; \frac{1}{T}\sum_{t=1}^{T}\sigma^2(x, \theta_t), \qquad \bar{\mu} = \frac{1}{T}\sum_{t=1}^{T}\mu(x, \theta_t)
\]
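As a minimal sketch of this estimator, assuming the per-pass means and variances are already available — here they are simply drawn from invented stand-in distributions rather than produced by a real BNN:

```python
import numpy as np

# Stand-in for T stochastic forward passes through a BNN at one fixed input x:
# each pass t draws theta_t and returns a mean mu_t and a variance sigma2_t.
# The distributions below are invented for illustration only.
rng = np.random.default_rng(42)
T = 1_000
mu = rng.normal(loc=2.0, scale=0.3, size=T)   # mu(x, theta_t)
sigma2 = rng.uniform(0.4, 0.6, size=T)        # sigma^2(x, theta_t)

# Decomposition in the style of Kendall and Gal (2017):
epistemic = mu.var()          # variance of the predicted means
aleatoric = sigma2.mean()     # average of the predicted variances
total = epistemic + aleatoric

print(f"epistemic={epistemic:.3f}  aleatoric={aleatoric:.3f}  total={total:.3f}")
```

With samples from a real BNN (for example, repeated forward passes with dropout left active), the two arrays would hold the network's outputs instead of these stand-in draws; the two summary statistics are unchanged.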