An information theoretic approach to uncertainty - AWS Prescriptive Guidance

An information theoretic approach to uncertainty

The explanation of uncertainty in the previous section relies only on the variance notion of uncertainty, but information theoretic notions of uncertainty exist, too. Incorporating information theoretic aleatoric uncertainty improves robustness of the total uncertainty estimate (Gal 2016; Hein, Andriushchenko, and Bitterwolf 2019; van Amersfoort et al. 2020). Total uncertainty is measured by Shannon’s entropy:

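$$H(y \mid x) = -\,p(y \mid x) \cdot \log p(y \mid x) = -\sum_{k=1}^{K} p(y = k \mid x)\,\log p(y = k \mid x)$$

where $p(y \mid x)$ is the vector of predicted class probabilities, $\cdot$ is the dot product operator, and $K$ is the number of classes.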
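
As a minimal sketch of the predictive entropy described above, the following assumes that the model's softmax output is available as a plain list of class probabilities. The function name `predictive_entropy` and the example vectors are illustrative and do not come from this guide.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one vector of class probabilities.

    Equivalent to the negative dot product of the vector with its
    element-wise logarithm. Zero-probability classes are skipped,
    following the convention 0 * log(0) = 0.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical softmax outputs for a 4-class problem.
uniform = [0.25, 0.25, 0.25, 0.25]    # model has no preference
confident = [0.97, 0.01, 0.01, 0.01]  # model strongly prefers class 0

print(predictive_entropy(uniform))    # equals log(4), the maximum for 4 classes
print(predictive_entropy(confident))  # much closer to 0
```

The entropy is largest when the prediction is uniform over the classes and approaches zero as the probability mass concentrates on a single class.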

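The predictive entropy is available to both Bayesian and non-Bayesian neural networks. To decompose this total uncertainty into its epistemic and aleatoric components, you must estimate the mutual information $I(y; \theta \mid x)$ between the prediction $y$ and the model parameters $\theta$, and this requires a Bayesian approach.

$$I(y; \theta \mid x) = H(y \mid x) - \mathbb{E}_{p(\theta \mid \mathcal{D})}\big[H(y \mid x, \theta)\big]$$

where $H(y \mid x)$ is the predictive entropy, $\mathcal{D}$ is the training data, and the expectation is taken over the posterior distribution of the parameters.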
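
One common way to estimate the mutual information is by Monte Carlo sampling: take the entropy of the averaged prediction and subtract the average entropy of the individual predictions. The sketch below assumes that you already have several class-probability vectors for the same input, each from a different posterior sample (for example, stochastic forward passes or ensemble members). The names `decompose_uncertainty` and `sampled_probs` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats), with 0 * log(0) treated as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(sampled_probs):
    """Split predictive entropy into aleatoric and epistemic parts.

    sampled_probs: list of T probability vectors for one input,
    each produced by a different sample of the model parameters.
    Returns (total, aleatoric, epistemic), all in nats.
    """
    t = len(sampled_probs)
    k = len(sampled_probs[0])
    # Average the T predictions class by class.
    mean_probs = [sum(s[i] for s in sampled_probs) / t for i in range(k)]
    total = entropy(mean_probs)                           # predictive entropy
    aleatoric = sum(entropy(s) for s in sampled_probs) / t  # expected entropy
    epistemic = total - aleatoric                         # mutual information estimate
    return total, aleatoric, epistemic


# Samples agree that the input is ambiguous: uncertainty is all aleatoric.
print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))

# Samples are each certain but disagree: uncertainty is all epistemic.
print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

Both examples have the same total uncertainty, because the averaged prediction is identical. Only the decomposition reveals whether the uncertainty comes from the data or from disagreement among parameter samples.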