Monte Carlo dropout - AWS Prescriptive Guidance

Monte Carlo dropout

One of the most popular ways to estimate uncertainty is by inferring predictive distributions with Bayesian neural networks. The predictive distribution is denoted as follows:

$$p(y \mid x, D) = \int p(y \mid x, \theta)\, p(\theta \mid D)\, d\theta$$

with target $y$, input $x$, parameters $\theta$, and $n$ training examples $D = \{(x_i, y_i)\}_{i=1}^{n}$. When you obtain a predictive distribution, you can inspect the variance and uncover uncertainty. One way to learn a predictive distribution is to learn a distribution over functions, or, equivalently, a distribution over the parameters (that is, the parametric posterior distribution $p(\theta \mid D)$).

The Monte Carlo (MC) dropout technique (Gal and Ghahramani 2016) provides a scalable way to learn a predictive distribution. MC dropout works by randomly switching off neurons in a neural network, which regularizes the network. Each dropout configuration corresponds to a different sample from the approximate parametric posterior distribution $q(\theta \mid D)$:

$$\theta_i \sim q(\theta \mid D), \quad i = 1, \dots, S$$

where $\theta_i$ corresponds to a dropout configuration, or, equivalently, a simulation $\theta_i \sim q(\theta \mid D)$ sampled from the approximate parametric posterior $q(\theta \mid D)$, as shown in the following figure. Sampling from the approximate posterior $q(\theta \mid D)$ enables Monte Carlo integration of the model’s likelihood, which uncovers the predictive distribution, as follows:

$$p(y \mid x, D) \approx \int p(y \mid x, \theta)\, q(\theta \mid D)\, d\theta \approx \frac{1}{S} \sum_{i=1}^{S} p(y \mid x, \theta_i)$$
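To make the sampling concrete, the following sketch (a hypothetical two-layer NumPy network, not from this guide) treats each randomly drawn dropout mask as one dropout configuration $\theta_i$; two forward passes on the same input therefore generally return different outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer network with fixed, already "trained" weights.
W1 = rng.normal(size=(3, 64))
W2 = rng.normal(size=(64, 1))
p_drop = 0.5  # dropout probability

def forward(x, rng):
    """One stochastic forward pass: the sampled dropout mask plays the role
    of one dropout configuration, i.e., one draw theta_i ~ q(theta | D)."""
    h = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop      # randomly switch neurons off
    h = h * mask / (1.0 - p_drop)             # inverted-dropout rescaling
    return h @ W2

x = rng.normal(size=(1, 3))
y1 = forward(x, rng)   # output under dropout configuration theta_1
y2 = forward(x, rng)   # output under dropout configuration theta_2
```

Averaging the likelihood over many such passes is the Monte Carlo integration described in the text.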

For simplicity, the likelihood may be assumed to be Gaussian distributed:

$$p(y \mid x, \theta) = \mathcal{N}\big(y \mid f(x, \theta),\, s^{2}(x, \theta)\big)$$

with the Gaussian density $\mathcal{N}$ specified by the mean $f(x, \theta)$ and variance $s^{2}(x, \theta)$ parameters, which are output by simulations from the Monte Carlo dropout BNN:

$$y \mid x, \theta_i \sim \mathcal{N}\big(f(x, \theta_i),\, s^{2}(x, \theta_i)\big), \quad \theta_i \sim q(\theta \mid D)$$
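A common way to summarize the resulting mixture of Gaussians (one component per simulation) is moment matching via the law of total variance. The sketch below assumes hypothetical per-simulation outputs `f` (means) and `s2` (variances) from a network with mean and variance output heads; the numbers are illustrative, not from the guide.

```python
import numpy as np

# Hypothetical outputs for one input x from S = 50 dropout simulations
# theta_i ~ q(theta | D):
#   f[i]  = f(x, theta_i)     predicted mean
#   s2[i] = s^2(x, theta_i)   predicted variance
rng = np.random.default_rng(1)
S = 50
f = rng.normal(loc=2.0, scale=0.3, size=S)
s2 = np.full(S, 0.25)

# Moment-match the mixture of S Gaussians into a single predictive Gaussian.
pred_mean = f.mean()
# Law of total variance: average per-simulation variance plus the variance
# of the means (the second term reflects disagreement between dropout
# configurations, i.e., model uncertainty).
pred_var = s2.mean() + f.var()
```

The predictive variance is therefore always at least as large as the average per-simulation variance.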

The following figure illustrates MC dropout. Each dropout configuration yields a different output by randomly switching neurons off (gray circles) and on (black circles) with each forward propagation. Multiple forward passes with different dropout configurations yield a predictive distribution over the mean, $p(f(x, \theta))$.

MC dropout

The number of forward passes should be evaluated quantitatively for each application, but 30–100 is an appropriate range to consider (Gal and Ghahramani 2016).
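As a rough illustration of that range, the following hypothetical NumPy sketch (a toy dropout network, not from this guide) estimates the predictive standard deviation from 30 and from 100 forward passes. Both runs estimate the same quantity; using more passes only reduces the Monte Carlo noise of the estimate itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" network (illustrative weights); dropout stays ON at inference.
W1 = rng.normal(size=(3, 32))
W2 = rng.normal(size=(32, 1))
p_drop = 0.5

def forward(x, rng):
    h = np.maximum(x @ W1, 0.0)                 # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop        # fresh dropout configuration
    return (h * mask / (1.0 - p_drop)) @ W2     # inverted-dropout rescaling

x = rng.normal(size=(1, 3))

def mc_std(S, rng):
    """Predictive standard deviation estimated from S forward passes."""
    samples = np.stack([forward(x, rng) for _ in range(S)])
    return samples.std(axis=0).item()

std_30 = mc_std(30, np.random.default_rng(1))    # low end of the range
std_100 = mc_std(100, np.random.default_rng(2))  # high end of the range
```

In practice, you can increase the number of passes until the predictive statistics you care about stop changing appreciably.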