Laplace distribution as a mixture of normals

Previously on this blog we showed that the t-distribution can be expressed as a continuous mixture of normal distributions. Today, I learned from this paper that the Laplace distribution can be viewed as a continuous mixture of normal distributions as well.

The Laplace distribution with mean \mu \in \mathbb{R} and scale b > 0 has the probability density function

\begin{aligned} f(x) = \frac{1}{2b} \exp \left(-\frac{|x-\mu|}{b} \right). \end{aligned}

(The Laplace distribution is sometimes known as the double exponential distribution, since each of its two tails decays like the density of an exponential random variable.) Note that if we can show that the Laplace distribution with mean 0 is a mixture of normals, then by shifting all these normals by \mu, it follows that the Laplace distribution with mean \mu is also a mixture of normals.

Fix b > 0. Let W \sim \text{Exp}(\frac{1}{2b^2}), i.e. f_W(w) = \dfrac{1}{2b^2} \exp \left(-\dfrac{w}{2b^2} \right) for w \geq 0, and let X \mid W = w \sim \mathcal{N}(0, w). We claim that X has the Laplace distribution with mean 0 and scale b. This is equivalent to showing that for any x,

\begin{aligned} f_X(x) &= \frac{1}{2b} \exp \left(-\frac{|x|}{b} \right), \\  \Leftrightarrow \quad \int_0^\infty f_{X \mid W = w}(x) f_W(w) dw &= \frac{1}{2b} \exp \left(-\frac{|x|}{b} \right). \end{aligned}

The key to showing this identity is noticing that the integrand on the LHS looks very much like the probability density function (PDF) of the inverse Gaussian distribution. Letting \lambda = |x|^2, \mu = |x|b and Z be an inverse Gaussian random variable with mean \mu and shape \lambda (so that \mathbb{E}[Z] = \mu, the fact we use in the second-to-last step),

\begin{aligned} \int_0^\infty f_{X \mid W = w}(x) f_W(w) dw &= \int_0^\infty \frac{1}{\sqrt{2\pi w}} \exp \left( - \frac{x^2}{2w} \right) \frac{1}{2b^2} \exp \left( - \frac{w}{2b^2} \right) dw \\  &= \frac{1}{2b^2} \int_0^\infty \frac{1}{\sqrt{2\pi w}} \exp \left( - \frac{w^2 + |x|^2b^2}{2b^2 w} \right) dw \\  &= \frac{1}{2b^2} \int_0^\infty \frac{1}{\sqrt{2\pi w}} \exp \left( - \frac{(w - |x|b)^2}{2b^2 w} - \frac{2w|x|b}{2b^2w} \right) dw \\  &= \frac{1}{2b^2} e^{-|x|/b} \int_0^\infty \frac{1}{\sqrt{2\pi w}} \exp \left( - \frac{|x|^2(w - |x|b)^2}{2|x|^2b^2 w} \right) dw \\  &= \frac{1}{2b^2} e^{-|x|/b} \frac{1}{\sqrt{\lambda}} \int_0^\infty w \frac{\sqrt{\lambda}}{\sqrt{2\pi w^3}} \exp \left( - \frac{\lambda(w - \mu)^2}{2\mu^2 w} \right) dw \\  &= \frac{1}{2b^2} e^{-|x|/b} \frac{1}{|x|} \mathbb{E}[Z] \\  &= \frac{1}{2|x|b^2} e^{-|x|/b} \mu \\  &= \frac{1}{2b} e^{-|x|/b}. \end{aligned}
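This mixture representation is easy to check by simulation. The sketch below (a minimal check, assuming NumPy is available; the choices b = 1.5 and the sample size are arbitrary) draws W from the exponential distribution with rate \frac{1}{2b^2}, draws X \mid W = w \sim \mathcal{N}(0, w), and compares the result with two known properties of the Laplace distribution: its variance is 2b^2, and the interval [-b \ln 2, b \ln 2] has probability 1/2.

```python
import numpy as np

rng = np.random.default_rng(0)
b = 1.5          # arbitrary scale parameter for illustration
n = 200_000      # number of Monte Carlo draws

# W ~ Exp(1 / (2 b^2)); NumPy parameterizes the exponential by its mean 2 b^2.
w = rng.exponential(scale=2 * b**2, size=n)

# X | W = w ~ N(0, w): variance w, so standard deviation sqrt(w).
x = rng.normal(loc=0.0, scale=np.sqrt(w))

# Laplace(0, b) has variance 2 b^2, and P(|X| <= b ln 2) = 1 - e^{-ln 2} = 1/2.
print(x.var())                               # should be close to 2 * b**2 = 4.5
print(np.mean(np.abs(x) <= b * np.log(2)))   # should be close to 0.5
```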

Negative binomial distribution as a mixture of Poissons

Let’s say we are running a series of experiments which are independent of each other, and each experiment has the same success probability p. Let r be a positive integer indicating the number of failures at which we stop running experiments (i.e. we stop when we hit our rth failure). When we stop, let X denote the total number of successes we had. X is said to have the negative binomial distribution, which we denote by \text{NegBin}(r, p). For k = 0, 1, \dots, the PMF of X is given by

\mathbb{P}(X = k) = \binom{k+r-1}{k}(1-p)^r p^k = \dfrac{\Gamma (r + k)}{\Gamma (k + 1) \Gamma (r)} (1-p)^r p^k.

Just as the t distribution can be viewed as a mixture of normal distributions, the negative binomial distribution can be viewed as a (continuous) mixture of Poisson distributions. Here is the statement: Let \lambda have the gamma distribution with shape r and scale \dfrac{p}{1-p}, and let X \mid \lambda \sim \text{Pois}(\lambda). Then X \sim \text{NegBin}(r,p).

The proof follows from a direct computation of the PMF. For any k = 0, 1, \dots,

\begin{aligned} \mathbb{P}(X = k) &= \int_0^\infty \mathbb{P}(X = k \mid \lambda = t) f_\lambda(t) dt \\  &= \int_0^\infty \frac{t^k e^{-t}}{k!} \cdot t^{r-1} \frac{e^{-t(1-p)/p}}{\left( \frac{p}{1-p}\right)^r \Gamma (r)} dt \\  &= \left( \frac{1-p}{p} \right)^r \frac{1}{k! \Gamma (r)} \int_0^\infty t^{r + k - 1} e^{-t / p} dt.  \end{aligned}

Note that the integrand above is a scaled version of the gamma distribution PDF with shape parameter r + k and scale parameter p. Thus, we can compute the integral exactly to get

\begin{aligned} \mathbb{P}(X = k) &= \left( \frac{1-p}{p} \right)^r \frac{1}{k! \Gamma (r)} \cdot \Gamma (r + k) p^{r+k} \\  &= \frac{\Gamma (r + k)}{\Gamma (k+1)\Gamma (r)} (1-p)^r p^k, \end{aligned}

as required.
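As a quick numerical sanity check (a sketch assuming NumPy; the values r = 3 and p = 0.4 are arbitrary illustrative choices), we can simulate the gamma-Poisson mixture and compare the empirical frequencies with the negative binomial PMF computed from the gamma-function formula above:

```python
import numpy as np
from math import lgamma, exp, log

rng = np.random.default_rng(0)
r, p = 3, 0.4    # arbitrary parameters for illustration
n = 200_000      # number of Monte Carlo draws

# lambda ~ Gamma(shape r, scale p / (1 - p)), then X | lambda ~ Pois(lambda).
lam = rng.gamma(shape=r, scale=p / (1 - p), size=n)
x = rng.poisson(lam)

def negbin_pmf(k, r, p):
    """P(X = k) = Gamma(r+k) / (Gamma(k+1) Gamma(r)) * (1-p)^r * p^k,
    computed on the log scale for numerical stability."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + r * log(1 - p) + k * log(p))

# Empirical frequency vs. exact NegBin(r, p) probability for small k.
for k in range(5):
    print(k, np.mean(x == k), negbin_pmf(k, r, p))
```

For each k the two columns should agree to about two decimal places; for instance, P(X = 0) = (1-p)^r = 0.216 here.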

Sources for the information above:

  1. Gamma-Poisson mixture, Wikipedia.

t distribution as a mixture of normals

In class, the t distribution is usually introduced like this: if X \sim \mathcal{N}(0,1) and Z \sim \chi_\nu^2 are independent, then T = \dfrac{X}{\sqrt{Z / \nu}} has t distribution with \nu degrees of freedom, denoted t_\nu or t_{(\nu)}.

Did you know that the t distribution can also be viewed as a (continuous) mixture of normal random variables? Specifically, let W have inverse-gamma distribution \text{InvGam}\left(\dfrac{\nu}{2}, \dfrac{\nu}{2} \right), and define the conditional distribution X \mid W = w \sim \mathcal{N}(0, w). Then the unconditional distribution of X is the t distribution with \nu degrees of freedom.

The proof follows directly from computing the unconditional (or marginal) density of X:

\begin{aligned} f_X(x) &= \int_0^\infty f_{X \mid W = w}(x) f_W(w) dw \\  &\propto \int_0^\infty \frac{1}{\sqrt{w}} \exp \left( -\frac{x^2}{2w} \right) \cdot w^{-\nu/2 - 1} \exp \left( - \frac{\nu}{2w} \right) dw \\  &= \int_0^\infty w^{-\frac{\nu + 1}{2} - 1} \exp \left( - \frac{x^2 + \nu}{2w} \right) dw. \end{aligned}

Note that the integrand above is proportional to the PDF of the inverse-gamma distribution with \alpha = \dfrac{\nu + 1}{2} and \beta = \dfrac{x^2 + \nu}{2}. Hence, we can evaluate the last integral exactly to get

f_X(x) \propto \Gamma \left( \dfrac{\nu + 1}{2} \right) \left(\dfrac{x^2 + \nu}{2}\right)^{-\frac{\nu + 1}{2}} \propto\left( x^2 + \nu \right)^{-\frac{\nu + 1}{2}} \propto \left( 1 + \dfrac{x^2}{\nu} \right)^{-\frac{\nu + 1}{2}},

which is proportional to the PDF of the t_\nu distribution.
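Again, this representation is easy to verify by simulation. The sketch below (assuming NumPy; \nu = 5 is an arbitrary choice) samples W from the inverse-gamma distribution by inverting a gamma random variable, then samples X \mid W = w \sim \mathcal{N}(0, w), and checks the result against the variance \frac{\nu}{\nu - 2} of the t_\nu distribution (finite for \nu > 2):

```python
import numpy as np

rng = np.random.default_rng(0)
nu = 5.0         # arbitrary degrees of freedom for illustration
n = 200_000      # number of Monte Carlo draws

# W ~ InvGam(nu/2, nu/2): if G ~ Gamma(shape nu/2, rate nu/2), i.e. NumPy
# scale 2/nu, then 1/G has the inverse-gamma distribution with these parameters.
w = 1.0 / rng.gamma(shape=nu / 2, scale=2.0 / nu, size=n)

# X | W = w ~ N(0, w).
x = rng.normal(loc=0.0, scale=np.sqrt(w))

# For nu > 2, the t_nu distribution has mean 0 and variance nu / (nu - 2).
print(x.mean())  # should be close to 0
print(x.var())   # should be close to nu / (nu - 2) = 5/3
```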

Sources for the information above:

  1. Student-t as a mixture of normals, John D. Cook Consulting.