What is Isserlis’ theorem?

I learnt about Isserlis’ theorem (also known as Wick’s probability theorem) at a talk today. The theorem comes from a 1918 paper, listed as Reference 1 below. In the words of Reference 2, the theorem

… allows [us] to express the expectation of a monomial in an arbitrary number of components of a zero mean Gaussian vector X \in \mathbb{R}^d in terms of the entries of its covariance matrix only.

We introduce some notation (as in Reference 2) to state the theorem succinctly. Let A = \{ \alpha_1, \dots, \alpha_N \} be a set of integers such that 1 \leq \alpha_i \leq d for all i. The \alpha_i need not be distinct. For any vector X \in \mathbb{R}^d, denote

\begin{aligned} X_A = \prod_{\alpha_i \in A} X_{\alpha_i}, \end{aligned}

with the convention that X_\emptyset = 1. Let \Pi (A) denote the set of all pairings of A, i.e. partitions of A into disjoint pairs. For a pairing \sigma \in \Pi (A), let A / \sigma denote a set of indices containing exactly one index from each pair, so that the pairs in \sigma can be written as \{ (\alpha_i, \alpha_{\sigma(i)}) : i \in A / \sigma \}.

(As an example, if A = \{ 1, 2, 3, 4 \}, one possible pairing \sigma is \{\{ 1, 3\}, \{ 2, 4\} \}. For this pairing, a possible choice of A / \sigma is A / \sigma = \{ 1, 2\}, with \sigma(1) = 3 and \sigma(2) = 4.)

We are now ready to state the theorem:

Theorem (Isserlis’ theorem): Let A = \{ \alpha_1, \dots, \alpha_N \} be a set of integers such that 1 \leq \alpha_i \leq d for all i, and let X \in \mathbb{R}^d be a Gaussian vector with zero mean. If N is even, then

\begin{aligned} \mathbb{E} [X_A] = \sum_{\sigma \in \Pi (A)} \prod_{i \in A / \sigma} \mathbb{E} [X_{\alpha_i} X_{\alpha_{\sigma(i)}}]. \end{aligned}

If N is odd, then \mathbb{E}[X_A] = 0.
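The sum over pairings is easy to evaluate mechanically. Here is a short Python sketch (the function names `pairings` and `isserlis_moment` are my own, and indices are 0-based) that enumerates the pairings of the index list and evaluates the right-hand side of the theorem from a covariance matrix:

```python
import numpy as np

def pairings(indices):
    """Yield all partitions of the list `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k in range(len(rest)):
        # Pair the first element with each remaining element in turn.
        pair = (first, rest[k])
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [pair] + tail

def isserlis_moment(cov, alphas):
    """E[X_{a_1} ... X_{a_N}] for a zero-mean Gaussian X with covariance `cov`.

    `cov` is a numpy array; `alphas` is a list of 0-based indices,
    with repeats allowed.
    """
    if len(alphas) % 2 == 1:
        return 0.0  # odd moments of a zero-mean Gaussian vanish
    total = 0.0
    for sigma in pairings(list(alphas)):
        prod = 1.0
        for (i, j) in sigma:
            prod *= cov[i, j]  # E[X_i X_j] is just the covariance entry
        total += prod
    return total
```

For instance, `isserlis_moment(np.eye(2), [0, 0, 0, 0])` evaluates to 3.0, the fourth moment of a standard normal.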

Here are some special cases of Isserlis’ theorem to demonstrate how to interpret the equation above. If \alpha_i = i for 1 \leq i \leq 4, there are 3 possible pairings, giving us

\begin{aligned} \mathbb{E}[X_1 X_2 X_3 X_4] = \mathbb{E}[X_1 X_2] \mathbb{E}[X_3 X_4] + \mathbb{E}[X_1 X_3] \mathbb{E}[X_2 X_4] + \mathbb{E}[X_1 X_4] \mathbb{E}[X_2 X_3]. \end{aligned}
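This four-variable identity can be sanity-checked by Monte Carlo; here is a minimal numpy sketch (the covariance matrix, seed, and sample size are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(42)
# An arbitrary symmetric, diagonally dominant (hence positive definite) covariance.
cov = np.array([[1.0, 0.3, 0.2, 0.1],
                [0.3, 1.0, 0.4, 0.2],
                [0.2, 0.4, 1.0, 0.3],
                [0.1, 0.2, 0.3, 1.0]])
x = rng.multivariate_normal(mean=np.zeros(4), cov=cov, size=1_000_000)

# Empirical estimate of E[X_1 X_2 X_3 X_4].
lhs = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])
# Isserlis' prediction: sum over the 3 pairings of products of covariances.
rhs = (cov[0, 1] * cov[2, 3]
       + cov[0, 2] * cov[1, 3]
       + cov[0, 3] * cov[1, 2])
print(lhs, rhs)
```

With this covariance matrix the right-hand side is 0.09 + 0.04 + 0.04 = 0.17, and the empirical mean should agree up to Monte Carlo error.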

If we take \alpha_i = 1 for 1 \leq i \leq 4, there are still 3 possible pairings, and we get

\begin{aligned} \mathbb{E}[X_1 X_1 X_1 X_1] &= \mathbb{E}[X_1 X_1] \mathbb{E}[X_1 X_1] + \mathbb{E}[X_1 X_1] \mathbb{E}[X_1 X_1] + \mathbb{E}[X_1 X_1] \mathbb{E}[X_1 X_1], \\  \mathbb{E}[X_1^4] &= 3 \left(\mathbb{E}[X_1^2] \right)^2. \end{aligned}

This tells us that the 4th moment of a mean-zero 1-dimensional Gaussian random variable is 3 times the square of its 2nd moment.
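A quick numerical check of this special case (a sketch; the scale, seed, and sample size below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=1_000_000)  # samples from N(0, 4)

fourth = np.mean(x ** 4)   # estimate of E[X^4]
second = np.mean(x ** 2)   # estimate of E[X^2]
# Isserlis predicts E[X^4] = 3 (E[X^2])^2 = 3 * 4^2 = 48.
print(fourth, 3 * second ** 2)
```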

As a final example, if we take \alpha_1 = \alpha_2 = 1 and \alpha_3 = \alpha_4 = 2, we still have 3 possible pairings, and we get

\begin{aligned} \mathbb{E}[X_1 X_1 X_2 X_2] &= \mathbb{E}[X_1 X_1] \mathbb{E}[X_2 X_2] + \mathbb{E}[X_1 X_2] \mathbb{E}[X_1 X_2] + \mathbb{E}[X_1 X_2] \mathbb{E}[X_1 X_2], \\  \mathbb{E} [X_1^2 X_2^2] &= \mathbb{E}[X_1^2] \mathbb{E}[X_2^2] + 2 \left( \mathbb{E}[X_1 X_2] \right)^2. \end{aligned}
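The correlated two-variable case can be checked the same way; a minimal Monte Carlo sketch (the covariance values, seed, and sample size are my own arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=1_000_000)

# Empirical estimate of E[X_1^2 X_2^2].
lhs = np.mean(x[:, 0] ** 2 * x[:, 1] ** 2)
# Isserlis' prediction: E[X_1^2] E[X_2^2] + 2 (E[X_1 X_2])^2 = 1*2 + 2*(0.5)^2.
rhs = cov[0, 0] * cov[1, 1] + 2 * cov[0, 1] ** 2
print(lhs, rhs)
```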


  1. Isserlis, L. (1918) On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables.
  2. Vignat, C. (2011) A generalized Isserlis theorem for location mixtures of Gaussian random vectors.
