CVaR | Statistical Odds & Ends

In this previous post, I introduced conditional value-at-risk (CVaR), a risk measure used in mathematical finance. $\alpha$ -CVaR is the expected value of the loss conditional on the loss being greater than the $\alpha$ -VaR. If $X$ is the random variable for the loss and $q_\alpha (X)$ is the $\alpha$ -VaR (i.e. $\alpha$ -quantile of $X$ ), then

$\begin{aligned} \alpha\text{-CVaR} = \mathbb{E} \left[ X \mid X \geq q_\alpha (X) \right] .\end{aligned}$

Lemma from Rockafellar & Uryasev

It turns out that the CVaR can be thought of as the solution to a particular minimization problem. This formulation first appeared in Rockafellar & Uryasev (2000) (Reference 1), and I’ll present a simplified version here.

Denote the $\alpha$-CVaR of $X$ by $\phi_\alpha(X)$, and assume that $X$ has a probability density function $p(\cdot)$. (Reference 1 makes a similar assumption, but it is there for mathematical simplicity and is not essential to the result.) Let $x_+ = \max(0, x)$ denote the ReLU function.

Lemma (Theorem 1 of Rockafellar & Uryasev (2000)). $\phi_\alpha (X)$ satisfies

$\begin{aligned} \phi_\alpha (X) = \min_{x \in \mathbb{R}} \left[ x + \dfrac{1}{1-\alpha}\int_{-\infty}^\infty (y - x)_+ p(y) dy \right]. \end{aligned}$

While this lemma may seem obscure, it has practical applications: Reference 1 itself uses this formulation to minimize CVaR, and Reference 2 gives a more modern application.
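The lemma can be sanity-checked numerically. The sketch below is only an illustration under assumptions not in the original post: the loss is taken to be exponential with rate 1, the integral is replaced by a sample average, and the names `ru_objective`, `var_hat` and `cvar_hat` are made up for this example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
alpha = 0.95
# Assumed loss distribution for illustration: exponential with rate 1.
losses = rng.exponential(scale=1.0, size=200_000)

def ru_objective(x, sample=losses, level=alpha):
    """Sample version of x + E[(X - x)_+] / (1 - alpha)."""
    return x + np.mean(np.maximum(sample - x, 0.0)) / (1.0 - level)

# Minimize the convex objective over the range of the sample.
result = minimize_scalar(
    ru_objective, bounds=(losses.min(), losses.max()), method="bounded"
)

# Direct estimates: empirical quantile and mean of the losses at or above it.
var_hat = np.quantile(losses, alpha)
cvar_hat = losses[losses >= var_hat].mean()

print("minimum of objective:", result.fun, " tail mean:", cvar_hat)
print("minimizer:", result.x, " empirical quantile:", var_hat)
```

If the lemma holds, the minimum value should be close to the tail mean, and the minimizer should land near the empirical $\alpha$-quantile.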

Where does this problem come from?

While interesting, the minimization problem seems to come out of nowhere. How does one even think to minimize an expression like the one on the RHS?

I spent some time thinking about it and I think I have a plausible explanation. First, let’s rewrite the expression for CVaR:

$\begin{aligned} \phi_\alpha (X) &= \mathbb{E} \left[ X \mid X \geq q_\alpha (X) \right] \\ &= \dfrac{1}{\mathbb{P}\{X \geq q_\alpha (X)\}} \int_{q_\alpha(X)}^\infty y p(y) dy \\ &= \dfrac{1}{1-\alpha} \int_{q_\alpha(X)}^\infty [q_\alpha(X) + (y - q_\alpha(X))] p(y) dy \\ &= \dfrac{1}{1-\alpha}\left[ q_\alpha (X) (1-\alpha) + \int_{q_\alpha(X)}^\infty (y - q_\alpha(X)) p(y) dy \right] \\ &= q_\alpha (X) + \frac{1}{1-\alpha} \int_{-\infty}^\infty (y- q_\alpha(X))_+ p(y) dy. \end{aligned}$

This is exactly the RHS of the lemma, except that instead of minimizing over $x$ , we just plug in $q_\alpha(X)$ for $x$ .

This explains why we might think about the expression on the RHS, but why should $\alpha$ -CVaR be the solution where we replace $q_\alpha (X)$ with $x$ , then minimize over $x$ ? The key to that mystery is that quantiles can be expressed as the solution of a minimization problem. In particular, if we define $\rho_\alpha (y) = y(\alpha - 1\{ y < 0 \})$ , then the $\alpha$ -quantile of $X$ , $q_\alpha (X)$ , satisfies

$\begin{aligned} q_\alpha(X) &= \underset{x \in \mathbb{R}}{\text{argmin}} \;\mathbb{E}[\rho_\alpha(X - x)]. \end{aligned}$

(This previous post presents this idea but for the empirical distribution of a sample. Note that if we define $L(y,z) = \rho_\alpha(y - z)$ , then $L$ is the pinball loss associated with quantile regression.) Let’s rewrite the expectation on the RHS more explicitly:

$\begin{aligned} \mathbb{E}[\rho_\alpha(X - x)] &= \mathbb{E}[(X - x)(\alpha - 1\{ X - x < 0 \})] \\ &= \int_{-\infty}^\infty (y - x)(\alpha - 1\{ y < x \}) p(y)dy \\ &= \int_{-\infty}^\infty (y-x)\alpha p(y) dy + \int_{-\infty}^\infty (y - x)(- 1 + 1 \{ y \geq x \}) p(y)dy \\ &= \int_{-\infty}^\infty (y-x)(\alpha-1) p(y) dy + \int_{-\infty}^\infty (y - x)1 \{ y \geq x \} p(y)dy \\ &= (\alpha-1) \mathbb{E}[X] - x(\alpha-1) + \int_{-\infty}^\infty (y-x)_+ p(y) dy \\ &= (\alpha - 1) \mathbb{E}[X] + (1 - \alpha) \left[ x + \frac{1}{1-\alpha} \int_{-\infty}^\infty (y-x)_+ p(y) dy \right] \end{aligned}$

We recognize the expression inside the square brackets as the expression we are trying to minimize over to get the $\alpha$ -CVaR! In summary,

$\begin{aligned} \underset{x \in \mathbb{R}}{\text{argmin}} \;\mathbb{E}[\rho_\alpha(X - x)] &= q_\alpha(X), \\ \underset{x \in \mathbb{R}}{\min} \;\mathbb{E}[\rho_\alpha(X - x)] &= (1-\alpha) \left\{ \phi_\alpha (X) - \mathbb{E}[X] \right\}. \end{aligned}$

Minimizing the expression in the lemma to get the $\alpha$-CVaR is equivalent to minimizing an affine transformation of the pinball loss, whose minimizer is the $\alpha$-quantile (or $\alpha$-VaR).
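The summary identities can be checked on a sample, where expectations become sample averages. This is a rough sketch rather than a proof: the normal loss distribution, the sample size and the helper name `pinball` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.9
# Assumed loss distribution for illustration: normal with mean 1 and sd 2.
x_sample = rng.normal(loc=1.0, scale=2.0, size=100_000)

def pinball(y, level):
    """rho_alpha(y) = y * (alpha - 1{y < 0})."""
    return y * (level - (y < 0))

q_hat = np.quantile(x_sample, alpha)              # empirical alpha-VaR
cvar_hat = x_sample[x_sample >= q_hat].mean()     # empirical alpha-CVaR

# Pinball loss evaluated at the quantile vs. (1 - alpha) * (CVaR - mean).
lhs = pinball(x_sample - q_hat, alpha).mean()
rhs = (1 - alpha) * (cvar_hat - x_sample.mean())

# Pinball loss at points around the quantile, to see that the quantile minimizes it.
grid = q_hat + np.linspace(-0.5, 0.5, 11)
grid_losses = [pinball(x_sample - x, alpha).mean() for x in grid]

print(lhs, rhs)
print(min(grid_losses) >= lhs - 1e-9)
```

The two printed numbers should agree up to floating-point error, and no grid point should give a smaller pinball loss than the quantile does.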

Credit: I learnt of CVaR and this lemma through a talk that Stefan Wager gave recently at a reading group. One of his students, Roshni Sahoo, used this lemma as the basis for a new method for learning from a biased sample (see Reference 2).

References:

1. Rockafellar, R. T., and Uryasev, S. (2000). “Optimization of conditional value-at-risk.”
2. Sahoo, R., et al. (2022). “Learning from a Biased Sample.”

In this previous post, we defined Value at Risk (VaR): given a time horizon $T$ and a level $\alpha$, the VaR of an investment at level $\alpha$ over time horizon $T$ is a number or percentage $v$ such that

Over the time horizon $T$, the probability that the loss on the investment is $v$ or more is $1 - \alpha$.

Equivalently, $\alpha$-VaR is the $\alpha$-quantile of $X$, where $X$ is the random variable for the loss over time horizon $T$.

Why looking at VaR isn’t enough

VaR helps us to understand the tail risk of the investment (in general the larger the VaR, the riskier the investment), but it doesn’t capture everything we need to know about tail risk. Consider two investments with the loss distributions shown in the figures below. Both of them have VaR at level 0.95 equal to 5%. However, investment 1 can only lose up to 7% while investment 2 can lose up to 15%. In general, most investors would prefer investment 1 to investment 2.

Conditional Value at Risk (CVaR)

While VaR is unable to distinguish between the two investments above at the 95% level, conditional VaR (CVaR) is able to do so. Conditional VaR at level $\alpha$ answers the question:

Let’s say I was unlucky and fell into the worst $1-\alpha$ of outcomes. Assuming that I’m in this set of bad outcomes, what is my expected loss?

In mathematical terms, CVaR is a conditional expectation: $\alpha$ -CVaR is the expected value of the loss conditional on the loss being greater than the $\alpha$ -VaR. If $X$ is the random variable for the loss and $q_\alpha (X)$ is the $\alpha$ -VaR, then

$\begin{aligned} \alpha\text{-CVaR} = \mathbb{E} \left[ X \mid X \geq q_\alpha (X) \right] .\end{aligned}$

In the example in the previous section, investment 1 had a 0.95-CVaR of $5 \frac{2}{3}\%$ while investment 2 had a 0.95-CVaR of $8\frac{1}{3}\%$, which reflects our intuition that investment 2 is a riskier investment.
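One tail shape that reproduces these two numbers is a loss density that falls linearly from the VaR down to zero at the maximum loss. That shape is an assumption made here for illustration, not something stated in the text, and `linear_tail_cvar` is a made-up helper. Under that assumption, the conditional mean sits one third of the way from the VaR to the maximum loss:

```python
from fractions import Fraction

def linear_tail_cvar(var, max_loss):
    """Mean of a density proportional to (max_loss - y) on [var, max_loss].

    Assumes integer inputs (losses in percent). For such a linearly
    decreasing density the mean is var + (max_loss - var) / 3.
    """
    return var + Fraction(max_loss - var, 3)

cvar_1 = linear_tail_cvar(5, 7)    # investment 1: VaR 5%, maximum loss 7%
cvar_2 = linear_tail_cvar(5, 15)   # investment 2: VaR 5%, maximum loss 15%

print(cvar_1, cvar_2)
```

This gives $17/3 = 5\frac{2}{3}$ and $25/3 = 8\frac{1}{3}$, matching the values quoted above.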

CVaR for common probability distributions

Norton et al. (2019) (Reference 3) provide closed-form CVaR formulas for several common probability distributions. Here are some of them (see the paper for the full list, with proofs in the Appendix):

If $X$ has an exponential distribution with rate parameter $\lambda$, then

$\begin{aligned} \alpha\text{-CVaR} = \dfrac{-\log (1 - \alpha) + 1}{\lambda}. \end{aligned}$

If $X$ has a Pareto distribution with shape parameter $a > 1$ and scale parameter $x_m$, then

$\begin{aligned} \alpha\text{-CVaR} = \dfrac{x_m a}{(1 - \alpha)^{1/a}(a-1)}. \end{aligned}$

(If $a \in (0, 1]$ , then $\alpha\text{-CVaR} = \infty$ .)
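As a rough check of the exponential and Pareto formulas, the sketch below compares each closed form with a numerical evaluation of $\mathbb{E}[X \mid X \geq q_\alpha(X)]$ using SciPy's `expect` method on frozen distributions. The parameter values are arbitrary choices for illustration; note that SciPy parameterizes the exponential by `scale = 1 / rate` and the Pareto by shape `b` and `scale`.

```python
import numpy as np
from scipy import stats

alpha = 0.95

# Exponential with rate lam (SciPy uses scale = 1 / rate).
lam = 2.0
expon = stats.expon(scale=1.0 / lam)
expon_closed = (-np.log(1 - alpha) + 1) / lam
expon_numeric = expon.expect(lambda y: y, lb=expon.ppf(alpha), conditional=True)

# Pareto with shape a > 1 and scale x_m (SciPy calls the shape parameter b).
a, x_m = 3.0, 1.5
pareto = stats.pareto(b=a, scale=x_m)
pareto_closed = x_m * a / ((1 - alpha) ** (1 / a) * (a - 1))
pareto_numeric = pareto.expect(lambda y: y, lb=pareto.ppf(alpha), conditional=True)

print(expon_closed, expon_numeric)
print(pareto_closed, pareto_numeric)
```

With `conditional=True`, `expect` divides the integral over $[q_\alpha, \infty)$ by the probability of that interval, which is the conditional expectation in the CVaR definition.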

If $X$ has a Laplace distribution with location parameter $\mu$ and scale parameter $b$, then

$\begin{aligned} \alpha\text{-CVaR} = \begin{cases} \mu + b \left( \frac{\alpha}{1-\alpha}\right) [1 - \log (2 \alpha)] &\text{if } \alpha < 1/2, \\ \mu + b \left[ 1 - \log [2(1-\alpha)] \right] &\text{if } \alpha \geq 1/2. \end{cases} \end{aligned}$
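The piecewise Laplace formula is easy to get wrong in code, so here is a small sketch that implements both branches and compares them with numerical integration on either side of $\alpha = 1/2$. The function name `laplace_cvar` and the parameter values are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def laplace_cvar(alpha, mu=0.0, b=1.0):
    """Piecewise closed form for the alpha-CVaR of a Laplace(mu, b) loss."""
    if alpha < 0.5:
        return mu + b * (alpha / (1 - alpha)) * (1 - np.log(2 * alpha))
    return mu + b * (1 - np.log(2 * (1 - alpha)))

mu, b = 0.3, 1.2
dist = stats.laplace(loc=mu, scale=b)

# Map each level to (closed form, numerical E[X | X >= q_alpha]).
checks = {}
for alpha in (0.2, 0.5, 0.9):
    numeric = dist.expect(lambda y: y, lb=dist.ppf(alpha), conditional=True)
    checks[alpha] = (laplace_cvar(alpha, mu, b), numeric)

for alpha, (closed, numeric) in checks.items():
    print(alpha, closed, numeric)
```

At $\alpha = 1/2$ both branches reduce to $\mu + b$, so the formula is continuous across the case split.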

If $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, then

$\begin{aligned} \alpha\text{-CVaR} = \mu + \sigma \dfrac{f \left( q_\alpha \left( \frac{X - \mu}{\sigma}\right) \right)}{1 - \alpha}, \end{aligned}$

where $f(\cdot)$ is the PDF of the standard normal distribution and $q_\alpha \left( \frac{X - \mu}{\sigma}\right)$ is the $\alpha$-quantile of the standard normal distribution.
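The normal formula can be compared against a simple Monte Carlo estimate. This is a minimal sketch with arbitrary parameter values; `normal_cvar` is a hypothetical helper name, and the Monte Carlo estimate only agrees up to sampling error.

```python
import numpy as np
from scipy import stats

def normal_cvar(alpha, mu=0.0, sigma=1.0):
    """Closed form: mu + sigma * f(z_alpha) / (1 - alpha), z_alpha the standard normal quantile."""
    z = stats.norm.ppf(alpha)
    return mu + sigma * stats.norm.pdf(z) / (1.0 - alpha)

alpha, mu, sigma = 0.95, 0.5, 2.0
closed = normal_cvar(alpha, mu, sigma)

# Monte Carlo: average of the simulated losses at or above the empirical quantile.
rng = np.random.default_rng(2)
draws = rng.normal(mu, sigma, size=1_000_000)
monte_carlo = draws[draws >= np.quantile(draws, alpha)].mean()

print(closed, monte_carlo)
```

For the standard normal at $\alpha = 0.95$, the closed form evaluates to roughly 2.06.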

If $X$ has a lognormal distribution with parameters $\mu$ and $s$, then

$\begin{aligned} \alpha\text{-CVaR} = \dfrac{1}{2}e^{\mu + \frac{s^2}{2}} \dfrac{1 + \text{erf}\left( \frac{s}{\sqrt{2}} - \text{erf}^{-1}(2\alpha - 1) \right)}{1-\alpha}, \end{aligned}$

where $\text{erf}(\cdot)$ is the error function.

If $X$ has a logistic distribution with location parameter $\mu$ and scale parameter $s$, then

$\begin{aligned} \alpha\text{-CVaR} = \mu + \dfrac{s H(\alpha)}{1 - \alpha}, \end{aligned}$

where $H(\alpha) = - \alpha \log \alpha - (1-\alpha) \log (1-\alpha)$ is the binary entropy function.

If $X$ has a $t$-distribution with $\nu > 1$ degrees of freedom, location parameter $\mu$ and scale parameter $s$, then

$\begin{aligned} \alpha\text{-CVaR} = \mu + s \left( \dfrac{\nu + T^{-1}(\alpha)^2}{(\nu - 1)(1-\alpha)} \right) \tau \left( T^{-1}(\alpha)\right), \end{aligned}$

where $T^{-1}(\cdot)$ is the inverse of the standardized $t$ -distribution’s CDF (i.e. $\mu = 0$ and $s = 1$ ), and $\tau(\cdot)$ is the PDF of the standardized $t$ -distribution.
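The sketch below checks the $t$-distribution formula numerically, taking the numerator to be $\nu + T^{-1}(\alpha)^2$ (degrees of freedom plus the squared standardized quantile) and assuming $\nu > 1$ so that the mean of the tail is finite. The helper name `student_t_cvar` and the parameter values are assumptions for illustration.

```python
import numpy as np
from scipy import stats

def student_t_cvar(alpha, nu, mu=0.0, s=1.0):
    """Closed form for nu > 1, with nu + T^{-1}(alpha)^2 in the numerator."""
    t_q = stats.t.ppf(alpha, df=nu)          # quantile of the standardized t
    density = stats.t.pdf(t_q, df=nu)        # standardized t PDF at that quantile
    return mu + s * (nu + t_q**2) / ((nu - 1) * (1 - alpha)) * density

alpha, nu, mu, s = 0.95, 5.0, 0.1, 2.0
closed = student_t_cvar(alpha, nu, mu, s)

# Numerical E[X | X >= q_alpha] for the location-scale t-distribution.
dist = stats.t(df=nu, loc=mu, scale=s)
numeric = dist.expect(lambda y: y, lb=dist.ppf(alpha), conditional=True)

print(closed, numeric)
```

The agreement follows from $\int_q^\infty y \, \tau(y) \, dy = \frac{\nu + q^2}{\nu - 1} \tau(q)$ for the standardized $t$ density.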