# How heavy-tailed is the t distribution?

It’s well known that the $t$ distribution has heavier tails than the normal distribution, and that the fewer the degrees of freedom, the heavier the tails. With 1 degree of freedom, the $t$ distribution is exactly the Cauchy distribution, and as the degrees of freedom go to infinity, it converges to the normal distribution.

One way to measure the “heavy-tailedness” of a distribution is by computing the probability of the random variable taking a value that is at least $x$ standard deviations (SD) away from its mean. The larger those probabilities are, the more heavy-tailed a distribution is.
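As a minimal sketch of this measure, the helper below returns the two-sided probability of landing at least `k` SDs from the mean. The name `tail_prob_sd` is made up for illustration, and the sketch assumes more than 2 degrees of freedom so that the SD is finite.

```r
# Two-sided probability of being at least `k` SDs from the mean (which is 0).
# `tail_prob_sd` is a hypothetical helper name, not part of any package.
tail_prob_sd <- function(k, df = Inf) {
  stopifnot(df > 2)  # the SD is finite only for df > 2
  # SD of the t distribution; df = Inf is the standard normal with SD 1
  sd_t <- if (is.infinite(df)) 1 else sqrt(df / (df - 2))
  2 * pt(-k * sd_t, df = df)
}

tail_prob_sd(3)          # normal: about 0.0027
tail_prob_sd(3, df = 5)  # t with 5 degrees of freedom: noticeably larger
```

By this measure, being 3 SD out is more likely under a $t$ distribution with 5 degrees of freedom than under the normal distribution.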

The code below computes the (two-sided) tail probabilities for the $t$ distribution over a range of degrees of freedom, with infinite degrees of freedom standing in for the normal distribution. Because the probabilities are so small, we compute the log10 of these probabilities instead. Hence, a value of -3 corresponds to a probability of $10^{-3}$, or a 1-in-1,000 chance.

library(ggplot2)

dfVal <- c(Inf, 100, 50, 30, 10, 5, 3, 2.1)  # Inf = normal distribution
sdVal <- 1:10                                # number of SDs from the mean

# For each df, compute log10 of P(|T| >= k * SD) for k = 1, ..., 10
tbl <- lapply(dfVal, function(df) {
  # SD of the t distribution (finite only for df > 2)
  stdDev <- if (is.infinite(df)) 1 else sqrt(df / (df - 2))
  data.frame(df = df,
             noSD = sdVal,
             log10Prob = log10(2 * pt(-(sdVal) * stdDev, df = df)))
})

tbl <- do.call(rbind, tbl)
tbl$df <- factor(tbl$df)

# `linewidth` replaces the deprecated `size` for lines in ggplot2 >= 3.4.0
ggplot(tbl, aes(x = noSD, y = log10Prob, col = df)) +
  geom_line(linewidth = 1) +
  scale_color_brewer(palette = "Spectral", direction = 1) +
  labs(x = "No. of SD", y = "log10(Probability of being >= x SD from mean)",
       title = "Tail probabilities for t distribution",
       col = "Deg. of freedom") +
  theme_bw()


Don’t be fooled by the scale on the vertical axis! For a $t$ distribution with 3 degrees of freedom, the probability of being 10 SD out is about 1-in-2,400. For a normal distribution (infinite degrees of freedom in the figure), that same probability is about 1-in-65,000,000,000,000,000,000,000! (That’s 65 followed by 21 zeros. As a comparison, the number of stars in the universe is estimated to be around $10^{24}$, or 1 followed by 24 zeros.)
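The two 1-in-N figures can be checked directly. This is a small sketch using base R's `pt` and `pnorm`; the variable names are just for illustration.

```r
# t distribution with 3 degrees of freedom: SD is sqrt(3 / (3 - 2)) = sqrt(3)
p_t3 <- 2 * pt(-10 * sqrt(3), df = 3)

# Standard normal distribution: SD is 1
p_norm <- 2 * pnorm(-10)

1 / p_t3    # roughly 2,400
1 / p_norm  # roughly 6.5e22
```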

If you look closely at the figure, you might notice something a little odd with $df = 2.1$: for any number of SDs, the probability of being that many SDs out is lower for $df = 2.1$ than for $df = 3$. Does that mean that $df = 2.1$ is less heavy-tailed than $df = 3$?

Not necessarily. A $t$ distribution with $\nu > 2$ degrees of freedom has SD $\sqrt{\nu / (\nu - 2)}$. For $\nu = 3$, the SD is about 1.73, while for $\nu = 2.1$ the SD is about 4.58, much larger! So "10 SD out" sits much further from the mean for $\nu = 2.1$ than for $\nu = 3$. Taking this to the extreme, consider a $t$ distribution with $\nu = 2$. The variance is infinite in this case, so the random variable always takes values within 1 SD of the mean! Does that mean this distribution is less heavy-tailed than the normal distribution?
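One way to see the effect of the SD rescaling is to compare the tail probabilities at a fixed absolute cutoff with those at a fixed number of SDs. The sketch below does this for $\nu = 3$ and $\nu = 2.1$; the cutoff of 10 is an arbitrary choice for illustration, and the variable names are made up.

```r
dfs <- c(3, 2.1)
sds <- sqrt(dfs / (dfs - 2))  # SD formula, valid for df > 2
round(sds, 2)                 # 1.73 4.58

# Same absolute cutoff for both: P(|T| >= 10), no rescaling by the SD
p_abs <- 2 * pt(-10, df = dfs)

# Same number of SDs for both: P(|T| >= 10 * SD), as in the figure
p_sd <- 2 * pt(-10 * sds, df = dfs)

p_abs[2] > p_abs[1]  # TRUE: df = 2.1 has the larger tail at a fixed cutoff
p_sd[2] < p_sd[1]    # TRUE: the ordering flips once we measure in SDs
```

At the same absolute cutoff, $\nu = 2.1$ puts more mass in the tails than $\nu = 3$. The ordering only reverses because the SD for $\nu = 2.1$ pushes the cutoff so much further out.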

Looks like we might need another way to define heavy-tailedness!

Update (2021-11-06): This blog post contains a nice discussion on some of the weirdness we see when the degrees of freedom for the $t$ distribution is between 2 and 3.