In this previous post, we explored how heavy-tailed the $t$ distribution is through the question: “What is the probability that the random variable is at least $x$ standard deviations (SDs) away from the mean?” For the most part, the smaller the degrees of freedom, the larger this probability was (more “heavy-tailed”), until we realized that the trend reversed for really small degrees of freedom (2.1 in the post). In fact, for $1 < \mathrm{df} \leq 2$, the variance of the $t$ distribution is infinite, and so the random variable is always within 1 SD of the mean!
We need another way to think about heavy-tailedness. (The code to produce the figures in this post is available here.)
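For reference, the SD-based comparison from the previous post can be sketched roughly as follows. This is only an illustration, not the code behind the figures: the helper names `tSD` and `probBeyondSDs`, the choice of 3 SDs, and the degrees of freedom used are all assumptions. It relies on the fact that a $t$ distribution with more than 2 degrees of freedom has variance df / (df - 2).

```r
# SD of a t distribution. The variance formula df / (df - 2) only
# applies when df > 2 (hypothetical helper, not from the post).
tSD <- function(df) {
  stopifnot(all(df > 2))
  sqrt(df / (df - 2))
}

# P(|T| >= x * SD) for T ~ t_df, using the symmetry of the t distribution.
probBeyondSDs <- function(df, x) {
  2 * pt(x * tSD(df), df = df, lower.tail = FALSE)
}

dfs <- c(30, 5, 3, 2.1)  # illustrative values only
probs <- sapply(dfs, probBeyondSDs, x = 3)
names(probs) <- paste0("df=", dfs)
print(probs)
```

As the degrees of freedom approach 2 from above, `tSD(df)` grows without bound, which is why this yardstick stops being informative.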
A first approach that doesn’t work
You might be wondering: why didn’t I just plot $P\{T > x\}$ with $T \sim t_{\mathrm{df}}$ against $x$, for various values of $x$ and $\mathrm{df}$? If I did that, I would have ended up with the plot below (for the log of the probabilities):
That seems to be exactly what we want: the smaller the degrees of freedom, the slower this probability decays…
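A minimal sketch of how such unscaled log tail probabilities could be computed is shown here. The grid of `x` values and the degrees of freedom are illustrative assumptions, not necessarily those used for the figure.

```r
x <- seq(0, 10, by = 0.5)   # illustrative grid
dfs <- c(1, 2, 5, 30)       # illustrative degrees of freedom

# One column per df: log10 of P(T > x) for T ~ t_df, with no rescaling.
logTail <- sapply(dfs, function(df) {
  log10(pt(x, df = df, lower.tail = FALSE))
})
colnames(logTail) <- paste0("df=", dfs)

print(round(logTail[x %in% c(2, 5, 10), ], 3))
```

The resulting matrix can be drawn with base R's `matplot(x, logTail, type = "l")`, one line per value of the degrees of freedom.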
The problem is that the comparison above ignores the scale of the random variables. Suppose we made the same plot, but instead of drawing lines for the $t$ distribution with different degrees of freedom, we drew them for the normal distribution with different standard deviations. This is what we would get:
That seems to give the same trend as the plot before! Can we then conclude that a normal distribution with a larger standard deviation is more heavy-tailed than one with a smaller standard deviation??
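The same computation for normal distributions makes the point concrete. This is a sketch under assumed values: the standard deviations in `sds` and the grid `x` are chosen for illustration only.

```r
x <- seq(0, 10, by = 0.5)   # illustrative grid
sds <- c(1, 2, 5)           # illustrative standard deviations

# One column per SD: log10 of P(X > x) for X ~ N(0, sd^2).
logTailNormal <- sapply(sds, function(s) {
  log10(pnorm(x, mean = 0, sd = s, lower.tail = FALSE))
})
colnames(logTailNormal) <- paste0("sd=", sds)

print(round(logTailNormal[x %in% c(2, 5, 10), ], 3))
```

At any fixed positive `x` the tail probability grows with the standard deviation, even though every one of these distributions is normal. The ordering reflects scale, not tail shape.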
One way to incorporate scale
The discussion above illustrates the need to take scale into account. We tried to do this in the previous post by scaling each distribution by its own SD, but that idea broke down for small degrees of freedom.
Here’s an idea: Pick some threshold $x_0$. For each random variable $T$, find the scale factor $k$ such that $P\{kT > x_0\} = P\{Z > x_0\}$, where $Z \sim \mathcal{N}(0, 1)$. For this value of $k$, $kT$ and $Z$ are on the same scale w.r.t. this threshold. We then compare the tail probabilities of $kT$ and $Z$ (instead of $T$ and $Z$).
Finding $k$ is not hard: here’s a three-line function that does it for the $t$ distribution in R:
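    getScaleFactor <- function(df, threshold) {
      # Tail probability of the standard normal beyond the threshold
      tailProb <- pnorm(threshold, lower.tail = FALSE)
      # Point at which the unscaled t distribution has that same tail probability
      tQuantile <- qt(tailProb, df = df, lower.tail = FALSE)
      # Scaling by this factor moves that point onto the threshold
      return(threshold / tQuantile)
    }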
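A quick sanity check of the function might look like the sketch below. The threshold of 2 and the degrees of freedom are assumptions chosen for illustration, and the function is copied from above so that the snippet runs on its own.

```r
# Copied from the post so this snippet is self-contained.
getScaleFactor <- function(df, threshold) {
  tailProb <- pnorm(threshold, lower.tail = FALSE)
  tQuantile <- qt(tailProb, df = df, lower.tail = FALSE)
  return(threshold / tQuantile)
}

threshold <- 2          # assumed threshold
dfs <- c(1, 2, 5, 30)   # illustrative degrees of freedom
k <- sapply(dfs, getScaleFactor, threshold = threshold)

# P(k * T > threshold) = P(T > threshold / k) should match the normal tail.
scaledTail <- pt(threshold / k, df = dfs, lower.tail = FALSE)
normalTail <- pnorm(threshold, lower.tail = FALSE)

print(data.frame(df = dfs, k = k, scaledTail = scaledTail))
print(all.equal(scaledTail, rep(normalTail, length(dfs))))
```

With this threshold every scale factor is below 1, and it shrinks as the degrees of freedom decrease: heavier-tailed members of the family have to be compressed more to match the normal tail probability at the threshold.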
Let’s plot the log10 of the tail probability $P\{kT > x\}$ with $T \sim t_{\mathrm{df}}$ against $x$ for various values of $x$ and $\mathrm{df}$, with the scale factor $k$ computed as above:
By definition, the tail probabilities will coincide when $x$ is equal to the threshold used to compute the scale factors. We now see a clear trend with no breakdown: for $x$ beyond that threshold, the smaller the value of $\mathrm{df}$, the larger the tail probability $P\{kT > x\}$.
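The scaled comparison can be sketched as follows. The helper `scaledLogTail`, the threshold of 2, and the grid and degrees of freedom are assumptions for illustration. `getScaleFactor` is repeated from above so the snippet is self-contained.

```r
# Copied from the post so this snippet is self-contained.
getScaleFactor <- function(df, threshold) {
  tailProb <- pnorm(threshold, lower.tail = FALSE)
  tQuantile <- qt(tailProb, df = df, lower.tail = FALSE)
  return(threshold / tQuantile)
}

# log10 of P(k * T > x) for T ~ t_df, where k matches the standard normal
# tail probability at the threshold (hypothetical helper).
scaledLogTail <- function(x, df, threshold) {
  k <- getScaleFactor(df, threshold)
  log10(pt(x / k, df = df, lower.tail = FALSE))
}

threshold <- 2              # assumed threshold
x <- seq(2, 10, by = 0.5)   # illustrative grid starting at the threshold
dfs <- c(1, 2, 5, 30)       # illustrative degrees of freedom

scaled <- sapply(dfs, function(df) scaledLogTail(x, df, threshold))
colnames(scaled) <- paste0("df=", dfs)
print(round(scaled[x %in% c(2, 6, 10), ], 3))
```

The first row, at `x` equal to the threshold, is identical across columns by construction. Further out, the columns separate according to tail shape rather than scale.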
Another side benefit of this way of looking at tail probabilities is that we can now compare distributions which have infinite variance, or even an undefined mean (like the Cauchy distribution, which is the $t$ distribution with one degree of freedom)! Here is the same plot as above but for smaller degrees of freedom:
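For the Cauchy case, the scale factor can be computed either through the $t$ distribution with one degree of freedom or directly through R's Cauchy functions. The sketch below assumes a threshold of 2 and a few illustrative `x` values. No SD or mean is needed anywhere.

```r
threshold <- 2   # assumed threshold
tailProb <- pnorm(threshold, lower.tail = FALSE)

# Scale factor via the Cauchy quantile function ...
kCauchy <- threshold / qcauchy(tailProb, lower.tail = FALSE)
# ... and via the t distribution with one degree of freedom.
kT1 <- threshold / qt(tailProb, df = 1, lower.tail = FALSE)
print(all.equal(kCauchy, kT1))

# log10 tail probabilities of the scaled Cauchy at a few points.
x <- c(2, 4, 8)
scaledCauchyTail <- pcauchy(x / kCauchy, lower.tail = FALSE)
print(log10(scaledCauchyTail))
```

The two scale factors agree, and the scaled tail probability at the threshold matches the standard normal one, so the Cauchy distribution can sit on the same plot as the other members of the $t$ family.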