# What is principal stratification in causal inference?

Principal stratification was proposed in Frangakis & Rubin (2002) (Reference 1) as a way to make sense of treatment effects that are adjusted for post-treatment variables. The exposition below largely follows that in their paper.

## Background

Imagine that we are in the potential outcomes framework for causal inference (the notation here follows that of Reference 1):

• We index individuals by $i$.
• Each individual is either in control ($Z_i = 1$) or treatment ($Z_i = 2$).
• We have some outcome/response metric of interest. If the individual is in control (respectively, treatment), the response is $Y_i(1)$ (respectively, $Y_i(2)$). We only get to observe one of them, which we denote by $Y_i^{obs} = Y_i(Z_i)$.

We are interested in a comparison between the two sets $\{ Y_i(1): i \in \text{set}_1\}$ and $\{ Y_i(2): i \in \text{set}_2\}$, where $\text{set}_1$ and $\text{set}_2$ are identical. We call such comparisons causal effects. For example, the average treatment effect is a comparison of the means of these two sets where $\text{set}_1$ and $\text{set}_2$ are both the entire population, and so it is a causal effect.
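As a minimal sketch of these definitions, the snippet below builds a toy population in which both potential outcomes are known for every unit (something we never have in practice) and computes the average treatment effect over the full population. The values and the names `units`, `average_treatment_effect` and `observed_outcome` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy population: each unit carries both potential outcomes.
units = [
    {"y1": 3.0, "y2": 5.0},
    {"y1": 4.0, "y2": 4.0},
    {"y1": 2.0, "y2": 6.0},
    {"y1": 5.0, "y2": 7.0},
]


def average_treatment_effect(units):
    """Mean of Y_i(2) minus mean of Y_i(1), both taken over the same set of units."""
    mean_y1 = sum(u["y1"] for u in units) / len(units)
    mean_y2 = sum(u["y2"] for u in units) / len(units)
    return mean_y2 - mean_y1


def observed_outcome(unit, z):
    """Y_i^obs = Y_i(Z_i), with z = 1 for control and z = 2 for treatment."""
    return unit["y1"] if z == 1 else unit["y2"]


ate = average_treatment_effect(units)
print(ate)
```

The key point is that both means run over the same list of units, which is what makes the comparison a causal effect in the sense defined above.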

## Adjusting for post-treatment variables: An incorrect approach

In addition to the above, assume that after each unit is assigned to one of the treatment arms, we also measure a post-treatment variable $S_i^{obs}$. For simplicity, assume that $S_i^{obs}$ is binary, taking on the value 1 or 2. (The framework can be extended easily to multi-category or continuous post-treatment variables.) $S_i^{obs}$ typically encodes characteristics of the unit and of the treatment, so we may want to “adjust” the causal effect to take this into account.

One possible approach (suggested and used fairly widely in the 1990s, sometimes known as “net treatment effects”) is to make a comparison between the distributions $\text{pr} \left\{ Y_i^{obs} \mid S_i^{obs} = s, Z_i = 1 \right\}$ and $\text{pr} \left\{ Y_i^{obs} \mid S_i^{obs} = s, Z_i = 2 \right\}$.

However, Reference 1 notes that such comparisons cannot be considered causal effects at all! The key to understanding this is to recognize that because $S_i^{obs}$ is a post-treatment variable, it could be affected by the treatment. Thus, we should recognize that there are potential outcomes for $S_i$ as well ($S_i(1)$ and $S_i(2)$), and that $S_i^{obs} = S_i(Z_i)$.

With this language, consider the special case where treatment assignment is completely randomized, i.e. $\text{pr} (Z_i = 1 \mid S_i(1), S_i(2), Y_i(1), Y_i(2)) = p$ for some $0 < p < 1$. Under this assignment, the comparison above is equivalent to the comparison between $\text{pr} \left\{ Y_i(1) \mid S_i(1) = s \right\}$ and $\text{pr} \left\{ Y_i(2) \mid S_i(2) = s \right\}$.
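To spell out why the two comparisons coincide, here is a short derivation for a generic arm $z \in \{1, 2\}$. The first step uses only the definitions $Y_i^{obs} = Y_i(Z_i)$ and $S_i^{obs} = S_i(Z_i)$; the second uses the fact that, under complete randomization, $Z_i$ is independent of the potential outcomes:

$$
\begin{aligned}
\text{pr} \left\{ Y_i^{obs} \mid S_i^{obs} = s, Z_i = z \right\}
&= \text{pr} \left\{ Y_i(z) \mid S_i(z) = s, Z_i = z \right\} \\
&= \text{pr} \left\{ Y_i(z) \mid S_i(z) = s \right\}.
\end{aligned}
$$

Setting $z = 1$ and $z = 2$ gives the two distributions being compared.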

In general, the groups $\{i: S_i(1) = s\}$ and $\{i: S_i(2) = s\}$ are not the same, so the comparison is not a causal effect!
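A small numerical sketch may help here. In the hypothetical population below, the treatment changes $S$ for some units but changes $Y$ for no one, so every unit-level effect is zero. The comparison conditional on $S = 2$ is still non-zero, simply because the two conditioning groups contain different units. All values and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical units: (S_i(1), S_i(2), Y_i(1), Y_i(2)).
# The treatment changes S for some units but changes Y for no one.
units = [
    (1, 1, 10.0, 10.0),
    (1, 2, 20.0, 20.0),
    (2, 2, 30.0, 30.0),
    (1, 2, 10.0, 10.0),
]


def mean(values):
    return sum(values) / len(values)


def net_comparison(units, s):
    """Mean of Y(2) over {i: S_i(2) = s} minus mean of Y(1) over {i: S_i(1) = s}."""
    control_side = [y1 for s1, _, y1, _ in units if s1 == s]
    treated_side = [y2 for _, s2, _, y2 in units if s2 == s]
    return mean(treated_side) - mean(control_side)


def principal_effects(units):
    """Mean of Y(2) - Y(1) within each group of units sharing (S(1), S(2))."""
    strata = defaultdict(list)
    for s1, s2, y1, y2 in units:
        strata[(s1, s2)].append(y2 - y1)
    return {stratum: mean(diffs) for stratum, diffs in strata.items()}


net_s2 = net_comparison(units, s=2)   # compares 3 units against 1 unit
effects = principal_effects(units)    # compares each group with itself
print(net_s2, effects)
```

Here the group $\{i: S_i(1) = 2\}$ has one unit while $\{i: S_i(2) = 2\}$ has three, so the comparison mixes a difference in group composition into what looks like a treatment effect. Grouping by the full pair $(S_i(1), S_i(2))$ instead gives zero in every group, as it should.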

## Adjusting for post-treatment variables: Principal stratification

Frangakis & Rubin propose that we should adjust for post-treatment variables by thinking of treatment effects within each principal stratum. Individuals are placed into principal strata based on their vector of potential outcomes for $S$. Here is a formal definition:

Definition (Principal stratification). The basic principal stratification $P_0$ w.r.t. post-treatment variable $S$ is the partition of units such that, within any set of $P_0$, all units have the same vector $(S_i(1), S_i(2))$.

A principal stratification $P$ w.r.t. $S$ is a partition of units whose sets are unions of sets in the basic principal stratification $P_0$.

This allows us to define treatment effects that are always causal effects (as compared to the net treatment effects from the previous section):

Definition (Principal effects). Let $P$ be a principal stratification w.r.t. $S$ and let $S_i^P$ be the stratum of $P$ that unit $i$ belongs to. For a given principal stratum $\zeta \in P$, a principal effect for $\zeta$ is defined as a comparison between the sets $\left\{ Y_i(1): S_i^P = \zeta \right\}$ and $\left\{ Y_i(2): S_i^P = \zeta \right\}$.

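Theorem. Principal effects are always causal effects.

The reason is that a unit's stratum is determined by the pair $(S_i(1), S_i(2))$, which is not affected by the treatment assignment. The two sets being compared are therefore indexed by exactly the same units.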
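The two definitions above can be sketched in code as follows. The example assumes we know both potential values of $S$ and $Y$ for every unit, which is only possible in a toy setting. It computes a principal effect for one basic stratum and for one coarser stratum that is a union of basic strata. The unit values and the labels "S affected" and "S unaffected" are hypothetical.

```python
# Hypothetical units: id -> (S_i(1), S_i(2), Y_i(1), Y_i(2)).
units = {
    "a": (1, 1, 4.0, 5.0),
    "b": (1, 2, 2.0, 6.0),
    "c": (2, 2, 7.0, 7.0),
    "d": (1, 2, 4.0, 6.0),
    "e": (2, 1, 3.0, 0.0),
}


def basic_stratum(unit):
    """Label in the basic principal stratification P_0: the vector (S_i(1), S_i(2))."""
    s1, s2, _, _ = unit
    return (s1, s2)


def coarse_stratum(unit):
    """Label in a coarser principal stratification: unions of basic strata."""
    s1, s2, _, _ = unit
    return "S unaffected" if s1 == s2 else "S affected"


def principal_effect(units, label_of, stratum):
    """Mean of Y(2) minus mean of Y(1) over the units in a single stratum."""
    members = [u for u in units.values() if label_of(u) == stratum]
    mean_y1 = sum(u[2] for u in members) / len(members)
    mean_y2 = sum(u[3] for u in members) / len(members)
    return mean_y2 - mean_y1


effect_12 = principal_effect(units, basic_stratum, (1, 2))
effect_affected = principal_effect(units, coarse_stratum, "S affected")
print(effect_12, effect_affected)
```

In both calls the same `members` list feeds the $Y(1)$ mean and the $Y(2)$ mean, which mirrors why principal effects are causal effects.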

The main benefit of principal stratification is that it gives us conceptual clarity about the estimand: what is it that we are actually trying to estimate? If the quantity we are estimating does not make sense, then strategies for estimating it are misguided.

Principal stratification gives us quantities that we can claim to be causal effects. Unfortunately, in general we do not know which principal stratum a unit belongs to, since we can only observe one of the potential outcomes for $S_i$. One option is to treat stratum membership as a missing data problem and use techniques from the missing data literature to help us out.
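To make the missing-data view concrete, the sketch below lists which basic principal strata are consistent with what we observe for a single unit, namely its assignment $Z_i$ and $S_i^{obs} = S_i(Z_i)$. The function name `compatible_strata` is hypothetical; it only enumerates possibilities and does not estimate anything.

```python
from itertools import product


def compatible_strata(z, s_obs, levels=(1, 2)):
    """Basic strata (S(1), S(2)) consistent with observing S^obs = s_obs under Z = z."""
    return [
        (s1, s2)
        for s1, s2 in product(levels, repeat=2)
        # Only the component of (S(1), S(2)) selected by z is observed.
        if (s1 if z == 1 else s2) == s_obs
    ]


# A control unit with S^obs = 1 could belong to stratum (1, 1) or (1, 2).
candidates = compatible_strata(z=1, s_obs=1)
print(candidates)
```

With a binary $S$, each observed pair $(Z_i, S_i^{obs})$ leaves two candidate strata, so stratum membership is never fully observed without further assumptions.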

## Connection with non-compliance

Imagine that $S_i$ is the post-treatment variable indicating whether unit $i$ actually took the treatment (1 if they did not take the treatment, 2 if they did). We can then divide the population into 4 principal strata according to $(S_i(1), S_i(2))$:

• $(1, 1)$: never-takers, who do not take the treatment under either assignment.
• $(1, 2)$: compliers, who take the treatment only if assigned to it.
• $(2, 1)$: defiers, who take the treatment only if assigned to control.
• $(2, 2)$: always-takers, who take the treatment under either assignment.

Principal effects are defined for each of the 4 groups. You may recognize the treatment effect for the compliers as the complier average causal effect (CACE), which is the typical estimand in instrumental variables studies. Hence, the CACE is a principal effect, and any techniques we use to estimate principal effects can be used to estimate the CACE.
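As a rough illustration of estimating the CACE, the simulation below uses the standard instrumental variables (Wald) ratio: the effect of assignment on $Y$ divided by the effect of assignment on $S$. It relies on assumptions not discussed in this post: assignment is randomized, there are no defiers, and assignment affects $Y$ only through taking the treatment. The stratum proportions, the complier effect of 2.0, and all names are hypothetical.

```python
import random

random.seed(0)

STRATA = {"never-taker": (1, 1), "complier": (1, 2), "always-taker": (2, 2)}


def simulate_unit():
    """Draw one hypothetical unit and return (Z, S^obs, Y^obs)."""
    stratum = random.choices(list(STRATA), weights=[0.3, 0.5, 0.2])[0]
    s1, s2 = STRATA[stratum]
    y1 = random.gauss(0.0, 1.0)
    # Assumed exclusion restriction: only compliers' outcomes respond to assignment.
    y2 = y1 + (2.0 if stratum == "complier" else 0.0)
    z = random.choice([1, 2])  # completely randomized assignment
    return z, (s1 if z == 1 else s2), (y1 if z == 1 else y2)


data = [simulate_unit() for _ in range(50_000)]


def arm_mean(data, z, column):
    """Mean of one observed column among units assigned to arm z."""
    values = [row[column] for row in data if row[0] == z]
    return sum(values) / len(values)


itt_y = arm_mean(data, 2, 2) - arm_mean(data, 1, 2)  # effect of assignment on Y
itt_s = arm_mean(data, 2, 1) - arm_mean(data, 1, 1)  # estimates the share of compliers
cace_hat = itt_y / itt_s
print(round(cace_hat, 2))
```

Because $S$ is coded 1/2, the difference in mean $S^{obs}$ between arms estimates the proportion of compliers under the no-defiers assumption, so the ratio should land near the simulated complier effect of 2.0.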

References:

1. Frangakis, C. E., and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics, 58(1), 21-29.