Rubin Model¶

Also called the Potential Outcomes Framework

We estimate the ‘treatment effect’ of \(x\), which is just another name for its causal effect on the outcome

Uses statistical analysis of experiments to model causality

Framework for causal inference that conceptualizes observed data as if they were outcomes of experiments, conducted through

1. actual experiments by researcher(s)
2. observational studies by subjects of the research

Terms¶

Symbol	Term	Meaning
\(x\)	Treatment/Intervention/Mediation	input
\(y\)	Outcome	output
\(x \perp \!\!\! \perp (y^0, y^1)\)	Exchangeability/Exogenous	input is independent of the potential outcomes
\(x \perp \!\!\! \perp (y^0, y^1) \mid z\)	Conditional exchangeability	input is independent of the potential outcomes only within a sub-population defined by covariates \(z\)
	Endogenous	input is dependent on the potential outcomes (self-chosen)
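The difference between an exogenous and an endogenous input can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the trait `ability`, the constant effect of 2, and the function name `simulate` are all hypothetical and not part of the framework itself.

```python
import random

def simulate(n=20000, self_selected=False, seed=0):
    """Return (naive difference in means, true average effect) for one simulated sample."""
    rng = random.Random(seed)
    treated, control, effects = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)          # hypothetical trait that drives the outcome
        y0 = ability + rng.gauss(0, 1)     # potential outcome without treatment
        y1 = y0 + 2.0                      # potential outcome with treatment (assumed effect = 2)
        if self_selected:
            x = 1 if ability > 0 else 0    # endogenous: units choose treatment based on ability
        else:
            x = rng.randint(0, 1)          # exogenous: coin-flip assignment
        # Only the potential outcome matching the received treatment is recorded.
        (treated if x == 1 else control).append(y1 if x == 1 else y0)
        effects.append(y1 - y0)
    naive = sum(treated) / len(treated) - sum(control) / len(control)
    return naive, sum(effects) / len(effects)

naive_random, true_effect = simulate(self_selected=False)
naive_selected, _ = simulate(self_selected=True)
print(f"true effect:            {true_effect:.2f}")
print(f"randomized estimate:    {naive_random:.2f}")
print(f"self-selected estimate: {naive_selected:.2f}")
```

With coin-flip assignment the simple difference in means should land close to the assumed effect, while self-selection on `ability` inflates it, because the treated and untreated groups are no longer exchangeable.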

Ill-Defined Intervention¶

When the treatment is not defined specifically, multiple variations of the treatment exist. Hence, the derived effect will not be meaningful, and may be misleading.
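A rough numerical sketch of why this matters: if one loosely defined treatment hides several versions with different effects, the pooled effect depends on how the versions happen to be mixed. The version names, effect sizes, and shares below are invented purely for illustration.

```python
# Hypothetical effects of three versions of one loosely defined "treatment".
# The numbers are made up purely for illustration.
version_effect = {"version_a": 3.0, "version_b": 0.0, "version_c": -2.0}

def pooled_effect(version_share):
    """Average effect when treated units receive versions in the given proportions."""
    assert abs(sum(version_share.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(share * version_effect[v] for v, share in version_share.items())

# Two studies that label their treatment identically but mix versions differently.
study_1 = {"version_a": 0.8, "version_b": 0.1, "version_c": 0.1}
study_2 = {"version_a": 0.1, "version_b": 0.1, "version_c": 0.8}

effect_1 = pooled_effect(study_1)
effect_2 = pooled_effect(study_2)
print(f"study 1 pooled effect: {effect_1:+.2f}")
print(f"study 2 pooled effect: {effect_2:+.2f}")
```

Under these assumed numbers the two studies report effects of opposite sign for what is nominally the same treatment, so neither number describes "the" effect.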

Effect of democracy on economic growth¶

You need to keep in mind that there are multiple variations of

democracy - parliamentary, presidential, …
country becoming democratic - peaceful transition, civil uprising, revolt, …

In this case, the ‘effect of democracy on economic growth’ will not be meaningful: each of these variations is a different treatment with potentially different outcomes, so no single effect generalizes across them.

Effect of obesity on health¶

What is obesity as a treatment?
How do we intervene on obesity?
Multiple channels to becoming obese or un-obese: (lack of) exercise, (un)healthy diet, surgery, ...
The apparently straightforward comparison of the health outcomes of obese and non-obese individuals masks the true complexity of the interventions “make someone obese” and “make someone non-obese.”

Potential Outcomes¶

Consider an input \(x_i\) which takes binary values \(0/1\). Then, there will be

2 potential outcomes per unit, \(y_i^0\) and \(y_i^1\)
1 potential outcome for each treatment value

Suppose the treatment actually received is \(x = a\), then


\(x = a\)	actual treatment
\(x \ne a\)	counterfactual treatment
\(y^a\)	realized outcome
\(y^{\ne a}\)	counterfactual outcome
\(\{ y_i^a , y_i^{\ne a} \}\)	potential outcomes

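\[ \begin{aligned} y_i &= \sum_{a \in \mathcal{A}} y_i^a \, I(x_i = a) \\ \text{Binary } x \implies y_i &= x_i y_i^1 + (1 - x_i) y_i^0 \end{aligned} \]

where \(\mathcal{A}\) is the set of treatment values and \(I(\cdot)\) is the indicator function.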
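The equation above says that the observed outcome is simply the potential outcome that matches the treatment actually received. A minimal sketch, assuming potential outcomes are stored in a dictionary keyed by treatment value (the function names and the example values 5 and 8 are hypothetical):

```python
def observed_outcome(x, potential):
    """General form: sum over treatment values a of y^a * I(x == a)."""
    return sum(y_a * (1 if x == a else 0) for a, y_a in potential.items())

def observed_outcome_binary(x, y0, y1):
    """Binary special case: y = x * y^1 + (1 - x) * y^0."""
    return x * y1 + (1 - x) * y0

# Hypothetical unit with potential outcomes y^0 = 5 and y^1 = 8.
unit = {0: 5, 1: 8}
for x in (0, 1):
    general = observed_outcome(x, unit)
    binary = observed_outcome_binary(x, unit[0], unit[1])
    print(f"x={x}: observed y = {general} (binary form gives {binary})")
```

Both forms agree for a binary treatment; whichever value of \(x\) is received, the other potential outcome never appears in the data.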

\(x\) has a causal effect on \(y\) \(\iff P(y^0) \ne P(y^1)\), where \(P\) is the probability distribution of the potential outcome. This is because

if \(x\) has no effect, changing it leaves the distribution of the outcome unchanged, so the two distributions will be equal
if \(x\) has an effect, the distribution of the outcome will differ between the two treatment values
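For a binary outcome, the condition can be made concrete with a worked example. The probabilities here are hypothetical, chosen only to illustrate the comparison:

\[ \begin{aligned} P(y^1 = 1) &= 0.30, \quad P(y^0 = 1) = 0.50 \\ P(y^1 = 1) - P(y^0 = 1) &= -0.20 \ne 0 \implies x \text{ has a causal effect on } y \end{aligned} \]

If both probabilities were equal, the difference would be \(0\) and \(x\) would have no causal effect on \(y\) in this sense.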

Shortcomings¶

Since it is more experiment-oriented, continuous treatments are hard to analyze; in practice it is mostly feasible for binary \(0/1\) treatments.
Cannot learn individual treatment effects, since counterfactual outcomes are not observed. This is the fundamental problem of causal inference.
We can only learn causal effects at the sample level.
Therefore, when learning a causal effect, we should always be clear about the sample on which it is defined.
According to Rubin, causal inference is a ‘missing data’ problem, but that’s just like every other statistical predictive model
It does not model choice as an assignment based on a unit’s ability and eligibility for treatment; it models choice as assignment to a treatment
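The ‘missing data’ view of the fundamental problem can be sketched as follows. The sample is simulated and every value is made up; in real data the hidden columns are never available at all, and the helper name `observe` is hypothetical.

```python
import random

rng = random.Random(1)

# Hypothetical sample: each unit has both potential outcomes, but only one is ever observed.
units = []
for i in range(6):
    y0 = rng.randint(0, 10)            # outcome if untreated
    y1 = y0 + rng.randint(-2, 4)       # outcome if treated (made-up individual effect)
    x = rng.randint(0, 1)              # randomized binary treatment
    units.append({"unit": i, "x": x, "y0": y0, "y1": y1})

def observe(unit):
    """Hide the counterfactual outcome, as real data would."""
    return {
        "unit": unit["unit"],
        "x": unit["x"],
        "y0": unit["y0"] if unit["x"] == 0 else None,  # missing when treated
        "y1": unit["y1"] if unit["x"] == 1 else None,  # missing when untreated
    }

observed = [observe(u) for u in units]
for row in observed:
    print(row)

# No row contains both y0 and y1, so y1 - y0 cannot be computed for any single unit.
computable = [r for r in observed if r["y0"] is not None and r["y1"] is not None]
print("units with a computable individual effect:", len(computable))
```

Every row has exactly one missing potential outcome, which is why individual effects cannot be read off the data and only aggregate comparisons across units are possible.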

Last Updated: 2024-12-26 ; Contributors: AhmedThahir, web-flow