
Treatment Effect

Effects

Let \(\tau = y^1 - y^0\).

\(\tau\) is a random variable, so it has a distribution across individuals (\(\tau_1, \tau_2, \dots, \tau_n\))

| Term | | Meaning |
|---|---|---|
| ITE | \(\tau_i\) | Individual Treatment Effect. It is never observed, because we only observe \(y_i^0\) or \(y_i^1\) |
| ATE | \(E[\tau]\) | Average Treatment Effect |
| CATE | \(E[\tau \vert Z=z]\) | Conditional ATE |
| LATE | \(E[\tau \vert Z \in z]\) | Local ATE |
| ATT | \(E[\tau \vert x = 1]\) | Average Treatment effect on Treated |
| ATU | \(E[\tau \vert x = 0]\) | Average Treatment effect on Untreated |
| | \(E(y^1)\) | Expectation of outcome 1 of the entire population (hypothetical, counterfactual) |
| | \(E(y^1 \vert x = 1)\) | Expectation of outcome 1 of the treated sample |
| ITT | \(E[y \vert \text{Treatment group}] - E[y \vert \text{Control group}]\) \(= \pi_C \text{CACE} + \pi_A \text{ATACE} + \pi_N \text{NTACE} + \pi_D \text{DACE}\) \(=\pi_C \text{CACE} + \pi_A 0 + \pi_N 0 + 0 \text{DACE}\) \(=\pi_C \text{CACE}\) | Intent to treat. Effect of assignment to treatment group (not actual treatment) |
| CACE | \(E[\tau \vert \text{Complier}]\) \(= \dfrac{\text{ITT}}{\pi_C}\) | Complier Average Causal Effect. LATE for Compliers |

$$
\begin{aligned}
\pi_C + \pi_A &= p(T=1 \vert \text{Treatment group}) \\
\pi_A + \pi_D &= p(T=1 \vert \text{Control group}) \\
\pi_N + \pi_D &= p(T=0 \vert \text{Treatment group}) \\
\pi_N + \pi_C &= p(T=0 \vert \text{Control group})
\end{aligned}
$$

where \(\pi_C, \pi_A, \pi_N, \pi_D\) are the population shares of compliers, always-takers, never-takers and defiers. Usually, we assume that \(\pi_D=0\)
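The relation \(\text{CACE} = \text{ITT} / \pi_C\) can be checked on simulated data. The following is a minimal sketch, assuming no defiers (\(\pi_D = 0\)) and no effect of assignment on always-takers and never-takers. The strata shares, the effect size of 2.0, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical strata: compliers (C), always-takers (A), never-takers (N); no defiers
strata = rng.choice(["C", "A", "N"], size=n, p=[0.6, 0.1, 0.3])
z = rng.integers(0, 2, size=n)  # random assignment: 1 = treatment group

# Treatment actually received depends on the stratum and on the assignment
t = np.where(strata == "A", 1, np.where(strata == "N", 0, z))

# Potential outcomes with a constant effect of 2.0 (illustrative value)
y0 = rng.normal(0.0, 1.0, size=n)
y1 = y0 + 2.0
y = np.where(t == 1, y1, y0)

# ITT: effect of assignment, not of the treatment actually received
itt = y[z == 1].mean() - y[z == 0].mean()

# With pi_D = 0: pi_C = p(T=1 | treatment group) - p(T=1 | control group)
pi_c = t[z == 1].mean() - t[z == 0].mean()

cace = itt / pi_c
print(f"ITT={itt:.3f}  pi_C={pi_c:.3f}  CACE={cace:.3f}")
```

In this setup the ITT is diluted by the non-compliers, while dividing by the estimated complier share should recover a value close to the simulated effect.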

Why are there 3 different averages (ATE, ATT, ATU)? The people in each group are different. So \(\tau\) for the entire population, the treated group, and the untreated group differ, due to the ‘selection effect’. For example, people who go to university differ from those who don’t.

$$
\begin{aligned}
y &= y^0 \cdot I(x=0) + y^1 \cdot I(x=1) \\
&\text{where \(I\) is the indicator function} \\
\implies y &= f(x, y^0, y^1)
\end{aligned}
$$
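A small simulation can illustrate how a selection effect separates ATE, ATT and ATU. This is only a sketch: the selection rule (individuals with a larger gain are more likely to take the treatment) and all distribution parameters are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

y0 = rng.normal(0.0, 1.0, size=n)
tau = rng.normal(1.0, 1.0, size=n)  # individual treatment effects
y1 = y0 + tau

# Assumed selection rule: a larger individual gain makes treatment more likely
x = (tau + rng.normal(0.0, 1.0, size=n) > 1.0).astype(int)

# Observed outcome: y = y0 * I(x=0) + y1 * I(x=1)
y = y0 * (x == 0) + y1 * (x == 1)

ate = tau.mean()
att = tau[x == 1].mean()
atu = tau[x == 0].mean()
print(f"ATE={ate:.3f}  ATT={att:.3f}  ATU={atu:.3f}")
```

Because treated individuals were selected on their gain, ATT should exceed ATE, which in turn should exceed ATU.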

ATE

  • ATE = difference of the means = mean of the differences $$ \begin{aligned} \text{ATE} =& E[y^1] - E[y^0] \\ =& E[y^1 - y^0] \end{aligned} $$
  • ATE = weighted average of ATT and ATU $$ \begin{aligned} \text{ATE} =& \text{ATT} \cdot P(x=1) + \text{ATU} \cdot P(x=0) \\ =& E(y^1 - y^0 \vert x = 1) \cdot P(x=1) \\ &+ E(y^1 - y^0 \vert x = 0) \cdot P(x=0) \\ =& \int E[y^1 - y^0 \vert s] \cdot p(s) \cdot ds \end{aligned} $$ This is the law of total expectation, the expectation analogue of the law of total probability. We take the expectation \(E\) (mean) because \(\tau\) is a random variable, and its mean is a more stable summary than any single value drawn from its PDF
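  • ATE = weighted average of CATE, given unconfoundedness: provided that treatment is as good as randomly assigned within each group \(z_i\) $$ \widehat{\text{ATE}} = \sum_i \widehat{\text{CATE}}_i \cdot P(Z=z_i) $$
  • Observed difference in means \(E[y \vert x=1] - E[y \vert x=0]\) = ATT + Selection Bias, where Selection Bias \(= E[y^0 \vert x=1] - E[y^0 \vert x=0]\)
  • Randomization makes selection bias 0, so the observed difference in means then equals ATT = ATE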
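The weighted-average-of-CATE estimator can be sketched as a stratified difference in means. The example below assumes a single discrete covariate \(Z\) with three levels and treatment that is random within each level. The group effects, treatment probabilities and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

z = rng.integers(0, 3, size=n)          # hypothetical discrete covariate
p_treat = np.array([0.2, 0.5, 0.8])[z]  # treatment probability depends on z
x = (rng.random(n) < p_treat).astype(int)

true_cate = np.array([1.0, 2.0, 3.0])   # illustrative effect per group
y = 0.5 * z + true_cate[z] * x + rng.normal(0.0, 1.0, size=n)

# ATE_hat = sum_i CATE_hat_i * P(Z = z_i)
ate_hat = 0.0
for k in np.unique(z):
    in_k = z == k
    cate_k = y[in_k & (x == 1)].mean() - y[in_k & (x == 0)].mean()
    ate_hat += cate_k * in_k.mean()

# Unadjusted comparison of treated vs untreated, for contrast
naive = y[x == 1].mean() - y[x == 0].mean()
print(f"stratified ATE={ate_hat:.3f}  naive difference={naive:.3f}")
```

With equal group shares the simulated ATE is 2.0. The naive difference should overshoot it, because treated units come disproportionately from the group with the higher baseline and the larger effect.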

ATE = Causal Effect

$$
\begin{aligned}
E(y \vert \text{do}(x), s) &= E(y \vert x, s) \\
\implies \widehat{\text{ATE}}(x, s) &= \dfrac{d}{dx} \hat E[y \vert \text{do}(x)] \\
&= \dfrac{d}{dx} \hat E[y \vert \text{do}(x), s] \\
&= \dfrac{d}{dx} E_s \Big[ \hat E[y \vert \text{do}(x), s] \Big] \\
&= \dfrac{d}{dx} E_s \Big[ \hat E[y \vert x, s] \Big] \\
&= E_s \left[ \dfrac{\partial}{\partial x} \hat E[y \vert x, s] \right]
\end{aligned}
$$

HTE = Heterogeneous Treatment Effect = ATE with high dimensional \(s\)

If \(x\) is binary

$$
\begin{aligned}
x &\in \{ 0, 1 \} \\
\implies \text{ATE}(x, s) &= E[ y \vert \text{do}(x=1) ] - E[ y \vert \text{do}(x=0) ] \\
&= E_s \Bigg[ E[y \vert x=1, s ] - E[ y \vert x=0, s] \Bigg]
\end{aligned}
$$

| Treatment | Model | \(\widehat{\text{ATE}}(x)\) |
|---|---|---|
| Binary | Linear: \(\hat \beta_0 + \hat \beta_1 x + \hat \beta_2 s\) | Constant: \(\hat \beta_1\) |
| Multi-Level/Continuous | Non-linear | Functional |
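For the binary-treatment, linear-model case, \(\hat \beta_1\) can be read off an ordinary least squares fit. This is a minimal sketch using NumPy's `lstsq`. The data-generating process (one confounder \(s\), true effect 2.0) is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

s = rng.normal(0.0, 1.0, size=n)                          # confounder
x = (s + rng.normal(0.0, 1.0, size=n) > 0).astype(float)  # binary treatment, depends on s
y = 1.0 + 2.0 * x + 3.0 * s + rng.normal(0.0, 1.0, size=n)

# Fit y = b0 + b1 * x + b2 * s by least squares
design = np.column_stack([np.ones(n), x, s])
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

ate_hat = beta_hat[1]  # constant ATE under the linear model
print(f"beta_hat={np.round(beta_hat, 3)}  ATE_hat={ate_hat:.3f}")
```

The estimate is only as good as the linear specification: if the effect varies with \(s\), a single coefficient no longer describes it.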

IDK

| | \(E[y^1 - y^0]\) | \(E[y \vert x=1] - E[y \vert x=0]\) |
|---|---|---|
| Compares what would happen if the __ sample receives treatment \(x=1\) vs \(x=0\) | same | 2 different |
| Provides | Average causal effect | Average difference in outcome between sub-populations defined by treatment group |

IDK

$$
\begin{aligned}
\text{ATT} &= E(y^1 - y^0 \vert x = 1) \\
&= \underbrace{E(y^1 \vert x = 1)}_{E(y \vert x = 1)} - \underbrace{E(y^0 \vert x = 1)}_{\text{Cannot be estimated}} \\
\text{ATU} &= E(y^1 - y^0 \vert x = 0) \\
&= \underbrace{E(y^1 \vert x = 0)}_{\text{Cannot be estimated}} - \underbrace{E(y^0 \vert x = 0)}_{E(y \vert x = 0)}
\end{aligned}
$$

Solution: Randomized Treatment
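To see why randomization helps, the observed difference in means can be decomposed as follows. This is a standard identity, written here in the notation of this page:

$$
\begin{aligned}
E[y \vert x=1] - E[y \vert x=0] &= E[y^1 \vert x=1] - E[y^0 \vert x=0] \\
&= \underbrace{E[y^1 - y^0 \vert x=1]}_{\text{ATT}} + \underbrace{E[y^0 \vert x=1] - E[y^0 \vert x=0]}_{\text{Selection bias}}
\end{aligned}
$$

Under randomized treatment, \((y^0, y^1)\) is independent of \(x\), so \(E[y^0 \vert x=1] = E[y^0 \vert x=0]\) and \(\text{ATT} = \text{ATE}\). The selection bias term vanishes and the observed difference in means estimates the ATE.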

PDF Graph

[Figure: potential outcomes distribution]

$$
\begin{aligned}
\text{ATE}(x) &= \int \tau(x, y) P(y) dy \\
&= \frac{ dE[y | \text{do}(x)] }{dx} \\
\tau(x, y) &= \frac{ \partial P(y | \text{do}(x)) }{ \partial x }
\end{aligned}
$$

We could also interpret this entire distribution as a 3 variable joint PDF of the form \(P(x, y^0, y^1)\)

IDK

| Treatment \(x\) | Observed Outcome \(y\) | Potential Outcome \(y^0\) | Potential Outcome \(y^1\) | ITE |
|---|---|---|---|---|
| 0 | -0.34 | -0.34 | 3.46 | 3.8 |
| 0 | 1.67 | 1.67 | 4.03 | 2.36 |
| 0 | -0.77 | -0.77 | 3.08 | 3.85 |
| 0 | 2.64 | 2.64 | 0.90 | -1.74 |
| 0 | -0.02 | -0.02 | 0.96 | 0.98 |
| 1 | 2.31 | -1.52 | 2.31 | 3.83 |
| 1 | 2.79 | 1.05 | 2.79 | 1.74 |
| 1 | 1.53 | -0.13 | 1.53 | 1.65 |
| 1 | 3.61 | -1.41 | 3.61 | 5.02 |
| 1 | 3.36 | 0.60 | 3.36 | 2.76 |

Here,

  • Modelling the ITE (\(y^1 - y^0\)) is correct
  • Modelling \(y\) vs \(x\) (comparing observed outcomes across treatment groups) is incorrect
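The difference between the two approaches can be computed directly from the table above. The sketch below copies the \(x\), \(y^0\) and \(y^1\) columns and recomputes the ITE as \(y^1 - y^0\), so one value differs from the printed ITE column in the last digit because of rounding. Variable names are my own.

```python
import numpy as np

# Columns copied from the table: treatment x and potential outcomes y0, y1
x = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
y0 = np.array([-0.34, 1.67, -0.77, 2.64, -0.02, -1.52, 1.05, -0.13, -1.41, 0.60])
y1 = np.array([3.46, 4.03, 3.08, 0.90, 0.96, 2.31, 2.79, 1.53, 3.61, 3.36])

y = np.where(x == 1, y1, y0)  # observed outcome
ite = y1 - y0                 # computable only because both potential outcomes are listed

ate = ite.mean()                                         # modelling the ITE
naive = y[x == 1].mean() - y[x == 0].mean()              # modelling y vs x
att = ite[x == 1].mean()
selection_bias = y0[x == 1].mean() - y0[x == 0].mean()

print(f"ATE (mean of ITE)        = {ate:.3f}")
print(f"E[y|x=1] - E[y|x=0]      = {naive:.3f}")
print(f"ATT + selection bias     = {att + selection_bias:.3f}")
```

On these ten rows the mean ITE is about 2.43, while the difference in observed group means is about 2.08. The gap is exactly what the ATT plus selection bias decomposition accounts for.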

Probabilistic

Causal effect of a treatment is a probability distribution: it is not the same for every individual.

  • Learning the individual-level effect is nearly impossible
  • Learning the pdf of the effect is hard

Usually, the Average Treatment Effect is used

However, it is better to compare the outcome distributions of the treatment and control groups
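One simple way to compare the two distributions is quantile by quantile. The sketch below assumes a randomized experiment in which half the units gain +4 and the other half lose 2, so the ATE is +1. These numbers and the chosen quantiles are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

# Heterogeneous effect (illustrative): +4 for half the units, -2 for the rest
y0 = rng.normal(0.0, 1.0, size=n)
tau = np.where(rng.random(n) < 0.5, 4.0, -2.0)
x = rng.integers(0, 2, size=n)  # randomized treatment
y = np.where(x == 1, y0 + tau, y0)

treated, control = y[x == 1], y[x == 0]
mean_diff = treated.mean() - control.mean()

# Difference between the outcome quantiles of the two groups
qs = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
quantile_diff = np.quantile(treated, qs) - np.quantile(control, qs)

print(f"difference in means: {mean_diff:.2f}")
for q, d in zip(qs, quantile_diff):
    print(f"quantile {q:.2f}: difference {d:+.2f}")
```

The mean difference alone suggests a modest benefit, while the quantile differences show the lower tail getting worse and the upper tail getting better. Note that these differences describe how the outcome distribution shifts. They are not, in general, the quantiles of \(\tau\) itself.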

Heterogeneous Treatment Effects

If we consider effect modifier \(s\)

$$
\begin{aligned}
\text{ATE} &= \dfrac{\partial}{\partial x} E[ y \vert \text{do}(x), s] \\
&= \int \limits_s E[y^1 - y^0 \vert s] \cdot P(s) \cdot ds
\end{aligned}
$$

where \(P(s)\) is the distribution of effect modifier.

Then, the result of the randomized test actually gives us \(E[\tilde \tau]\), the average effect under the distribution of \(s\) in the test sample, which may not be equal to \(E[\tau]\)

Example

For example, the yield depends on the season and on the crop that was grown the previous year. Let’s take the example of a randomized test of a fertilizer used in the summer.

If no crop was grown the previous year, then we

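| Crop Grown in Field Last Year | Result of Randomized Test \(E[\tilde \tau]\) |
|---|---|
| No crop | \(E[\tau \vert \text{No crop}]\) |
| Rice only | \(E[\tau \vert \text{Rice}]\) |
| 50% Barley, 50% Rice | \(0.5 \cdot E[\tau \vert \text{Barley}] + 0.5 \cdot E[\tau \vert \text{Rice}]\) |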
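The dependence of \(E[\tilde \tau]\) on the mix of previous crops can be sketched as a weighted sum of conditional effects, following the integral over \(P(s)\) given earlier. The conditional effect values below are made up purely for illustration, and `expected_effect` is a hypothetical helper.

```python
# Hypothetical conditional effects E[tau | previous crop]; values are made up
cate = {"none": 1.0, "rice": 3.0, "barley": 2.0}


def expected_effect(mix):
    """Return sum_s E[tau | s] * P(s) for a given mix of previous crops."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("crop proportions must sum to 1")
    return sum(cate[crop] * share for crop, share in mix.items())


rice_only = expected_effect({"rice": 1.0})
half_half = expected_effect({"barley": 0.5, "rice": 0.5})

print(f"rice only:            {rice_only}")
print(f"50% barley, 50% rice: {half_half}")
```

Two randomized tests of the same fertilizer can therefore report different average effects simply because their fields had a different crop history.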
Last Updated: 2024-12-26 ; Contributors: AhmedThahir, web-flow
