Comparing Samples¶
\(t\) Distribution DOF¶
Comparing Means¶
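By the central-limit theorem, the sampling distribution of the mean is approximately normal for large samples; for small samples where the standard deviation is estimated from the data, it is t-distributed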
\(\hat \sigma_1 \ne \hat \sigma_2\)¶
Given
- \((\bar x_1, s_1)\)
- \((\bar x_2, s_2)\)
Simplification: Are \(\mu_1\) and \(\mu_2\) statistically different? \(\implies\) test whether \((\mu_1 - \mu_2) = 0\), using \((\hat \mu_1 - \hat \mu_2)\) as the estimate
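The notes do not spell out the test statistic for the unequal-variance case. A minimal sketch, assuming the standard Welch form \(t = (\bar x_1 - \bar x_2)/\sqrt{s_1^2/n_1 + s_2^2/n_2}\) with Welch-Satterthwaite degrees of freedom, could look like the following. The function name `welch_t` and the sample sizes `n_1`, `n_2` are illustrative assumptions, not part of the original notes.

```python
import math

def welch_t(mean_1, s_1, n_1, mean_2, s_2, n_2):
    """Welch's t statistic and approximate degrees of freedom.

    Hypothetical helper: takes the summary statistics (mean, sample
    standard deviation, sample size) of two independent samples.
    """
    # Squared standard error of each sample mean
    se2_1 = s_1 ** 2 / n_1
    se2_2 = s_2 ** 2 / n_2
    # Test statistic under the null hypothesis mu_1 - mu_2 = 0
    t = (mean_1 - mean_2) / math.sqrt(se2_1 + se2_2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    dof = (se2_1 + se2_2) ** 2 / (
        se2_1 ** 2 / (n_1 - 1) + se2_2 ** 2 / (n_2 - 1)
    )
    return t, dof

t, dof = welch_t(3.0, math.sqrt(2.5), 5, 6.0, math.sqrt(10.0), 5)
print(t, dof)
```

The resulting `dof` is generally not an integer; it is used as the degrees of freedom of the reference \(t\) distribution.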
\(\hat \sigma_1 = \hat \sigma_2\)¶
Pooled samples: If we are confident that the population variances are the same, we can pool all the data to make a single estimate of the population variance
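As a sketch of the pooling idea, assuming the usual pooled-variance estimate \(s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}\) (a standard formula that the notes do not state explicitly), the statistic can be computed from summary statistics. The name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean_1, s_1, n_1, mean_2, s_2, n_2):
    """Equal-variance two-sample t statistic and degrees of freedom.

    Hypothetical helper: assumes both populations share one variance.
    """
    dof = n_1 + n_2 - 2
    # Pooled variance: each sample variance weighted by its own dof
    s2_pooled = ((n_1 - 1) * s_1 ** 2 + (n_2 - 1) * s_2 ** 2) / dof
    # Standard error of the difference between the two sample means
    se_diff = math.sqrt(s2_pooled * (1 / n_1 + 1 / n_2))
    t = (mean_1 - mean_2) / se_diff
    return t, dof

t, dof = pooled_t(3.0, math.sqrt(2.5), 5, 5.0, math.sqrt(2.5), 5)
print(t, dof)
```

Here the degrees of freedom are the integer \(n_1 + n_2 - 2\), since a single variance is estimated from both samples together.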
Pairing¶
Matched Samples
Compare samples before and after treatment
$$ d_i = y_{i, T=1} - y_{i, T=0} $$
where \(T=\) treatment variable
$$ \begin{aligned} z &= \dfrac{ \bar d - \hat \mu_d }{ s_d/\sqrt{n} } \end{aligned} $$
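The paired statistic above can be sketched in a few lines. The function name `paired_statistic` and the default hypothesised mean difference `mu_d=0.0` are assumptions for illustration; the data in the usage lines are made up.

```python
import math
import statistics

def paired_statistic(before, after, mu_d=0.0):
    """Statistic for matched samples: (d_bar - mu_d) / (s_d / sqrt(n)).

    Hypothetical helper: before[i] and after[i] must belong to the
    same subject i.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Per-subject differences d_i = y_{i, T=1} - y_{i, T=0}
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (n - 1 divisor)
    return (d_bar - mu_d) / (s_d / math.sqrt(n))

z = paired_statistic([10, 12, 9, 11], [12, 15, 10, 14])
print(z)
```

For small \(n\), the result would normally be compared against a \(t\) distribution with \(n-1\) degrees of freedom rather than the standard normal.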
Inference¶
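- If \(z\) or \(t\) falls within the 95% two-sided interval centered around 0 (for example, \(|z| < 1.96\)), we fail to reject the null hypothesis: there is no evidence that the two series differ
- Else, we reject the null hypothesis: the two series are dissimilar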
Comparing Variances¶
Assumes that the population distribution is Normal
There is no central-limit theorem that can be applied here
$$ \begin{aligned} F &= \dfrac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \\ & \sim F(n_1-1, n_2-1) \end{aligned} $$
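Under the null hypothesis \(\sigma_1^2 = \sigma_2^2\), the population variances cancel and the statistic reduces to the ratio of sample variances. A minimal sketch, with the hypothetical name `variance_ratio` and made-up data, assuming both samples come from Normal populations:

```python
import statistics

def variance_ratio(sample_1, sample_2):
    """F statistic s_1^2 / s_2^2 and its (numerator, denominator) dof.

    Hypothetical helper: valid under the null hypothesis of equal
    population variances.
    """
    s2_1 = statistics.variance(sample_1)  # sample variance (n - 1 divisor)
    s2_2 = statistics.variance(sample_2)
    f = s2_1 / s2_2
    dof = (len(sample_1) - 1, len(sample_2) - 1)
    return f, dof

f, dof = variance_ratio([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(f, dof)
```

The value of `f` is then compared against the \(F(n_1-1, n_2-1)\) distribution to decide whether the variances differ.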
Correct Sampling¶
- Random sampling: When evaluating a treatment, every subject must have an equal probability of receiving the treatment
- Equal sample sizes for each treatment produce the optimal test
- Pairing can be used to eliminate the effect of an uncontrolled variable
Standard error of mean¶
Error bars overlap | Error bars contain both the sample means | Inference |
---|---|---|
✅ | ✅ | Strong evidence that populations are not different |
✅ | ❌ | No strong evidence that populations are not different |
❌ | ❌ | Strong evidence that populations are different |
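The two checks in the table can be sketched as follows. This assumes that an error bar spans one standard error either side of the sample mean, that "overlap" means the two bars intersect, and that "contain both the sample means" means each bar contains the other sample's mean. These readings, and the names `sem_interval` and `compare_error_bars`, are assumptions rather than definitions from the notes.

```python
import math
import statistics

def sem_interval(sample):
    """Return (lower, mean, upper) for a mean +/- one standard error bar."""
    m = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - sem, m, m + sem

def compare_error_bars(sample_1, sample_2):
    """Return (overlap, contain_both) for the two error bars.

    Hypothetical helper implementing the assumed readings of the
    two table columns.
    """
    lo_1, m_1, hi_1 = sem_interval(sample_1)
    lo_2, m_2, hi_2 = sem_interval(sample_2)
    # The bars overlap if neither lies entirely beyond the other
    overlap = lo_1 <= hi_2 and lo_2 <= hi_1
    # Each bar must contain the other sample's mean
    contain_both = (lo_1 <= m_2 <= hi_1) and (lo_2 <= m_1 <= hi_2)
    return overlap, contain_both

print(compare_error_bars([1, 2, 3, 4, 5], [1.5, 2.5, 3.5, 4.5, 5.5]))
```

The returned pair of booleans maps onto the first two columns of the table, so the inference can be read off the matching row.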