A survey asked 200 students (HS and College) whether they prefer Instagram or Snapchat:
| Group | Insta | Snap | Total | % Insta |
|---|---|---|---|---|
| HS | 34 | 66 | 100 | 34% |
| College | 52 | 48 | 100 | 52% |
| Total | 86 | 114 | 200 | 43% |
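Question: Could the difference in Instagram preference between the two groups (34% vs. 52%) be due to chance alone?
Hypotheses: \( H_0 \): group and preference are independent. The alternative: they are not independent.
If \( H_0 \) is true, then both groups should have the same Instagram preference, estimated by the pooled rate \( 86/200 = 43\% \).
Expected counts assuming independence (row total × column total / grand total, e.g. \( 100 \times 86 / 200 = 43 \)):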
| Group | Insta | Snap | Total |
|---|---|---|---|
| HS | 43 | 57 | 100 |
| College | 43 | 57 | 100 |
| Total | 86 | 114 | 200 |
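The expected table can be reproduced from the observed counts alone. The following is a minimal sketch using only the Python standard library; the function name `expected_counts` and the row/column layout are assumptions made for illustration.

```python
# Observed counts from the survey table: rows = groups, columns = (Insta, Snap).
observed = [[34, 66],   # HS
            [52, 48]]   # College

def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
print(expected)  # [[43.0, 57.0], [43.0, 57.0]]
```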
If \( H_0 \) is true, then observed counts should be close to expected.
If group and preference are independent, then every reassignment (permutation) of the 200 observed preferences among the students is equally likely.
Steps:
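p-value: \( p = P(C \ge C_{\text{obs}}) \), where \( C \) is the test statistic computed on permuted data and \( C_{\text{obs}} \) is its value on the observed data.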
If \( p \) is small, reject independence.
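One possible way to carry out such a permutation test is sketched below. It assumes the statistic is \( C = \sum (O - E)^2 / E \), shuffles the 200 preferences across students while keeping group sizes fixed, and reports the fraction of shuffles with \( C \ge C_{\text{obs}} \). The function names, the number of permutations, and the seed are illustrative choices, not part of the notes, and the code handles two-column tables only.

```python
import random

def chi_squared(table):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    return stat

def permutation_p_value(table, n_perm=2000, seed=0):
    """Estimate P(C >= C_obs) by shuffling preferences across students."""
    rng = random.Random(seed)
    group_sizes = [sum(row) for row in table]
    # One entry per student: 1 = Insta, 0 = Snap.
    prefs = []
    for insta, snap in table:
        prefs += [1] * insta + [0] * snap
    c_obs = chi_squared(table)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(prefs)
        permuted, start = [], 0
        # Cut the shuffled preferences back into groups of the original sizes.
        for size in group_sizes:
            insta = sum(prefs[start:start + size])
            permuted.append([insta, size - insta])
            start += size
        # Small tolerance guards against floating-point ties.
        if chi_squared(permuted) >= c_obs - 1e-9:
            hits += 1
    return hits / n_perm

p = permutation_p_value([[34, 66], [52, 48]])
print(p)
```

The estimate varies with the seed and the number of permutations, so it should be read as an approximation of the permutation p-value.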
Chi-squared statistic:
\[ C = \sum \frac{(O - E)^2}{E} \]
Example: for the observed and expected counts above:
\[ C = \frac{(34 - 43)^2}{43} + \frac{(66 - 57)^2}{57} + \frac{(52 - 43)^2}{43} + \frac{(48 - 57)^2}{57} \approx 6.6 \]
Large \( C \) means a large deviation between observed and expected counts, which is evidence that group matters.
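The arithmetic can be checked with a few lines of Python. This sketch lists the four cells in the same order as the formula; the variable names are illustrative.

```python
# Cells in the order HS-Insta, HS-Snap, College-Insta, College-Snap.
observed = [34, 66, 52, 48]
expected = [43, 57, 43, 57]

# Each cell contributes (O - E)^2 / E to the statistic.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
c_stat = sum(terms)
print(round(c_stat, 2))  # 6.61
```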
Distribution under \( H_0 \): approximately chi-squared, with degrees of freedom determined by the number of rows \( r \) and columns \( c \) of the table:
\[ df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1 \]
p-value:
\[ P(C \ge 6.6) \approx 0.01 \]
Conclusion: Reject \( H_0 \): the data suggest that Instagram preference is associated with group.
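For one degree of freedom the tail probability can be checked without a statistics library, because a chi-squared variable with \( df = 1 \) is the square of a standard normal, which gives \( P(C \ge x) = \operatorname{erfc}(\sqrt{x/2}) \). The sketch below relies on that identity and is valid for \( df = 1 \) only; the function name `chi2_sf_df1` is hypothetical.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability P(C >= x) for a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(x / 2))

p_value = chi2_sf_df1(6.61)
print(round(p_value, 3))  # 0.01
```

For other degrees of freedom, a library routine such as `scipy.stats.chi2.sf(x, df)` would be the usual choice.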