If \( X_1, \dots, X_n \) are independent draws from \( N(\mu, \sigma^2) \), then the sample mean satisfies \( \bar{X} \sim N(\mu, \sigma^2/n) \).
When \( \sigma \) is known, a 95% confidence interval for \( \mu \) is:
\[ \bar{X} \pm 1.96 \cdot \frac{\sigma}{\sqrt{n}} \]
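Example:
Population: US girls’ weight \( X_p \sim N(\mu_p, \sigma_p^2) \) with \( \mu_p = 100 \) and \( \sigma_p = 25 \).
Sample: Erie County, \( n = 100 \), \( \bar{X}_E = 95 \).
Assume \( \sigma_E = \sigma_p = 25 \). Then the 95% confidence interval for the Erie County mean \( \mu_E \) is: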
\[ \bar{X}_E \pm 1.96 \cdot \frac{25}{\sqrt{100}} = 95 \pm 4.9 \Rightarrow 90.1 < \mu_E < 99.9 \]
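The arithmetic in the Erie County example can be checked with a short Python sketch. The function name `z_interval` and its parameters are illustrative choices, not part of the notes; the critical value comes from the standard library's `statistics.NormalDist` rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - conf
    # Value that a standard normal exceeds with probability alpha/2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    me = z * sigma / sqrt(n)  # margin of error
    return xbar - me, xbar + me


# Numbers from the example: n = 100, sample mean 95, sigma assumed 25
lo, hi = z_interval(xbar=95, sigma=25, n=100)
print(f"{lo:.1f} < mu_E < {hi:.1f}")  # prints: 90.1 < mu_E < 99.9
```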
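For a general confidence level \( 1 - \alpha \), let \( z_{\alpha/2} \) be the value that \( Z \sim N(0,1) \) exceeds with probability \( \alpha/2 \). Since \( Z = (\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0,1) \):
\[ P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha \Rightarrow P\left( \bar{X} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha \]
So the \( (1 - \alpha) \times 100\% \) confidence interval for \( \mu \) is \( \bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \).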
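As a quick illustration, the critical values \( z_{\alpha/2} \) for common confidence levels can be computed instead of read from a table. The helper name `z_crit` is hypothetical; the sketch relies only on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist


def z_crit(conf):
    """Critical value z_{alpha/2} for a two-sided interval at level conf."""
    alpha = 1 - conf
    # The upper alpha/2 point equals the (1 - alpha/2) quantile of N(0, 1)
    return NormalDist().inv_cdf(1 - alpha / 2)


for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%}: z = {z_crit(conf):.3f}")
# prints:
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```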
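Sample size: suppose we want the margin of error (ME) below 1 at 90% confidence, with \( \sigma = 25 \) and \( z_{0.05} = 1.645 \):
\[ \text{ME} = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \Rightarrow \frac{(1.645)(25)}{\sqrt{n}} < 1 \Rightarrow n > (41.125)^2 \approx 1691.3 \]
Conclusion: A sample size of at least \( n = 1692 \) keeps the margin of error below 1 at 90% confidence.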
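The sample-size calculation can be sketched the same way. The name `min_sample_size` is illustrative. Note that the rounded table value 1.645 and the unrounded critical value (about 1.64485) land on different sides of an integer here, so the two answers differ by one.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(sigma, max_me, conf):
    """Smallest n with z * sigma / sqrt(n) <= max_me (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma / max_me) ** 2)


# Using the rounded table value z = 1.645, as in the notes
n_table = ceil((1.645 * 25 / 1) ** 2)

# Using the unrounded critical value
n_exact = min_sample_size(sigma=25, max_me=1, conf=0.90)

print(n_table, n_exact)  # prints: 1692 1691
```

Taking the larger value, 1692, is the conservative choice.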
When \( \sigma \) is unknown, use sample standard deviation \( S \):
\[ S = \sqrt{\frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2} \]
Then:
\[ \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t(n - 1) \quad \text{(Student's t-distribution)} \]
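A \( (1 - \alpha) \times 100\% \) confidence interval for \( \mu \), where \( t_{1 - \alpha/2}(n-1) \) denotes the \( 1 - \alpha/2 \) quantile of \( t(n-1) \), is: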
\[ \bar{X} \pm t_{1 - \alpha/2}(n-1) \cdot \frac{S}{\sqrt{n}} \]
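A minimal sketch of the t-interval, assuming the critical value is looked up in a t-table and passed in by hand (the Python standard library has no t-distribution quantile function). The data in `sample` are made up for illustration, and `t_interval` is a hypothetical helper name. For \( n = 10 \), the table value is \( t_{0.975}(9) = 2.262 \).

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the quantile t_{1 - alpha/2}(n - 1), supplied by the caller.
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation S, with divisor n - 1
    me = t_crit * s / sqrt(n)
    return xbar - me, xbar + me


# Hypothetical data, n = 10, so we use t_{0.975}(9) = 2.262 from a t-table
sample = [92, 101, 88, 97, 105, 94, 90, 99, 96, 98]
lo, hi = t_interval(sample, t_crit=2.262)
print(f"{lo:.2f} < mu < {hi:.2f}")
```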
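Given two independent samples \( X_1, \dots, X_{n_1} \sim N(\mu_1, \sigma_1^2) \) and \( Y_1, \dots, Y_{n_2} \sim N(\mu_2, \sigma_2^2) \), we estimate \( \mu_1 - \mu_2 \) by \( \bar{X} - \bar{Y} \) and the variances \( \sigma_1^2, \sigma_2^2 \) by the sample variances \( S_1^2, S_2^2 \).
Then the t-statistic is: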
\[ T = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} \]
Using Welch’s approximation, \( T \) is approximately \( t(\nu) \)-distributed, where the degrees of freedom \( \nu \) are:
\[ \nu = \frac{\left( \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} \right)^2}{ \frac{ \left( \frac{S_1^2}{n_1} \right)^2 }{n_1 - 1} + \frac{ \left( \frac{S_2^2}{n_2} \right)^2 }{n_2 - 1} } \]
Thus, an approximate \( 95\% \) CI for \( \mu_1 - \mu_2 \) is:
\[ (\bar{X} - \bar{Y}) \pm t_{0.975}(\nu) \cdot \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \]
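The two-sample formulas above can be sketched as follows. The names `welch_df` and `welch_interval` and both data sets are hypothetical. As before, the critical value \( t_{0.975}(\nu) \) is assumed to come from a table. The second sample is just the first shifted by 4, so the variances and sample sizes match and \( \nu = 2(n - 1) = 18 \), for which the table gives \( t_{0.975}(18) = 2.101 \).

```python
from math import sqrt
from statistics import mean, variance


def welch_df(s1_sq, n1, s2_sq, n2):
    """Welch's approximate degrees of freedom nu."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def welch_interval(x, y, t_crit):
    """CI for mu_1 - mu_2; t_crit is t_{1 - alpha/2}(nu) from a table."""
    n1, n2 = len(x), len(y)
    se = sqrt(variance(x) / n1 + variance(y) / n2)  # uses S_1^2 and S_2^2
    diff = mean(x) - mean(y)
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical data: y is x shifted by 4, so S_1^2 = S_2^2 and n_1 = n_2
x = [92, 101, 88, 97, 105, 94, 90, 99, 96, 98]
y = [v + 4 for v in x]

nu = welch_df(variance(x), len(x), variance(y), len(y))
lo, hi = welch_interval(x, y, t_crit=2.101)  # t_{0.975}(18) from a t-table
print(f"nu = {nu:.1f}, CI: ({lo:.2f}, {hi:.2f})")
```

In general \( \nu \) is not an integer. A common conservative choice is to round it down before looking up the critical value.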