Confidence Intervals for a Single Population Mean

In statistical inference, we use data from a sample to draw conclusions about the broader population. Two fundamental tasks of inference are point estimation (providing a single best guess of a population parameter) and interval estimation (providing a range of plausible values for that parameter). This chapter develops the concept of confidence intervals in detail, beginning with point estimators and their properties, then covering confidence interval methods for a population mean in two scenarios: when the population standard deviation is known and when it is unknown.

Diagram showing how statistics from a sample are used to estimate population parameters

Point Estimators and Bias

Definition (Point Estimator)

A point estimator, denoted \(\hat{\theta}\) for a parameter \(\theta\), is a rule or formula that uses sample data to produce a single value estimate for an unknown population parameter.

Examples of point estimators include:

  • Sample mean \(\bar{x}\) for estimating population mean \(\mu\)

  • Sample proportion \(\hat{p}\) for estimating population proportion \(p\)

  • Sample variance \(s^2\) for estimating population variance \(\sigma^2\)

Each point estimate calculated from a sample is a random quantity. If we took different random samples, we would get different values of \(\bar{x}\), \(\hat{p}\), etc. This sampling variability means a point estimate rarely equals the true parameter value.

A desirable property of point estimators is unbiasedness:

Definition of bias and unbiased estimators

Definition (Unbiased Estimator)

An estimator \(\hat{\theta}\) for a parameter \(\theta\) is unbiased if its expected value equals the true parameter value: \(E[\hat{\theta}] = \theta\).

The bias of \(\hat{\theta}\) is defined as \(E[\hat{\theta}] - \theta\). If this bias is zero, the estimator is unbiased.

Many commonly used estimators are unbiased. For example, \(E[\bar{X}] = \mu\) (the sample mean is an unbiased estimator of the population mean), and the sample proportion \(\hat{p}\) is an unbiased estimator of the population proportion \(p\).

Visual representation comparing biased and unbiased estimators

However, some estimators are biased. The sample standard deviation \(s = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2}\) is a biased estimator of the population standard deviation \(\sigma\). While \(s^2\), computed with denominator \(n-1\), is unbiased for \(\sigma^2\), taking the square root introduces bias: the square root is a strictly concave function, so by Jensen's inequality \(E[\sqrt{S^2}] < \sqrt{E[S^2]} = \sigma\).

Theorem (Bias of Sample Standard Deviation)

For a random sample \(X_1, X_2, \ldots, X_n\) with finite variance \(\sigma^2 > 0\), the expected value of the sample standard deviation \(S\) is less than \(\sigma\):

\(E[S] < \sigma\)

This means \(S\) is a biased estimator that tends to underestimate the true population standard deviation.
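This underestimation is easy to see by simulation. The sketch below (in Python, with an assumed \(N(0,1)\) population and \(n = 5\); neither setting comes from the text) draws many small samples and averages the resulting values of \(s\):

```python
import random
import statistics

# Monte Carlo check that E[S] < sigma for small samples.
# Assumed settings (not from the text): N(0, 1) population, n = 5.
random.seed(1)
sigma = 1.0
n = 5
reps = 20_000

s_values = []
for _ in range(reps):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    # statistics.stdev uses the (n - 1) denominator, matching s in the text
    s_values.append(statistics.stdev(sample))

mean_s = statistics.fmean(s_values)
print(f"average s over {reps} samples: {mean_s:.4f} (sigma = {sigma})")
```

The average of \(s\) lands noticeably below \(\sigma = 1\) (theory gives about 0.94 for \(n = 5\)), illustrating \(E[S] < \sigma\).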

Unbiasedness is desirable, but it’s not the only important property. Among unbiased estimators, we prefer those with smaller variance. The estimator with minimum variance is particularly valuable:

Definition (Minimum Variance Unbiased Estimator)

The minimum variance unbiased estimator (MVUE) for a parameter \(\theta\) is the unbiased estimator with the lowest variance among all unbiased estimators.

For a normal population with mean \(\mu\), the sample mean \(\bar{X}\) is the MVUE for \(\mu\): no other unbiased estimator has smaller variance. This optimality result relies on normality. For non-normal populations with large samples, \(\bar{X}\) remains unbiased and the Central Limit Theorem justifies normal-based inference, but the MVUE property need not carry over.

A point estimate alone doesn’t convey uncertainty. If we report only “\(\bar{x} = 3.45\) seconds” as an estimate of a true mean, we have no sense of how close that estimate might be to the truth. This limitation motivates interval estimation: constructing a confidence interval around the point estimate to quantify uncertainty.

Point estimate with uncertainty visualization

Confidence Intervals for a Population Mean (σ Known)

A confidence interval (CI) provides a range of plausible values for the true mean \(\mu\) based on sample data, with a confidence level indicating the reliability of that range.

Definition (Confidence Interval)

A confidence interval is a range of values constructed from sample data by a procedure that, across repeated samples, captures the true population parameter with a specified probability (the confidence level).

Deriving the Interval Using a Pivotal Quantity

To construct a confidence interval for \(\mu\), we need the sampling distribution of an appropriate statistic. We start with the case where the population standard deviation \(\sigma\) is known.

Assumptions

  1. We have a simple random sample \(X_1, X_2, \ldots, X_n\) from a population with unknown mean \(\mu\) and known standard deviation \(\sigma\).

  2. Either (a) the population is normally distributed, or (b) \(n\) is large enough for the Central Limit Theorem to apply.

Assumptions for confidence intervals

Under these assumptions, the sample mean \(\bar{X}\) has a normal sampling distribution:

  • \(E[\bar{X}] = \mu\) (unbiasedness)

  • \(\text{Var}(\bar{X}) = \sigma^2/n\), so \(\text{SD}(\bar{X}) = \sigma/\sqrt{n}\) (the standard error of the mean)

  • \(\bar{X} \sim N(\mu, \sigma^2/n)\) exactly if the population is normal, or approximately by the CLT otherwise

The standardized variable

\[Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\]

follows a standard normal distribution \(N(0,1)\). This \(Z\) is our pivotal quantity (a function of the data and the unknown parameter with a known distribution).

Definition (Pivotal Quantity)

A pivotal quantity is a function of the sample data and unknown parameter(s) that has a distribution independent of the parameter values. Here, \(Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\) is pivotal because it follows \(N(0,1)\) regardless of \(\mu\).

For a \(100(1-\alpha)\%\) confidence level, we need to capture the central \(1-\alpha\) probability of the \(Z\) distribution:

\[P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha\]

where \(z_{\alpha/2}\) is the critical value such that \(P(Z \leq z_{\alpha/2}) = 1 - \alpha/2\).

Confidence coefficient and z-critical values illustration

Through algebraic manipulation:

\[\begin{split}\begin{align*} 1-\alpha &= P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \\ &= P\left(-z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right) \\ &= P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \end{align*}\end{split}\]

This gives us the basis for the confidence interval. Once we observe our sample and compute \(\bar{x}\), we have:

Formula (Two-Sided Confidence Interval for \(\mu\) with known \(\sigma\))

A \(100(1-\alpha)\%\) confidence interval for the population mean \(\mu\) is:

\[\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\]

Or written as a range:

\[\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \quad \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]\]

Common critical values:

  • For 90% CI: \(\alpha=0.10\) and \(z_{0.05} \approx 1.645\)

  • For 95% CI: \(\alpha=0.05\) and \(z_{0.025} \approx 1.96\)

  • For 99% CI: \(\alpha=0.01\) and \(z_{0.005} \approx 2.576\)
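These critical values can be reproduced with the inverse CDF of the standard normal. A minimal sketch (in Python, using the standard library's `statistics.NormalDist`):

```python
from statistics import NormalDist

# z_{alpha/2} satisfies P(Z <= z_{alpha/2}) = 1 - alpha/2 for Z ~ N(0, 1)
def z_crit(conf_level: float) -> float:
    alpha = 1.0 - conf_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: z = {z_crit(level):.3f}")
```

Running this prints 1.645, 1.960, and 2.576, matching the table above.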

The correct interpretation of a “95% confidence interval” is not that “\(\mu\) has a 95% probability of lying in this particular interval” — \(\mu\) is fixed, not random. Rather, it means if we were to take many samples and build a 95% CI from each sample, about 95% of those intervals would contain the true \(\mu\).

Proper interpretation of confidence intervals

Example (Estimating a Mean with a Known σ)

Let’s illustrate with a practical example. Suppose an engineering team is investigating the average load time \(\mu\) of their website’s homepage. Historically, the standard deviation of load times is known to be \(\sigma = 1.0\) second (perhaps from a large amount of past data). The team deploys a new update to the site and now wants to estimate the new average load time.

(a) Planning for a Desired Precision: Before collecting the data, the team decides they want a margin of error no more than 0.2 seconds for a 95% confidence interval. How large a sample do they need?

This is a sample size determination problem. The margin of error (half the width of the CI) is \(E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\). For 95% confidence, \(z_{0.025}=1.96\). We want \(E \leq 0.2\). So we solve:

\[1.96 \frac{\sigma}{\sqrt{n}} \leq 0.2\]

Plugging in \(\sigma = 1\):

\[1.96 \frac{1}{\sqrt{n}} \leq 0.2 \quad\implies\quad \frac{1.96}{0.2} \leq \sqrt{n} \quad\implies\quad \sqrt{n} \geq 9.8 \quad\implies\quad n \geq 96.04\]

So we need \(n \geq 96.04\). Since \(n\) must be an integer, we round up to \(n = 97\) observations.

(b) Constructing the 95% CI: The team collects \(n=97\) page load times and finds \(\bar{x} = 3.45\) seconds. Now they construct the 95% confidence interval:

  • Standard error: \(\sigma/\sqrt{n} = 1/\sqrt{97} \approx 0.1015\) seconds

  • Critical value: \(z_{0.025} = 1.96\)

  • Margin of error: \(E = 1.96 \times 0.1015 \approx 0.199\) seconds

  • Confidence interval: \(3.45 \pm 0.199 = [3.25, 3.65]\) seconds

Interpretation: “We are 95% confident that the true average load time after the update is between 3.25 and 3.65 seconds.”

(c) Confidence Level and Interval Width: If the team wanted 99% confidence instead of 95%, the interval would widen to \(3.45 \pm 2.576(0.1015) = [3.19, 3.71]\) seconds. Higher confidence requires a wider interval.
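The arithmetic in part (b) can be checked with a few lines of code. A minimal Python sketch using the numbers from the example:

```python
import math

# 95% z-interval for the load-time example (sigma known)
xbar, sigma, n = 3.45, 1.0, 97
z = 1.96  # z_{0.025}

se = sigma / math.sqrt(n)   # standard error of the mean
margin = z * se             # margin of error
lower, upper = xbar - margin, xbar + margin
print(f"SE = {se:.4f}, E = {margin:.3f}, CI = [{lower:.2f}, {upper:.2f}]")
```

This reproduces the interval [3.25, 3.65] seconds; swapping in z = 2.576 gives the wider 99% interval from part (c).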

Sample Size Planning

When designing a study, we often need to determine how many observations to collect to achieve a desired precision.

Precision of confidence intervals visualization

Formula (Required Sample Size for Desired Margin of Error)

To achieve a margin of error \(E\) at confidence level \(100(1-\alpha)\%\):

\[n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2\]

Round up to the next whole number to ensure the actual margin of error is no more than \(E\).
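As a sketch, the formula and the round-up rule translate directly into code (Python here; the inputs are the load-time example's values):

```python
import math

def required_n(z: float, sigma: float, margin: float) -> int:
    """Smallest n satisfying z * sigma / sqrt(n) <= margin."""
    return math.ceil((z * sigma / margin) ** 2)

# Load-time example: 95% confidence, sigma = 1, margin of 0.2 seconds
print(required_n(1.96, 1.0, 0.2))  # prints 97
```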

Guarantees for precision of confidence intervals

One-Sided Confidence Bounds

Sometimes we only care about an upper or lower bound for the parameter, rather than a two-sided interval.

Visualization of upper and lower confidence bounds

Definition (One-Sided Confidence Bound)

A one-sided upper confidence bound for \(\mu\) is an interval \((-\infty, U]\) constructed such that \(P(\mu \leq U) = 1-\alpha\).

A one-sided lower confidence bound for \(\mu\) is an interval \([L, +\infty)\) constructed such that \(P(\mu \geq L) = 1-\alpha\).

Formula (One-Sided Confidence Bounds for \(\mu\) with known \(\sigma\))

  • Upper \(100(1-\alpha)\%\) confidence bound:

    \(\mu \leq \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}}\) with confidence \(1-\alpha\)

    Equivalently, \((-\infty,\; \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}}]\) is the one-sided CI.

  • Lower \(100(1-\alpha)\%\) confidence bound:

    \(\mu \geq \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}}\) with confidence \(1-\alpha\)

    Equivalently, \([\bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}},\; +\infty)\) is the interval.

where \(z_{\alpha}\) is the critical value with area \(\alpha\) in one tail (not \(\alpha/2\)).

For example, for a 95% upper bound, use \(z_{0.05}=1.645\) (not \(z_{0.025}=1.96\) used in two-sided intervals).

Example (One-Sided Upper Bound)

Let’s return to the website load time scenario. Suppose the website’s management specifies that the average load time must not exceed 4 seconds to meet user experience targets. After the update, the team collects a sample of \(n=200\) load times to verify this. They obtain \(\bar{x} = 3.60\) seconds. We will compute a 99% upper confidence bound for \(\mu\) to see if we can be highly confident that \(\mu\) is below 4 seconds.

Here \(\sigma = 1\) (still assumed known from historical data), \(n=200\), and we want a one-sided 99% bound (\(\alpha=0.01\) for the upper tail). The critical value is \(z_{0.01} \approx 2.33\). The upper bound is:

\[U = 3.60 + (2.33)\frac{1.0}{\sqrt{200}} = 3.60 + 2.33 \times 0.0707 \approx 3.60 + 0.165 = 3.765 \text{ seconds.}\]

So our 99% upper confidence bound is about 3.77 seconds. We can say: “With 99% confidence, the true mean load time is no more than 3.77 seconds.”
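The bound can be verified numerically. A short Python sketch using the example's values (including the text's rounded critical value 2.33):

```python
import math

# 99% upper confidence bound for mean load time (sigma known)
xbar, sigma, n = 3.60, 1.0, 200
z = 2.33  # z_{0.01}, as rounded in the text

upper = xbar + z * sigma / math.sqrt(n)
print(f"99% upper bound: {upper:.3f} seconds")  # prints 3.765
```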

Because 3.77 s is comfortably below the 4 s target, this provides strong evidence that the site is meeting the performance requirement (remember, 99% confidence is a very high bar — we’re being extremely cautious here).

Note that for the same total area, a one-sided bound is “tighter” on the side of interest than the corresponding end of a two-sided interval, because it focuses all the confidence on one direction.

Confidence Intervals for a Population Mean (σ Unknown)

In most real-world situations, the population standard deviation \(\sigma\) is unknown and must be estimated from the sample. This introduces additional uncertainty.

How to perform inference when sigma is unknown

When \(\sigma\) is unknown, we use the sample standard deviation \(s\) and the pivotal quantity becomes:

\[T = \frac{\bar{X} - \mu}{S/\sqrt{n}}\]

This statistic follows Student’s t-distribution with \(df = n-1\) degrees of freedom (if the population is normal).

William Sealy Gosset and the origin of Student's t-distribution

Properties of the t-distribution

  • Symmetric and bell-shaped, similar to the normal distribution, but with heavier tails

  • Characterized by its degrees of freedom (df = n-1 for one-sample mean problems)

  • As df increases, the t-distribution approaches the normal distribution

Comparison of t-distributions with different degrees of freedom

t-critical values explanation

Formula (Confidence Interval for \(\mu\) with \(\sigma\) unknown)

Under the assumption that the population is approximately normal (or \(n\) is moderately large so that the t-methods are robust), a \(100(1-\alpha)\%\) confidence interval for the population mean \(\mu\) is:

\[\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\]

Or written as a range:

\[\left[\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}, \quad \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right]\]

where \(t_{\alpha/2,\,n-1}\) is the critical value from the t-distribution with n-1 degrees of freedom, such that \(P(T \leq t_{\alpha/2,\,n-1}) = 1-\alpha/2\) for \(T \sim t_{(n-1)}\).

This interval is typically wider than the Z-interval for small samples because \(t_{\alpha/2, n-1} > z_{\alpha/2}\) for finite \(n\). The difference diminishes as sample size increases.

Robustness of the t-procedure to violations of assumptions

Example (Confidence Interval with Unknown σ)

Suppose we wish to estimate the average lifetime \(\mu\) of a new type of LED bulb. We test a random sample of \(n=10\) bulbs and record their lifespans (in hours) until burnout. The sample yields \(\bar{x} = 1200\) hours and \(s = 100\) hours. Assuming roughly normal lifetime distributions, let’s construct a 95% CI for \(\mu\).

Here \(n=10\), so df = 9. For 95% confidence, we need \(t_{0.025,9}\). From t tables or software, \(t_{0.025,9} \approx 2.262\). The formula gives:

  • Standard error: \(s/\sqrt{n} = 100/\sqrt{10} = 31.62\) hours

  • Critical value: \(t_{0.025,9} \approx 2.262\)

  • Margin of error: \(E = 2.262 \times 31.62 = 71.53\) hours

  • Confidence interval: \(1200 \pm 71.53 = [1128.5, 1271.5]\) hours

Interpretation: “We are 95% confident that the true mean lifetime of the new LED bulbs is between about 1128 and 1272 hours.”

Had we incorrectly used the Z-interval (assuming known \(\sigma\)), we would have calculated \(1200 \pm 1.96(31.62) = [1138.0, 1262.0]\) hours. That interval is slightly narrower (about 124 hours wide vs 143 hours wide). The correct t-interval is wider to account for the extra uncertainty from estimating \(\sigma\) with a sample of only 10 observations.
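The t-interval computation can be scripted as well. Since Python's standard library has no t quantile function, the sketch below hardcodes \(t_{0.025,9} \approx 2.262\) from the text:

```python
import math

# 95% t-interval for mean LED lifetime (sigma unknown)
xbar, s, n = 1200.0, 100.0, 10
t_crit = 2.262  # t_{0.025, 9}, from a t table

se = s / math.sqrt(n)
margin = t_crit * se
lower, upper = xbar - margin, xbar + margin
print(f"SE = {se:.2f}, E = {margin:.2f}, CI = [{lower:.1f}, {upper:.1f}]")
```

This reproduces the interval [1128.5, 1271.5] hours from the example.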

Computing Confidence Intervals in R
# For a confidence interval with unknown sigma, where x is a numeric
# vector of sample observations (here, the 10 LED bulb lifetimes):
t.test(x, conf.level = 0.95)

# Sample output:
# 95 percent confidence interval:
#  1128.5 1271.5
# sample estimates:
# mean of x
#      1200

Interactive Confidence Interval Simulation

To help students develop intuition about confidence intervals and visualize what “95% confidence” actually means in practice, I’ve created an interactive Shiny application. This app allows you to simulate confidence intervals for different statistical distributions and see firsthand how the confidence level relates to coverage probability.

The Shiny app provides a hands-on exploration of confidence interval concepts:

  1. Multiple interval methods:

     • z-intervals with known population standard deviation

     • z-intervals using the sample standard deviation s (technically incorrect, but educational to see)

     • t-intervals (properly accounting for estimating σ with s)

  2. Various distribution options:

     • Normal distribution (where all methods are theoretically justified)

     • Uniform distribution (symmetric but non-normal)

     • Exponential distribution (right-skewed)

     • Poisson distribution (discrete)

  3. Adjustable parameters:

     • Sample size (n): see how interval width changes as n increases

     • Number of intervals to simulate: generate multiple samples and CIs at once

     • Confidence level: try different values (90%, 95%, 99%, etc.)

     • Distribution-specific parameters (mean, standard deviation, etc.)

  4. Visual feedback:

     • Green intervals successfully capture the true mean; red intervals miss it

     • Histograms show the sampling distribution of the mean

     • Running counters track coverage percentage both for the current batch and cumulatively

This simulation demonstrates several key insights:

  • In repeated sampling, approximately C% of confidence intervals will contain the true parameter (where C is the confidence level)

  • Larger sample sizes produce narrower intervals (higher precision)

  • Higher confidence levels produce wider intervals (trading precision for confidence)

  • The t-distribution approach is more conservative for small samples

  • The Central Limit Theorem in action as sample means approach normality

Students are encouraged to experiment with different settings. For example:

  • What happens with very small sample sizes from non-normal distributions?

  • How small can n be before the z-interval becomes unreliable?

  • How does confidence level affect the frequency of intervals containing μ?

  • What happens when you use a z-interval (assuming known σ) when you should be using a t-interval?

The app code is available in the course repository for those who wish to examine how the simulations are performed or extend the app with additional features.
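For readers without access to the app, the core simulation idea can be sketched in a few lines (in Python; the setup below is an assumed illustration of the coverage concept, not the app's actual code):

```python
import random
from statistics import fmean

# Coverage simulation: what "95% confidence" means in repeated sampling.
# Assumed setup: N(mu, sigma^2) population, sigma treated as known,
# z-intervals computed for each simulated sample.
random.seed(7)
mu, sigma, n, z = 0.0, 1.0, 30, 1.96
n_sims = 2_000

hits = 0
for _ in range(n_sims):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(sample)
    margin = z * sigma / n ** 0.5
    if xbar - margin <= mu <= xbar + margin:
        hits += 1

coverage = hits / n_sims
print(f"empirical coverage: {coverage:.3f}")  # close to 0.95
```

Over many simulated samples the fraction of intervals containing the true mean settles near the nominal 95%, which is exactly the long-run behavior the app visualizes.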

Summary

Confidence intervals are powerful tools for quantifying uncertainty in parameter estimates. They combine point estimates with margins of error determined by:

  1. The desired confidence level (higher confidence → wider interval)

  2. The sample variability (higher variability → wider interval)

  3. The sample size (larger sample → narrower interval)

The correct interpretation of a confidence interval relates to the procedure’s long-run performance: if we repeatedly sample and construct intervals using this method, the specified percentage (e.g., 95%) of intervals will contain the true parameter.

Always report a confidence interval along with a point estimate to give readers a sense of the estimate’s reliability. When planning studies, determine the necessary sample size to achieve the desired precision at your target confidence level.

Point estimate with uncertainty visualization

Confidence Intervals for a Population Mean (σ Known)

A confidence interval (CI) provides a range of plausible values for the true mean \(\mu\) based on sample data, with a confidence level indicating the reliability of that range.

Confidence coefficient and z-critical values illustration Definition of confidence interval

Definition (Confidence Interval)

A confidence interval is a range of values constructed from sample data with a specified probability (the confidence level) that the interval contains the true population parameter.

Deriving the Interval Using a Pivotal Quantity

To construct a confidence interval for \(\mu\), we need the sampling distribution of an appropriate statistic. We start with the case where the population standard deviation \(\sigma\) is known.

The pivotal method for deriving confidence intervals

Assumptions

  1. We have a simple random sample \(X_1, X_2, \ldots, X_n\) from a population with unknown mean \(\mu\) and known standard deviation \(\sigma\).

  2. Either (a) the population is normally distributed, or (b) \(n\) is large enough for the Central Limit Theorem to apply.

Assumptions for confidence intervals

Under these assumptions, the sample mean \(\bar{X}\) has a normal sampling distribution:

  • \(E[\bar{X}] = \mu\) (unbiasedness)

  • \(\text{Var}(\bar{X}) = \sigma^2/n\), so \(\text{SD}(\bar{X}) = \sigma/\sqrt{n}\) (the standard error of the mean)

  • \(\bar{X} \sim N(\mu, \sigma^2/n)\) exactly if the population is normal, or approximately by the CLT otherwise

The standardized variable

\[Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\]

follows a standard normal distribution \(N(0,1)\). This \(Z\) is our pivotal quantity (a function of the data and the unknown parameter with a known distribution).

Definition (Pivotal Quantity)

A pivotal quantity is a function of the sample data and unknown parameter(s) that has a distribution independent of the parameter values. Here, \(Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\) is pivotal because it follows \(N(0,1)\) regardless of \(\mu\).

For a \(100(1-\alpha)\%\) confidence level, we need to capture the central \(1-\alpha\) probability of the \(Z\) distribution:

\[P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha\]

where \(z_{\alpha/2}\) is the critical value such that \(P(Z \leq z_{\alpha/2}) = 1 - \alpha/2\).

Through algebraic manipulation:

\[\begin{split}\begin{align*} 1-\alpha &= P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \\ &= P\left(-z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right) \\ &= P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \end{align*}\end{split}\]

This gives us the basis for the confidence interval. Once we observe our sample and compute \(\bar{x}\), we have:

Formula for confidence interval

Formula (Two-Sided Confidence Interval for \(\mu\) with known \(\sigma\))

A \(100(1-\alpha)\%\) confidence interval for the population mean \(\mu\) is:

\[\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\]

Or written as a range:

\[\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \quad \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]\]

Common critical values:

  • For 90% CI: \(\alpha=0.10\) and \(z_{0.05} \approx 1.645\)

  • For 95% CI: \(\alpha=0.05\) and \(z_{0.025} \approx 1.96\)

  • For 99% CI: \(\alpha=0.01\) and \(z_{0.005} \approx 2.576\)

Visualization of 95% confidence intervals

The correct interpretation of a “95% confidence interval” is not that “\(\mu\) has a 95% probability of lying in this particular interval” — \(\mu\) is fixed, not random. Rather, it means if we were to take many samples and build a 95% CI from each sample, about 95% of those intervals would contain the true \(\mu\).

Proper interpretation of confidence intervals

Example (Estimating a Mean with a Known σ)

Let’s illustrate with a practical example. Suppose an engineering team is investigating the average load time \(\mu\) of their website’s homepage. Historically, the standard deviation of load times is known to be \(\sigma = 1.0\) second (perhaps from a large amount of past data). The team deploys a new update to the site and now wants to estimate the new average load time.

(a) Planning for a Desired Precision: Before collecting the data, the team decides they want a margin of error no more than 0.2 seconds for a 95% confidence interval. How large a sample do they need?

This is a sample size determination problem. The margin of error (half the width of the CI) is \(E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\). For 95% confidence, \(z_{0.025}=1.96\). We want \(E \leq 0.2\). So we solve:

\[1.96 \frac{\sigma}{\sqrt{n}} \leq 0.2\]

Plugging in \(\sigma = 1\):

\[1.96 \frac{1}{\sqrt{n}} \leq 0.2 \quad\implies\quad \frac{1.96}{0.2} \leq \sqrt{n} \quad\implies\quad \sqrt{n} \geq 9.8 \quad\implies\quad n \geq 96.04\]

So \(n\) needs to be at least 96.04. Since \(n\) must be an integer, we round up to 97 observations.

(b) Constructing the 95% CI: The team collects \(n=97\) page load times and finds \(\bar{x} = 3.45\) seconds. Now they construct the 95% confidence interval:

  • Standard error: \(\sigma/\sqrt{n} = 1/\sqrt{97} \approx 0.1015\) seconds

  • Critical value: \(z_{0.025} = 1.96\)

  • Margin of error: \(E = 1.96 \times 0.1015 \approx 0.199\) seconds

  • Confidence interval: \(3.45 \pm 0.199 = [3.25, 3.65]\) seconds

Interpretation: “We are 95% confident that the true average load time after the update is between 3.25 and 3.65 seconds.”

(c) Confidence Level and Interval Width: If the team wanted 99% confidence instead of 95%, the interval would widen to \(3.45 \pm 2.576(0.1015) = [3.19, 3.71]\) seconds. Higher confidence requires a wider interval.

Example of confidence interval calculation Second part of the American adult male weights example

Sample Size Planning

When designing a study, we often need to determine how many observations to collect to achieve a desired precision.

Precision of confidence intervals visualization Factors affecting margin of error

Formula (Required Sample Size for Desired Margin of Error)

To achieve a margin of error \(E\) at confidence level \(100(1-\alpha)\%\):

\[n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2\]

Round up to the next whole number to ensure the actual margin of error is no more than \(E\).

Guarantees for precision of confidence intervals

Example (Planning Sample Size)

For the website load time study, how many samples are needed for a margin of error of 0.2 seconds with 95% confidence?

\[n = \left(\frac{1.96 \times 1.0}{0.2}\right)^2 = 96.04\]

Therefore, at least 97 observations are needed.

General procedure for confidence interval construction

One-Sided Confidence Bounds

Sometimes we only care about an upper or lower bound for the parameter, rather than a two-sided interval.

Definition (One-Sided Confidence Bound)

A one-sided upper confidence bound for \(\mu\) is an interval \((-\infty, U]\) constructed such that \(P(\mu \leq U) = 1-\alpha\).

A one-sided lower confidence bound for \(\mu\) is an interval \([L, +\infty)\) constructed such that \(P(\mu \geq L) = 1-\alpha\).

Procedures for confidence bounds with known sigma

Formula (One-Sided Confidence Bounds for \(\mu\) with known \(\sigma\))

  • Upper :math:`(1-alpha)` confidence bound:

    \(\mu \leq \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}}\) with confidence \(1-\alpha\)

    Equivalently, \((-\infty,\; \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}}]\) is the one-sided CI.

  • Lower :math:`(1-alpha)` confidence bound:

    \(\mu \geq \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}}\) with confidence \(1-\alpha\)

    Equivalently, \([\bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}},\; +\infty)\) is the interval.

where \(z_{\alpha}\) is the critical value with area \(\alpha\) in one tail (not \(\alpha/2\)).

For example, for a 95% upper bound, use \(z_{0.05}=1.645\) (not \(z_{0.025}=1.96\) used in two-sided intervals).

Example (One-Sided Upper Bound)

Let’s return to the website load time scenario. Suppose the website’s management specifies that the average load time must not exceed 4 seconds to meet user experience targets. After the update, the team collects a sample of \(n=200\) load times to verify this. They obtain \(\bar{x} = 3.60\) seconds. We will compute a 99% upper confidence bound for \(\mu\) to see if we can be highly confident that \(\mu\) is below 4 seconds.

Here \(\sigma = 1\) (still assumed known from historical data), \(n=200\), and we want a one-sided 99% bound (\(\alpha=0.01\) for the upper tail). The critical value is \(z_{0.01} \approx 2.33\). The upper bound is:

\[U = 3.60 + (2.33)\frac{1.0}{\sqrt{200}} = 3.60 + 2.33 \times 0.0707 \approx 3.60 + 0.165 = 3.765 \text{ seconds.}\]

So our 99% upper confidence bound is about 3.77 seconds. We can say: “With 99% confidence, the true mean load time is no more than 3.77 seconds.”

Because 3.77 s is comfortably below the 4 s target, this provides strong evidence that the site is meeting the performance requirement (remember, 99% confidence is a very high bar — we’re being extremely cautious here).

Note that for the same total area, a one-sided bound is “tighter” on the side of interest than the corresponding end of a two-sided interval, because it focuses all the confidence on one direction.

Cautions when applying confidence intervals

Confidence Intervals for a Population Mean (σ Unknown)

In most real-world situations, the population standard deviation \(\sigma\) is unknown and must be estimated from the sample. This introduces additional uncertainty.

How to perform inference when sigma is unknown

When \(\sigma\) is unknown, we use the sample standard deviation \(s\) and the pivotal quantity becomes:

\[T = \frac{\bar{X} - \mu}{S/\sqrt{n}}\]

This statistic follows Student’s t-distribution with \(df = n-1\) degrees of freedom (if the population is normal).

William Sealy Gosset and the origin of Student's t-distribution

Properties of the t-distribution

  • Symmetric and bell-shaped, similar to the normal distribution, but with heavier tails

  • Characterized by its degrees of freedom (df = n-1 for one-sample mean problems)

  • As df increases, the t-distribution approaches the normal distribution

Comparison of t-distributions with different degrees of freedom t-critical values explanation

Formula (Confidence Interval for \(\mu\) with \(\sigma\) unknown)

Under the assumption that the population is approximately normal (or \(n\) is moderately large so that the t-methods are robust), a \(100(1-\alpha)\%\) confidence interval for the population mean \(\mu\) is:

\[\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\]

Or written as a range:

\[\left[\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}, \quad \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right]\]

where \(t_{\alpha/2,\,n-1}\) is the critical value from the t-distribution with n-1 degrees of freedom, such that \(P(T \leq t_{\alpha/2,\,n-1}) = 1-\alpha/2\) for \(T \sim t_{(n-1)}\).

Procedures for confidence intervals and bounds with unknown sigma

This interval is typically wider than the Z-interval for small samples because \(t_{\alpha/2, n-1} > z_{\alpha/2}\) for finite \(n\). The difference diminishes as sample size increases.

Example (Confidence Interval with Unknown σ)

Suppose we wish to estimate the average lifetime \(\mu\) of a new type of LED bulb. We test a random sample of \(n=10\) bulbs and record their lifespans (in hours) until burnout. The sample yields \(\bar{x} = 1200\) hours and \(s = 100\) hours. Assuming roughly normal lifetime distributions, let’s construct a 95% CI for \(\mu\).

Here \(n=10\), so df = 9. For 95% confidence, we need \(t_{0.025,9}\). From t tables or software, \(t_{0.025,9} \approx 2.262\). The formula gives:

  • Standard error: \(s/\sqrt{n} = 100/\sqrt{10} = 31.62\) hours

  • Critical value: \(t_{0.025,9} \approx 2.262\)

  • Margin of error: \(E = 2.262 \times 31.62 = 71.53\) hours

  • Confidence interval: \(1200 \pm 71.53 = [1128.5, 1271.5]\) hours

Interpretation: “We are 95% confident that the true mean lifetime of the new LED bulbs is between about 1128 and 1272 hours.”

Had we incorrectly used the Z-interval (assuming known \(\sigma\)), we would have calculated \(1200 \pm 1.96(31.62) = [1138.0, 1262.0]\) hours. That interval is slightly narrower (about 124 hours wide vs 143 hours wide). The correct t-interval is wider to account for the extra uncertainty from estimating \(\sigma\) with a sample of only 10 observations.

R code for t-based confidence intervals
Computing Confidence Intervals in R
# x: numeric vector of the 10 recorded lifetimes (in hours)
# For a confidence interval with unknown sigma:
t.test(x, conf.level = 0.95)

# Sample output:
# 95 percent confidence interval:
#  1128.5 1271.5
# sample estimates:
# mean of x
#      1200
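For readers working in Python rather than R, a rough equivalent of the interval reported by `t.test` can be obtained from `scipy.stats.t.interval`, here driven by the summary statistics of the example (since the raw lifetimes are not listed):

```python
import math
from scipy.stats import t

# t-interval from summary statistics of the LED-bulb example
xbar, s, n = 1200.0, 100.0, 10
lo, hi = t.interval(0.95, n - 1, loc=xbar, scale=s / math.sqrt(n))

print(round(lo, 1), round(hi, 1))  # 1128.5 1271.5
```

This matches the interval in the sample R output above.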
Robustness of the t-procedure to violations of assumptions

Summary

Confidence intervals are powerful tools for quantifying uncertainty in parameter estimates. They combine point estimates with margins of error determined by:

  1. The desired confidence level (higher confidence → wider interval)

  2. The sample variability (higher variability → wider interval)

  3. The sample size (larger sample → narrower interval)

The correct interpretation of a confidence interval relates to the procedure’s long-run performance: if we repeatedly sample and construct intervals using this method, the specified percentage (e.g., 95%) of intervals will contain the true parameter.
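This long-run interpretation can be demonstrated by simulation. The sketch below (in Python, with illustrative population values \(\mu = 50\), \(\sigma = 5\) and samples of size 15) repeatedly draws samples from a known normal population, builds a 95% t-interval from each, and records how often the interval covers the true mean:

```python
import math
import random
from scipy.stats import t

# Illustrative population parameters and sample size (assumed values)
random.seed(1)
mu, sigma, n, reps = 50.0, 5.0, 15, 10_000
t_crit = t.ppf(0.975, df=n - 1)

covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    half = t_crit * s / math.sqrt(n)         # margin of error
    if xbar - half <= mu <= xbar + half:     # does the interval cover mu?
        covered += 1

print(covered / reps)  # close to 0.95
```

The observed coverage proportion fluctuates around 0.95, exactly as the procedure-based interpretation predicts; no single interval carries a 95% probability of containing \(\mu\).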

Always report a confidence interval along with a point estimate to give readers a sense of the estimate’s reliability. When planning studies, determine the necessary sample size to achieve the desired precision at your target confidence level.
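The sample-size planning step can be sketched as follows: solving the z-based margin-of-error formula \(E = z_{\alpha/2}\,\sigma/\sqrt{n}\) for \(n\) gives \(n = (z_{\alpha/2}\,\sigma/E)^2\), rounded up. A minimal Python version, where the guessed \(\sigma = 100\) and target margin \(E = 25\) are illustrative assumptions:

```python
import math
from scipy.stats import norm

def required_n(sigma, E, conf=0.95):
    """Smallest n so a z-interval at the given confidence level
    has margin of error at most E, for a guessed sigma."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / E) ** 2)

print(required_n(sigma=100, E=25))  # 62
```

In practice \(\sigma\) must be guessed from a pilot study or prior data, so the result is a planning figure rather than a guarantee.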

Common Misinterpretations of Confidence Intervals

When working with confidence intervals, several misinterpretations frequently arise. Being aware of these can help avoid conceptual errors in statistical reasoning:

Misinterpretation 1: The Probability Interpretation

Incorrect: “There is a 95% probability that the true mean μ lies within this confidence interval.”

Why it’s wrong: This statement incorrectly treats the parameter μ as a random variable. In the frequentist framework, μ is fixed (though unknown); a given interval either contains it or does not, with no probability involved. The confidence level refers to the procedure, not to any specific interval.

Correct: “This interval comes from a method that produces intervals containing the true mean 95% of the time in repeated sampling.”

Misinterpretation 2: The Precision Fallacy

Incorrect: “A 95% confidence interval means we are 95% certain that the true parameter falls within the given range.”

Why it’s wrong: Confidence is not the same as certainty or probability. This interpretation confuses frequentist confidence with Bayesian credibility.

Correct: “If we constructed many such intervals using this same procedure, about 95% of them would contain the true parameter.”

Misinterpretation 3: Interpreting Width as Likelihood

Incorrect: “Values near the center of the confidence interval are more likely to be the true parameter value than values near the edges.”

Why it’s wrong: Standard confidence intervals don’t provide information about which values within the interval are more likely. All values inside the interval are treated equally in the interpretation.

Correct: “The interval represents a range of plausible values for the parameter, based on our sample data, without assigning higher likelihood to any particular value within that range.”

Misinterpretation 4: The Sample-to-Sample Belief

Incorrect: “95% of all possible samples will yield confidence intervals containing the true parameter.”

Why it’s wrong: While related to the correct interpretation, this statement fails to recognize that we’re discussing the properties of the interval-construction procedure, not the samples themselves.

Correct: “If we took many different samples and constructed a 95% confidence interval from each one, approximately 95% of these intervals would contain the true parameter.”

Understanding what confidence intervals actually tell us is essential for proper statistical inference. They provide valuable information about parameter estimates, but we must be careful not to claim more than they actually deliver. When one wants to make probability statements about the parameters themselves, Bayesian credible intervals (which do allow such interpretations) may be more appropriate.