
9.2. Confidence Intervals for the Population Mean, When σ is Known

Confidence intervals are one of the most fundamental tools for expressing uncertainty around a point estimate. In this section, we focus on constructing the confidence interval for a population mean \(\mu\) when the population standard deviation \(\sigma\) is known. While this scenario is somewhat rare in practice (as \(\sigma\) is typically unknown), it provides a clear foundation for understanding the core logic of interval estimation.

Road Map 🧭

Understand the complementary roles of a point estimator and an interval estimator.
Derive the confidence interval for a population mean \(\mu\), given that the population standard deviation \(\sigma\) is known. Apply the derived formula to problems.
Understand the difference between a confidence interval as an estimator and a realized estimate, and correctly interpret each.

9.2.1. Why Point Estimates Aren’t Enough

In practice, we usually have access to only one sample from the population. When we use a point estimator \(\hat{\theta}\) to estimate a parameter \(\theta\), we obtain only one realization out of many potential estimates that could have been observed. From this alone, we do not know how precise our current estimate is.

To address this limitation, we turn to an interval estimator, which relies on the sampling distribution of \(\hat{\theta}\) to measure and express uncertainty.

Interval Estimators

An interval estimator of a parameter \(\theta\) aims to provide a range of possible values for its true location. It is typically constructed by expanding left and right from a point estimator. For a point estimator \(\hat{\theta}\) whose sampling distribution is symmetric, the expansion is also symmetric. That is, it takes the following general form:

\[(\hat{\theta} - ME, \hat{\theta} + ME),\]

where ME stands for the margin of error. The margin of error expresses the magnitude of expansion from the reference point, and therefore is always non-negative. Its value is determined by a combination of several components, such as

The population distribution,
The definition of the point estimator \(\hat{\theta}\), and
The sample size \(n\).

9.2.2. Confidence Interval for \(\mu\), When \(\sigma\) is Known

We now construct a specific class of interval estimators for the population mean \(\mu\), based on its most widely used point estimator \(\bar{X}\). This class is called Confidence Intervals (CIs).

The Goal

For a pre-specified value \(C\) between 0 and 1, a confidence interval for \(\mu\) aims to capture \(\mu\) with probability \(C\). Mathematically, this is equivalent to finding the ME (margin of error) which satisfies:

\[P\left(\bar{X}-ME < \mu < \bar{X} + ME \right) = C.\]

Preliminaries

Before diving into the main derivation, let us first touch on its key ingredients.

The probability \(C\) is called the confidence coefficient of a confidence interval.
When expressed in the percentage scale, \(C \cdot 100\%\) is called the confidence level of a confidence interval.
We denote the complement of \(C\) as \(\alpha\). That is,

\[\alpha = 1 - C.\]

Another important ingredient is the \(z\)-critical value. The \(z\)-critical value \(z_{\alpha/2}\) is the point on the horizontal axis of the standard normal pdf with an area of \(\alpha/2\) to its right.

By the symmetry of the standard normal distribution around 0, the region below \(-z_{\alpha/2}\) also has area \(\alpha/2\). This leaves the central region between \(-z_{\alpha/2}\) and \(z_{\alpha/2}\) with an area of exactly \(1 - \alpha = C\).
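The relationship between \(C\), \(\alpha\), and \(z_{\alpha/2}\) can be checked numerically. The sketch below uses three confidence coefficients chosen only for illustration and relies on R's `qnorm` and `pnorm`.

```r
# Confidence coefficients chosen for illustration.
conf_coef <- c(0.90, 0.95, 0.99)
alpha <- 1 - conf_coef

# z_{alpha/2}: the point with area alpha/2 to its right under the N(0, 1) pdf.
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)

# Area of the central region between -z_{alpha/2} and z_{alpha/2}.
central_area <- pnorm(z_crit) - pnorm(-z_crit)

# The critical values are roughly 1.645, 1.960, and 2.576,
# and each central area matches its confidence coefficient.
data.frame(conf_coef, alpha, z_crit = round(z_crit, 4), central_area)
```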

Deriving the Confidence Interval: The Pivotal Method

This derivation is valid under the following assumptions:

\(X_1, X_2, \ldots, X_n\) form an iid sample from the population, \(X\). The expected value and variance of \(X\) are denoted \(\mu\) and \(\sigma^2\), respectively.
Either the population is normally distributed, or we have sufficiently large \(n\) for the CLT to hold.
The population variance \(\sigma^2\) is known.

From the first and second assumptions, we have (exactly if the population is normal, approximately if we rely on the CLT):

\[\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right),\]

which is equivalent to stating that its standardization has a standard normal distribution:

\[\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).\]

The standardization of \(\bar{X}\) is also called the pivotal quantity, since its distribution does not depend on the unknown \(\mu\). We are now ready to take the final steps.

Step 1: Find the Interval Under the Standard Normal Distribution

From the preliminaries, we already know that

\[P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = C.\]

Step 2: Replace \(Z\) with the Pivotal Quantity

\[P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = C.\]

This is possible because the pivotal quantity has the same distribution as \(Z\).
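A short simulation can illustrate why this substitution is reasonable. The sketch below assumes a hypothetical Exp(1) population (so \(\mu = 1\) and \(\sigma = 1\)), a sample size of 50, and 10,000 repetitions. None of these values come from the text. Because the population is not normal, the match with \(N(0, 1)\) is only approximate and relies on the CLT.

```r
set.seed(1)

# Hypothetical population: Exp(rate = 1), which has mean 1 and sd 1.
mu <- 1
sigma <- 1
n <- 50        # sample size (illustrative)
reps <- 10000  # number of simulated samples (illustrative)

# One sample mean per simulated sample.
xbar <- replicate(reps, mean(rexp(n, rate = 1)))

# Pivotal quantity: the standardized sample mean.
z <- (xbar - mu) / (sigma / sqrt(n))

# If z behaves like N(0, 1), its mean is near 0, its sd is near 1,
# and about 95% of its values fall between -z_{0.025} and z_{0.025}.
z_crit <- qnorm(0.025, lower.tail = FALSE)
central_prop <- mean(abs(z) < z_crit)
c(mean = mean(z), sd = sd(z), central_prop = central_prop)
```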

Step 3: Rearrange the Inequalities

Multiply all terms by \(\frac{\sigma}{\sqrt{n}}\):

\[P\left(-z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = C\]

Multiply all terms by \(-1\), which reverses the inequalities:

\[P\left(z_{\alpha/2} \frac{\sigma}{\sqrt{n}} > \mu - \bar{X} > -z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = C\]

Add \(\bar{X}\) to all terms and write the inequalities in increasing order to isolate \(\mu\):

\[P\left(\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = C\]

At this point, the probability statement looks exactly like our goal, with a computable representation of the ME.

Summary

From the result above, we discover that the margin of error for the \(C\cdot 100 \%\) confidence interval of \(\mu\) is

\[ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.\]

Plugging this into the general form, we finally obtain the complete expression for the confidence interval:

\[\left(\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right).\]

Alternatively, the CI is also written as:

\[\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}.\]
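The formula translates directly into a few lines of R. The function below is a minimal sketch. Its name `z_interval`, its argument names, and the inputs in the usage line are hypothetical and chosen only for illustration.

```r
# Hypothetical helper (not from the text): confidence interval for mu
# when the population standard deviation sigma is known.
z_interval <- function(xbar, sigma, n, conf_level = 0.95) {
  alpha <- 1 - conf_level
  z_crit <- qnorm(alpha / 2, lower.tail = FALSE)  # z_{alpha/2}
  me <- z_crit * sigma / sqrt(n)                  # margin of error
  c(lower = xbar - me, upper = xbar + me, margin_of_error = me)
}

# Made-up inputs: sample mean 50, sigma 10, n = 25, 95% confidence.
# The margin of error is about 3.92.
res <- z_interval(xbar = 50, sigma = 10, n = 25, conf_level = 0.95)
res
```

Returning the margin of error alongside the endpoints makes it easy to report the interval in either of the two forms shown above.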

Confidence Interval Is Also a Random Variable‼️

Note that a confidence interval will also vary from sample to sample, since its center, \(\bar{X}\), does. This means that we can view a CI as a (bivariate) random variable.

In practice, we have one realization of a CI based on a single sample:

\[\left(\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right).\]

Note that the center is now represented with the lower case \(\bar{x}\). Unlike a point estimate, however, a realized CI still gives us a measure of precision through its width. A narrow interval suggests high precision, while a wide interval reflects greater uncertainty.
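Since the width of the interval is \(2 \cdot ME = 2 z_{\alpha/2}\,\sigma/\sqrt{n}\), it shrinks as \(n\) grows and widens as the confidence level rises. The sketch below tabulates this for a hypothetical \(\sigma = 10\) and a few illustrative sample sizes.

```r
sigma <- 10            # hypothetical known population sd
n <- c(25, 100, 400)   # illustrative sample sizes

# Margins of error at the 95% and 99% confidence levels.
me_95 <- qnorm(0.025, lower.tail = FALSE) * sigma / sqrt(n)
me_99 <- qnorm(0.005, lower.tail = FALSE) * sigma / sqrt(n)

# Interval width is twice the margin of error.
# Quadrupling n halves the width; raising the level widens the interval.
data.frame(n, width_95 = 2 * me_95, width_99 = 2 * me_99)
```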

New Terminology: Standard Error

To distinguish it from the population standard deviation \(\sigma\), we often call the true standard deviation of an estimator the standard error.

In our current setting, the standard error is \(\sigma_{\bar{X}} = \sigma/\sqrt{n}\). However, the mathematical definition of a standard error depends on the inference problem and its appropriate point estimator. We will encounter a few different standard errors in the upcoming chapters.
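For the current setting, the expression \(\sigma/\sqrt{n}\) follows from the iid assumption. As a brief reminder, using independence to split the variance of the sum:

\[\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}, \qquad \sigma_{\bar{X}} = \sqrt{\mathrm{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}.\]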

Example 💡: American Adult Male Weights

Historical data from 1960 indicated that the weight of American adult males was normally distributed with a mean of \(\mu = 166.3\) lbs and a standard deviation of \(\sigma = 49.26\) lbs.

In 2000, researchers collected a new sample of \(n = 3{,}791\) adult males and found a sample mean of \(\bar{x} = 191\) lbs. Assuming the standard deviation hasn’t changed, compute a 99% confidence interval for the mean weight of American adult males in 2000.

The building blocks

Sample mean: \(\bar{x} = 191\) lbs
Population standard deviation: \(\sigma = 49.26\) lbs
Sample size: \(n = 3{,}791\)
Confidence level: \(99\% \, (C = 0.99, \alpha = 0.01)\)

Find the critical value

The critical value \(z_{\alpha/2} = z_{0.005}\) is the 99.5th percentile of the standard normal distribution. You can compute it with the R command:

qnorm(0.005, lower.tail=FALSE)

We find that \(z_{0.005} = 2.5758\).

Find the margin of error

\[\text{ME} = z_{0.005} \frac{\sigma}{\sqrt{n}} = 2.5758 \frac{49.26}{\sqrt{3791}} \approx 2.06 \text{ lbs}\]

Construct the confidence interval

\[\bar{x} \pm \text{ME} = 191 \pm 2.06 = (188.94 \text{ lbs}, 193.06 \text{ lbs})\]
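The whole computation can be reproduced in R as a check. The sketch below uses only the numbers given in the example. The variable names are arbitrary.

```r
# Values given in the example.
xbar <- 191
sigma <- 49.26
n <- 3791
conf_level <- 0.99

alpha <- 1 - conf_level
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)  # z_{0.005}, about 2.5758
me <- z_crit * sigma / sqrt(n)                  # margin of error, about 2.06
ci <- c(lower = xbar - me, upper = xbar + me)

round(z_crit, 4)
round(me, 2)
round(ci, 2)
```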

9.2.3. Interpreting Confidence Intervals Correctly

Suppose a 95% confidence interval of \((121.4, 126.2)\) is computed for the true mean crop yield of corn in a certain state, in bushels per acre. It is then incorrect to interpret the numbers as follows:

❌ “With 0.95 probability, the true mean yield of corn is between 121.4 and 126.2 bushels per acre.” ❌

The statement is incorrect for two main reasons, which we now analyze one by one.

Common Pitfall 1: Probabilities of a Confidence Interval

The confidence interval was constructed so that its probability of including the true mean is 0.95. However, this property is true for the CI as an estimator, which is a random variable, NOT for a set of realized values. Once the upper and lower values are determined as \((121.4, 126.2)\), they do not have any probabilistic relationship with another fixed quantity, \(\mu\). Therefore, the expression “With 0.95 probability…” is inaccurate.

Common Pitfall 2: Interval Is Random, \(\mu\) Is Not

Since a realized confidence interval is observed while the population mean \(\mu\) is not, it is easy to speak as if \(\mu\) varied from sample to sample, rather than the interval itself. The incorrect statement above makes \(\mu\) the subject of the probability statement, as if the true mean yield randomly landed inside or outside two fixed numbers, which is not accurate.

We should always remember that \(\mu\) is an unknown yet unchanging property of the population distribution, whereas the confidence interval changes from sample to sample. Our interpretations should reflect this distinction as accurately as possible.

Fig. 9.3: \(\mu\) stays fixed while the confidence interval changes over different samples. (Image: multiple confidence intervals, showing which ones capture the true mean.)

How To Say It Better

To include the confidence level as part of the interpretation, we use the term “confidence.” We say

✅ With 95% confidence, the interval \((121.4, 126.2)\) captures the true mean yield of corn. ✅

Confidence here is not used in its everyday sense, but as a technical term. It means that \((121.4, 126.2)\) is one realization of a random variable which, across many samples, successfully captures the true mean with probability 0.95.

It is also possible to interpret \(C\) as a probability, provided that the discussion is focused on the confidence interval as a random variable. It is correct to say

✅ Over many computations of the CI using different samples, the probability that it captures \(\mu\) is approximately 0.95. ✅

Here, 0.95 indicates the proportion of “successful” CIs out of the complete collection of a very large number of computed CIs.
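This long-run proportion can be approximated by simulation. The sketch below assumes a hypothetical normal population with \(\mu = 124\) and \(\sigma = 6\), samples of size 30, and 10,000 repetitions. All of these are illustrative choices, not values from the text.

```r
set.seed(42)

# Hypothetical population and design (illustrative values).
mu <- 124
sigma <- 6
n <- 30
conf_level <- 0.95
reps <- 10000

# The margin of error is the same for every sample because sigma is known.
z_crit <- qnorm((1 - conf_level) / 2, lower.tail = FALSE)
me <- z_crit * sigma / sqrt(n)

# One sample mean, and hence one realized CI, per repetition.
xbar <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))

# Does each realized interval capture the fixed mu?
captured <- (xbar - me < mu) & (mu < xbar + me)

# Proportion of "successful" intervals; it should be close to 0.95.
coverage <- mean(captured)
coverage
```

Note that `mu` never changes inside the loop. Only the interval endpoints vary from one repetition to the next.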

Confidence Intervals Simulation 🎮

Explore how confidence intervals behave by generating images like Fig. 9.3 under different population distributions, sample sizes, and confidence levels.
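A picture of this kind can also be drawn locally with base R graphics. The sketch below is not the course's own demo code. It assumes a hypothetical normal population (\(\mu = 124\), \(\sigma = 6\)), 50 samples of size 30, and arbitrary colors.

```r
set.seed(7)

# Hypothetical settings (illustrative values).
mu <- 124
sigma <- 6
n <- 30
conf_level <- 0.95
n_intervals <- 50

me <- qnorm((1 - conf_level) / 2, lower.tail = FALSE) * sigma / sqrt(n)
xbar <- replicate(n_intervals, mean(rnorm(n, mean = mu, sd = sigma)))
lower <- xbar - me
upper <- xbar + me
captured <- (lower < mu) & (mu < upper)
idx <- seq_len(n_intervals)

# Empty canvas, then one horizontal segment per realized interval.
plot(xbar, idx, type = "n", xlim = range(c(lower, upper)),
     xlab = "Value", ylab = "Sample index",
     main = "Simulated 95% confidence intervals")
segments(lower, idx, upper, idx,
         col = ifelse(captured, "steelblue", "firebrick"))
points(xbar, idx, pch = 16, cex = 0.5)  # interval centers
abline(v = mu, lty = 2)                 # the fixed true mean
```

Changing `n`, `conf_level`, or the sampling function (for example, `rexp` in place of `rnorm`, with `mu` and `sigma` adjusted to match) shows how the intervals respond.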


Templates for Valid Interpretations

You are encouraged to use the templates below for interpretations of confidence intervals. Simply replace the parenthesized placeholders with specific values/contexts from your experiment.

Templates for Interpreting a CI
Case	Template
When interpreting a concrete outcome of an experiment (an interval estimate)	With (\(C \cdot 100\))% confidence, the interval (lower value, upper value) captures the true/population mean of (context).
When interpreting a CI as an interval estimator	Over many repeated trials of (experiment context), the proportion of realized confidence intervals which capture the true mean of (context) will be about (\(C\)).

Example 💡: American Adult Male Weights, Continued

In the previous example, we computed a 99% confidence interval for the true mean weight of American adult males as \((188.94, 193.06)\). Give a valid interpretation of this outcome.

Interpretation: We are 99% confident that the interval between 188.94 and 193.06 pounds includes the true mean weight of American adult males in 2000.

9.2.4. Bringing It All Together

Key Takeaways 📝

An interval estimator provides a range of plausible values for a population parameter.
For a population mean with known standard deviation, its confidence interval is defined as \(\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\).
A confidence interval from a single experiment should be interpreted with care. Once it is realized as a set of numbers, its interpretation should not directly involve any probabilities. Use the term confidence to refer to the procedure that generated it.

Exercises

Compute and Interpret: A researcher measures the heights of 36 adult men and finds a sample mean of 175.4 cm. Assuming the population standard deviation is known to be 7.2 cm, construct a 95% confidence interval for the mean height. Interpret the interval.