9.5. Confidence Intervals and Bounds When σ is Unknown
So far, we developed confidence regions under the simplifying but unrealistic assumption that the population standard deviation \(\sigma\) is known. In practice, we rarely know \(\sigma\) and must estimate it.
This creates a fundamental challenge. Using a sample standard deviation \(S\) in place of the unknown \(\sigma\) introduces additional uncertainty that must be accounted for. The standard normal distribution is no longer appropriate because it does not capture this extra layer of uncertainty.
The solution to this problem comes from a distribution developed by William Sealy Gosset in the early 1900s: the Student’s t-distribution.
Road Map 🧭
Recognize that in most practical scenarios, \(\sigma\) is unknown and must be estimated by \(S\).
Understand that when \(S\) replaces \(\sigma\), the new pivotal quantity follows a t-distribution.
Derive confidence intervals and bounds based on the new t-distribution.
Understand the basic properties of t-distributions.
Learn what it means for a statistical procedure to be robust. Recognize the requirements for t-based procedures to be robust.
9.5.1. William Gosset and the Birth of Student’s t-Distribution
Fig. 9.8 William S. Gosset (1876-1937)
In 1908, William Sealy Gosset, a chemist and statistician employed by the Guinness brewery in Dublin, Ireland, published a paper titled “The Probable Error of a Mean” in the journal Biometrika. Due to Guinness company policy that prohibited employees from publishing their research, Gosset published under the pseudonym “Student”—leading to the now-famous Student’s t-distribution.
Gosset’s work at Guinness involved quality control for beer production. He needed statistical methods that worked reliably with small samples, as testing large quantities of beer would have been wasteful. Specifically, he faced the challenge of making inferences about a population mean when the population standard deviation was unknown and had to be estimated from the same limited sample.
His mathematical solution—the t-distribution—accounts for the added uncertainty of estimating \(\sigma\) with \(S\). This breakthrough has become one of the most widely used statistical tools across virtually all fields of scientific inquiry.
9.5.2. The t-Statistic and Its Distribution
To construct confidence regions, we have so far relied on the fact that the pivotal quantity \(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\) follows a standard normal distribution under certain assumptions. When the unknown \(\sigma\) is replaced by its estimator \(S\), however, the resulting statistic
\[T_{n-1} = \frac{\bar{X}-\mu}{S/\sqrt{n}}\]
no longer follows a standard normal distribution. Instead, it follows a t-distribution.
The t-distribution is a family of continuous distributions parameterized by \(\nu\) (Greek letter “nu”; also called the degrees of freedom or df). A t-statistic constructed using a sample of size \(n\) has \(\nu = n-1\). The subscript in \(T_{n-1}\) reflects this fact, although it is often omitted when the context makes it clear or when the detail is unnecessary.
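A small simulation can make the difference between the two distributions concrete. The R sketch below is an illustration only: the population values, the sample size, the seed, and the number of replications are arbitrary assumptions. It computes the studentized sample mean for many simulated normal samples and checks how often it falls beyond the standard normal cutoff versus the t cutoff.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 10; mu <- 50; sigma <- 8    # hypothetical population values
reps <- 10000

# Studentized sample mean for each simulated normal sample
t_stats <- replicate(reps, {
  x <- rnorm(n, mean = mu, sd = sigma)
  (mean(x) - mu) / (sd(x) / sqrt(n))
})

z_crit <- qnorm(0.975)           # standard normal cutoff
t_crit <- qt(0.975, df = n - 1)  # t cutoff with n - 1 degrees of freedom

beyond_z <- mean(abs(t_stats) > z_crit)  # would be near 0.05 if T were N(0, 1)
beyond_t <- mean(abs(t_stats) > t_crit)  # expected to be near 0.05
cat("Share beyond z cutoff:", beyond_z, "\n")
cat("Share beyond t cutoff:", beyond_t, "\n")
```

If the statistic were standard normal, both shares would be close to 0.05. With a small \(n\), the share beyond the normal cutoff should come out noticeably larger.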
Standardization, Studentization, and Pivotal Quantity
So far, we have called the transformation of a general random variable \(X\) into \(\frac{X-\mu_X}{\sigma_X}\) the standardization of \(X\). When the sample standard deviation \(S_X\) is used instead of \(\sigma_X\), giving
\[\frac{X-\mu_X}{S_X},\]
we call this the studentization of \(X\).
Both transformations are variants of pivotal quantities, which are functions of \(X\) constructed so that their distributions do not depend on the unknown parameters of \(X\).
Properties of t-Distributions
Fig. 9.9 t-densities with various degrees of freedom; the curve corresponding to \(+\infty\) is the standard normal PDF.
A t-distribution is symmetric around zero, similar to the standard normal distribution.
It has heavier tails than the standard normal distribution, reflecting the additional uncertainty from estimating \(\sigma\). This means that a t-distribution is always more spread out than the standard normal distribution for any finite degrees of freedom.
The smaller the sample size, the heavier the tails. The distribution approaches the standard normal distribution as the degrees of freedom increase.
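These tail properties can be checked numerically. The R sketch below is an illustration (the cutoff of 2 and the list of degrees of freedom are arbitrary choices) that uses `pt()` and `pnorm()` to compare upper-tail areas.

```r
# Upper-tail area beyond an arbitrary cutoff for several t-distributions
cutoff <- 2
dfs <- c(2, 5, 10, 30, 100)

t_tail <- pt(cutoff, df = dfs, lower.tail = FALSE)  # P(T > 2) for each df
z_tail <- pnorm(cutoff, lower.tail = FALSE)         # P(Z > 2)

print(data.frame(df = dfs, t_tail = round(t_tail, 4)))
cat("Standard normal tail:", round(z_tail, 4), "\n")
```

The t tail areas should shrink toward the normal value as the degrees of freedom grow, while staying above it.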
The PDF of a t-distribution
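The probability density function of a t-distribution with \(\nu\) degrees of freedom is given by:
\[f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad -\infty < t < \infty,\]
where \(\Gamma\), the gamma function, is a generalization of the factorial function. Just as with normal distributions, we rely on tables or software to compute probabilities and percentiles involving t-distributions.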
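As a rough sketch of the software route, the R code below writes the standard closed-form t density with `gamma()` and compares it with the built-in `dt()`. The helper name `t_pdf`, the grid, and the choice of 5 degrees of freedom are assumptions made for illustration.

```r
# Closed-form t density written with the gamma function (hypothetical helper)
t_pdf <- function(t, nu) {
  gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2)) *
    (1 + t^2 / nu)^(-(nu + 1) / 2)
}

t_grid <- seq(-4, 4, by = 0.5)
max_gap <- max(abs(t_pdf(t_grid, nu = 5) - dt(t_grid, df = 5)))
print(max_gap)                 # should be essentially zero

# Probabilities and percentiles come from pt() and qt()
p_below <- pt(1.5, df = 5)     # P(T <= 1.5)
pct_90  <- qt(0.90, df = 5)    # 90th percentile
print(c(p_below = p_below, pct_90 = pct_90))
```

In practice only `dt()`, `pt()`, and `qt()` are needed; the hand-written density is there to show that they agree with the formula.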
9.5.3. Deriving t-Based Confidence Regions
Preliminaries and Assumptions
The derivation of t-based confidence intervals requires a similar set of assumptions as before. The only difference is that \(\sigma\) is now unknown.
The data \(X_1, X_2, \cdots, X_n\) must be an iid sample from a population with mean \(\mu\) and variance \(\sigma^2.\)
Either the population is normally distributed, or we have sufficiently large \(n\) for the CLT to hold.
Both \(\mu\) and \(\sigma\) are unknown.
We also need to define the t-critical values. A t-critical value, denoted \(t_{\alpha/2, \nu}\), is the point on the t-distribution with \(\nu\) degrees of freedom such that its upper-tail area equals \(\alpha/2\). The notation includes an additional subscript for the degrees of freedom, since its location also depends on the specific t-distribution on which it is defined.
Derivation of the Confidence Interval
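Similar to the case with known \(\sigma\), we derive a confidence interval for \(\mu\) using the pivotal method. For the degrees of freedom \(n-1\) and \(C = 1-\alpha\), the following statement is true by the definition of \(t_{\alpha/2, n-1}\):
\[P\left(-t_{\alpha/2, n-1} < T_{n-1} < t_{\alpha/2, n-1}\right) = C\]
Replace \(T_{n-1}\) with the new pivotal quantity:
\[P\left(-t_{\alpha/2, n-1} < \frac{\bar{X}-\mu}{S/\sqrt{n}} < t_{\alpha/2, n-1}\right) = C\]
Through algebraic pivoting, we isolate \(\mu\) to obtain:
\[P\left(\bar{X} - t_{\alpha/2, n-1}\frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2, n-1}\frac{S}{\sqrt{n}}\right) = C\]
Therefore, the \(C\cdot100\%\) confidence interval is:
\[\bar{X} \pm t_{\alpha/2, n-1}\frac{S}{\sqrt{n}}\]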
Summary of t-Based Confidence Intervals and Bounds
We leave it to the reader to work out the details of deriving the upper and lower confidence bounds under a t-distribution. The results follow the same pattern as their \(z\) equivalents; the margin of error will be computed with a smaller critical value \(t_{\alpha, n-1}\) instead of \(t_{\alpha/2, n-1}\).
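For readers who want to check their work, one way the lower-bound derivation might go is sketched here, writing \(C = 1 - \alpha\) and placing the whole area \(\alpha\) in the upper tail:

\[\begin{aligned}
P\left(T_{n-1} < t_{\alpha, n-1}\right) &= C \\
P\left(\frac{\bar{X}-\mu}{S/\sqrt{n}} < t_{\alpha, n-1}\right) &= C \\
P\left(\mu > \bar{X} - t_{\alpha, n-1}\frac{S}{\sqrt{n}}\right) &= C
\end{aligned}\]

The upper bound follows in the same way from \(P\left(T_{n-1} > -t_{\alpha, n-1}\right) = C\), which holds by the symmetry of the t-distribution and rearranges to \(\mu < \bar{X} + t_{\alpha, n-1}\frac{S}{\sqrt{n}}\).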
In summary, when we have \(\bar{x}\) and \(s\) from an observed sample, we use the following formulas to compute confidence regions.
| Confidence Regions When \(\sigma\) Is Unknown | |
|---|---|
| Confidence Interval | \(\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}}\) |
| Lower Confidence Bound | \(\bar{x} - t_{\alpha, n-1} \frac{s}{\sqrt{n}}\) |
| Upper Confidence Bound | \(\bar{x} + t_{\alpha, n-1} \frac{s}{\sqrt{n}}\) |
Example 💡: Cholesterol Reduction Study
A pharmaceutical company is testing a new drug designed to lower LDL cholesterol levels. In a clinical trial, 15 patients with high cholesterol received the drug for eight weeks, and the reduction in their LDL cholesterol (in mg/dL) was measured.
The sample mean reduction was \(\bar{x} = 23.4\) mg/dL with a sample standard deviation of \(s = 6.8\) mg/dL. Construct a 95% confidence interval for the true mean reduction \(\mu\).
Step 1: Identify the key information
Sample size: \(n = 15\)
Sample mean: \(\bar{x} = 23.4\) mg/dL
Sample standard deviation: \(s = 6.8\) mg/dL
Confidence level: \(95\%\) (\(\alpha = 0.05\))
Degrees of freedom: \(\nu = n - 1 = 14\)
Step 2: Find the critical value
qt(0.025, df = 14, lower.tail=FALSE) # Returns 2.145
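Step 3: Calculate the margin of error
\[ME = t_{0.025, 14} \cdot \frac{s}{\sqrt{n}} = 2.145 \cdot \frac{6.8}{\sqrt{15}} \approx 3.77\]
Step 4: Construct the confidence interval
\[\bar{x} \pm ME = 23.4 \pm 3.77 = (19.63, 27.17)\]
Interpretation: We are 95% confident that the true mean reduction in LDL cholesterol with this drug is captured by the region between 19.63 and 27.17 mg/dL.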
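The whole calculation can also be carried out in R from the summary statistics. The sketch below follows Steps 2 to 4 and, as an extra illustration, computes the one-sided 95% bounds. The variable names are arbitrary.

```r
# Summary statistics from the cholesterol example
n <- 15; xbar <- 23.4; s <- 6.8
alpha <- 0.05

# Two-sided interval: critical value t_{alpha/2, n-1}
t_crit <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)
me <- t_crit * s / sqrt(n)                     # margin of error
ci <- c(lower = xbar - me, upper = xbar + me)
print(round(ci, 2))

# One-sided 95% bounds: critical value t_{alpha, n-1}
t_one <- qt(alpha, df = n - 1, lower.tail = FALSE)
lcb <- xbar - t_one * s / sqrt(n)              # lower confidence bound
ucb <- xbar + t_one * s / sqrt(n)              # upper confidence bound
print(round(c(lcb = lcb, ucb = ucb), 2))
```

When the raw observations are available rather than only \(\bar{x}\) and \(s\), base R's `t.test()` reports the same kind of interval through its `conf.level` argument.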
9.5.4. The Effect of Sample Size on t-Confidence Regions
As with \(z\)-confidence regions, a large \(n\) makes \(t\)-confidence regions more precise in general. However, the ways in which \(n\) influences this phenomenon are more multifaceted for t-based methods:
A larger \(n\) reduces the true standard error, \(\sigma/\sqrt{n}\).
Although the true standard error is unknown, its estimator \(S/\sqrt{n}\) targets it more accurately with larger \(n\).
The critical value itself decreases as \(n\) increases, which further narrows the confidence region.
To see how the third point holds, see Fig. 9.10 below:
Fig. 9.10 Two-sided t-critical values for \(\alpha = 0.05\) with different degrees of freedom
In Fig. 9.10, the upper tails of two t-distributions are compared: one with \(df=n-1=99\) and the other with \(df=n-1=9\). Recall that higher degrees of freedom (and larger sample size) are associated with a lighter tail on a t-distribution. As \(n\) grows from 10 to 100, the difference in tail weight causes the critical value to move closer to the center (zero) in order to maintain an area of \(\alpha/2\) on its right.
In general, for the same confidence level and any two sample sizes \(n_1 < n_2\), it always holds that
\[t_{\alpha/2, n_1-1} > t_{\alpha/2, n_2-1}.\]
A smaller critical value leads to a smaller margin of error if \(s\) is held constant, which in turn results in a more precise (narrower) confidence region. This relationship does not hold strictly in practice since \(S\) fluctuates with data, but the overall tendency remains.
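The shrinking critical value is easy to tabulate. The R sketch below is an illustration with an arbitrary set of sample sizes; it lists the two-sided critical values for \(\alpha = 0.05\) next to the corresponding \(z\)-critical value.

```r
alpha <- 0.05
n <- c(5, 10, 30, 100, 1000)   # arbitrary sample sizes for illustration

t_crit <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)

print(data.frame(n = n, df = n - 1, t_crit = round(t_crit, 3)))
cat("z critical value:", round(z_crit, 3), "\n")
```

The t-critical values should decrease toward the \(z\) value without going below it.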
Comparison with \(z\)-Confidence Regions
The above result implies that the \(z\)-confidence regions (which can loosely be considered t-confidence regions with an “infinite” dataset for sample variance computation) are more precise on average than their t-based parallels.
When is a \(t\)-Confidence Region Appropriate?
In practice, the true variance \(\sigma^2\) is rarely known, leaving t-procedures as our only option. On the rare occasions when \(\sigma^2\) is known, it is always preferable to use this true information.
9.5.5. Sample Size Planning When σ Is Unknown
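Sample size planning is more challenging when \(\sigma\) is unknown. Suppose we want \(n\) such that
\[t_{\alpha/2, n-1}\frac{s}{\sqrt{n}} \leq ME_{max}\]
for some given maximum margin of error, \(ME_{max}\). By taking similar steps as in Chapter 9.3.2, we have
\[n \geq \left(\frac{t_{\alpha/2, n-1}\, s}{ME_{max}}\right)^2.\]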
We now see that the problem is circular. To determine \(n\), we need the \(t\)-critical value, which depends on \(n\). Additionally, we need a value for \(s\), which we don’t have before collecting data. To address this issue, we use an iterative approach involving the following steps:
Obtain a planning value \(s_*\). This can be done by
Using \(s\) from a pilot study or previous research,
Making an educated guess based on the expected range, or
Using a conservative upper bound when uncertainty is high.
Update \(n\) iteratively.
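Start with an initial guess using the \(z\)-critical value: \(n_0 = \left(\frac{z_{\alpha/2} s_*}{ME_{max}}\right)^2\), rounded up to the next integer.
Calculate the t-critical value using \(df = n_0 - 1\).
Recalculate \(n\) using the t-critical value: \(n_1 = \left(\frac{t_{\alpha/2, n_0-1}\, s_*}{ME_{max}}\right)^2\), again rounded up.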
Repeat until convergence.
This process typically converges quickly, often in just a few iterations.
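The planning steps above can be sketched in R as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `plan_n` and the planning values are hypothetical. It also uses a slight variation on the update step. Instead of recomputing \(n\) from the formula on every pass, it starts from the \(z\)-based guess and increases \(n\) by one until the margin-of-error condition holds. This avoids the case where recomputation with rounding alternates between two adjacent integers.

```r
# Hypothetical helper: smallest n whose t-based margin of error is <= me_max
plan_n <- function(s_star, me_max, conf = 0.95) {
  alpha <- 1 - conf
  z_crit <- qnorm(alpha / 2, lower.tail = FALSE)
  # Initial guess from the z-critical value (at least 2 so that df >= 1)
  n <- max(2, ceiling((z_crit * s_star / me_max)^2))
  repeat {
    t_crit <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)
    if (t_crit * s_star / sqrt(n) <= me_max) break  # condition met
    n <- n + 1                                      # otherwise try the next n
  }
  n
}

# Assumed planning values, for illustration only
n_plan <- plan_n(s_star = 6.8, me_max = 2)
print(n_plan)
```

Because the t-critical value is always larger than the \(z\)-critical value, the \(z\)-based guess can never overshoot the required \(n\), so the search only has to move upward.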
9.5.6. Robustness of the \(t\)-Procedures
A statistical procedure is considered robust if it performs reasonably well even when its assumptions are somewhat violated. The \(t\)-procedures show good robustness against moderate departures from normality, especially as sample size increases.
| Guidelines for Using t-Procedures When Normality May Not Hold | |
|---|---|
| \(n < 15\) | The population distribution should be approximately normal. Check the sample data with normal probability plots. |
| \(15 \leq n < 40\) | A \(t\)-procedure works well with some mild skewness. Avoid using with strongly skewed data or data containing outliers. |
| \(n \geq 40\) | A \(t\)-procedure is generally reliable even with moderately skewed distributions, thanks to the Central Limit Theorem. |
Regardless of sample size, the procedure is sensitive to outliers, which can strongly influence both \(\bar{x}\) and \(s\). Always inspect your data for outliers before applying a \(t\)-procedure.
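Robustness can be explored by simulation. The R sketch below is an illustration under assumed settings: an Exponential(1) population stands in for a strongly right-skewed distribution, and the sample sizes, seed, and number of replications are arbitrary. It estimates how often a nominal 95% t-interval captures the true mean.

```r
set.seed(2)                 # arbitrary seed, for reproducibility only
reps <- 5000
alpha <- 0.05
true_mean <- 1              # mean of an Exponential(rate = 1) population

# Estimated coverage of the t-interval for a given sample size
coverage_for <- function(n) {
  t_crit <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)
  hits <- replicate(reps, {
    x <- rexp(n, rate = 1)                        # strongly right-skewed data
    abs(mean(x) - true_mean) <= t_crit * sd(x) / sqrt(n)
  })
  mean(hits)
}

sizes <- c(10, 40, 200)
coverage <- sapply(sizes, coverage_for)
print(data.frame(n = sizes, coverage = round(coverage, 3)))
```

For a population this skewed, the estimated coverage at small \(n\) should fall short of 95% and move closer to the nominal level as \(n\) grows.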
9.5.7. Bringing It All Together
Key Takeaways 📝
The Student’s t-distribution provides the appropriate framework for quantifying uncertainty about a population mean when the population standard deviation is unknown.
The pivotal quantity \(T_{n-1}= \frac{\bar{X}-\mu}{S/\sqrt{n}}\) now follows a t-distribution with \(n-1\) degrees of freedom.
The resulting confidence regions account for the additional uncertainty in estimating \(\sigma\), and are wider than their \(z\) parallels on average.
The t-procedures are robust to moderate violations of the normality assumption. The robustness grows with the sample size.
Exercises
A quality control engineer wants to estimate the mean tensile strength of steel cables. A sample of 25 cables yields a mean strength of 3450 N with a standard deviation of 120 N. Construct a 99% confidence interval for the mean strength.
A pilot study with 8 observations yielded a sample standard deviation of \(s = 15\). If a researcher wants to estimate the population mean with a margin of error of no more than 5 units at 95% confidence, how many observations should be planned for the full study?