9.4. Confidence Bounds for the Population Mean When σ is Known
Confidence intervals provide two boundaries that define a plausible range for the true parameter. When the practical interest lies primarily in one direction, however, one-sided confidence bounds offer a more context-appropriate and precise approach. Their logic and derivation follow the same principles as those of two-sided intervals, but they differ in how the probability of error is allocated.
Road Map 🧭
Recognize the situations in which using a one-sided confidence bound may be more precise and cost-efficient than a confidence interval.
Derive a confidence bound using principles parallel to those used in constructing confidence intervals, and apply the result to practice.
9.4.1. Why One‑Sided Bounds?
Two-sided confidence intervals aim to capture the unknown population mean \(\mu\) on both sides of the point estimator \(\bar{X}\). However, many scientific questions call for a directional guarantee. For example:
In certain safety assessments, the mean toxin level must not exceed a legal limit.
The average tensile strength must be at least a promised specification to pass a quality control test.
A component’s mean time to failure must be greater than a minimum benchmark.
A drug’s efficacy must be above a specified threshold.
In these situations, a one-sided confidence bound is not only more directly aligned with the research question but also provides a more precise bound than the interval at the same confidence level.
9.4.2. Deriving the Upper Confidence Bound
The Goal
For an upper confidence bound, we are only concerned with the guarantee that the population mean \(\mu\) is below a certain threshold. That is, for a pre-specified confidence coefficient \(C\) between 0 and 1, we would like to find an upper bound of the form \(\bar{X} + ME\) such that
\[P\left(\mu < \bar{X} + ME\right) = C\]
Preliminaries
The language used in deriving confidence bounds overlaps significantly with that of confidence intervals.
The probability \(C\) is called the confidence coefficient and its conversion to percentage is called the confidence level.
\(\alpha = 1 - C\).
A \(z\)-critical value \(z_\alpha\) marks the location on the standard normal pdf whose upper area is \(\alpha\).
Note that \(z_\alpha\) also satisfies \(P(Z < z_\alpha) = C\).
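The relationship between \(\alpha\), \(C\), and \(z_\alpha\) can be checked numerically. The sketch below uses base R's `qnorm` and `pnorm`; the value \(C = 0.95\) is an assumed illustration, not something fixed by the definitions above.

```r
# Illustrative confidence coefficient (an assumed value)
C <- 0.95
alpha <- 1 - C

# z_alpha: the point whose upper-tail area under the standard normal pdf is alpha
z_alpha <- qnorm(alpha, lower.tail = FALSE)

# Areas on either side of z_alpha
lower_area <- pnorm(z_alpha)                      # P(Z < z_alpha), should equal C
upper_area <- pnorm(z_alpha, lower.tail = FALSE)  # P(Z > z_alpha), should equal alpha

print(round(z_alpha, 3))
print(round(lower_area, 4))
print(round(upper_area, 4))
```

Because `qnorm` and `pnorm` are inverses of each other, the two printed areas recover \(C\) and \(\alpha\).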
The Assumptions
The assumptions required for the validity of the construction are identical to those of the two-sided case:
\(X_1, X_2, \ldots, X_n\) form an iid sample from the population, \(X\). The expected value and variance of \(X\) are denoted \(\mu\) and \(\sigma^2\), respectively.
Either the population is normally distributed, or we have sufficiently large \(n\) for the CLT to hold.
The population variance \(\sigma^2\) is known.
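It follows that the pivotal quantity has the standard normal distribution (exactly when the population is normal, approximately when relying on the CLT):
\[Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1)\]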
The Derivation
By the definition of \(z_\alpha\) and the symmetry of the standard normal distribution about 0, the area below \(-z_\alpha\) is also \(\alpha\). We therefore begin with:
\[P\left(Z > -z_{\alpha}\right) = C\]
Replace \(Z\) with the pivotal quantity:
\[P\left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} > -z_{\alpha}\right) = C\]
Multiply both sides of the inequality by \(\sigma/\sqrt{n}\):
\[P\left(\bar{X}-\mu > -z_{\alpha} \frac{\sigma}{\sqrt{n}}\right) = C\]
Subtract \(\bar{X}\) from both sides and multiply by \(-1\) (which reverses the inequality):
\[P\left(\mu < \bar{X} + z_{\alpha} \frac{\sigma}{\sqrt{n}}\right) = C\]
This gives us a probability statement with the same structure as our goal.
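The final probability statement can be checked empirically with a small simulation. The sketch below is only an illustration: the population values `mu`, `sigma`, the sample size `n`, the seed, and the number of replications are all assumed for the demonstration, and the population is taken to be normal.

```r
set.seed(350)  # arbitrary seed, used only for reproducibility

# Hypothetical population and design values chosen for illustration
mu <- 10
sigma <- 2
n <- 25
C <- 0.95
z_alpha <- qnorm(1 - C, lower.tail = FALSE)

# Draw many samples, compute each sample mean, and form the upper bound
n_reps <- 20000
x_bar <- replicate(n_reps, mean(rnorm(n, mean = mu, sd = sigma)))
upper_bound <- x_bar + z_alpha * sigma / sqrt(n)

# Proportion of simulated bounds that lie above the true mean
coverage <- mean(mu < upper_bound)
print(coverage)
```

If the derivation holds, the printed proportion should be close to \(C\), with small deviations due to simulation noise.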
Derive the lower confidence bound as an independent exercise 🤔
For lower confidence bounds, we are looking for a guarantee that the population mean \(\mu\) is above a threshold with a high probability \(C\). Since it bounds \(\mu\) from below, it will take the general form \(\bar{X}-ME\). Follow the same flow of logic under the same assumptions, making sure to reflect the change in direction.
Summary
The upper confidence bound (UCB) for an observed mean (lower case \(\bar{x}\)) is:
\[\bar{x} + z_{\alpha} \frac{\sigma}{\sqrt{n}}\]
The lower confidence bound (LCB) for an observed mean is:
\[\bar{x} - z_{\alpha} \frac{\sigma}{\sqrt{n}}\]
An Interval Representation of Confidence Bounds
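Confidence bounds are sometimes represented as intervals with one endpoint extending to infinity. The respective interval representations of upper and lower confidence bounds are:
\[\left(-\infty,\ \bar{x} + z_{\alpha} \frac{\sigma}{\sqrt{n}}\right) \quad \text{and} \quad \left(\bar{x} - z_{\alpha} \frac{\sigma}{\sqrt{n}},\ \infty\right)\]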
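As a minimal sketch, the upper bound \(\bar{x} + z_\alpha \sigma/\sqrt{n}\) and the lower bound \(\bar{x} - z_\alpha \sigma/\sqrt{n}\) can be wrapped in two small R functions. The function names `z_upper_bound` and `z_lower_bound` are hypothetical helpers written for this illustration, not part of any package, and the inputs are made-up numbers.

```r
# Hypothetical helper: upper confidence bound when sigma is known
z_upper_bound <- function(x_bar, sigma, n, conf_level = 0.95) {
  z_alpha <- qnorm(1 - conf_level, lower.tail = FALSE)
  x_bar + z_alpha * sigma / sqrt(n)
}

# Hypothetical helper: lower confidence bound when sigma is known
z_lower_bound <- function(x_bar, sigma, n, conf_level = 0.95) {
  z_alpha <- qnorm(1 - conf_level, lower.tail = FALSE)
  x_bar - z_alpha * sigma / sqrt(n)
}

# Made-up inputs for demonstration
ucb <- z_upper_bound(x_bar = 50, sigma = 10, n = 100)
lcb <- z_lower_bound(x_bar = 50, sigma = 10, n = 100)

# Interval representations: (-Inf, UCB) and (LCB, Inf)
cat("Upper bound as an interval: (-Inf,", round(ucb, 3), ")\n")
cat("Lower bound as an interval: (", round(lcb, 3), ", Inf)\n")
```

Both helpers share the same margin of error; only the sign in front of it changes with the direction of the bound.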
Example💡: Lead Content in Drinking Water
An environmental agency samples \(n = 40\) water taps in a neighborhood. Laboratory analysis reports a sample mean lead concentration of \(\bar{x} = 12.7\) ppb (parts per billion). Historical data suggest the population standard deviation is \(\sigma = 4.5\) ppb. The Environmental Protection Agency (EPA) action level for lead in drinking water is \(15\) ppb.
Using a 95% confidence level, construct the appropriate confidence region to answer the question: “Is the mean lead level in this neighborhood safely below the EPA action limit?”
Which confidence region is appropriate?
The goal is to establish an upper boundary which marks a safe zone for the true mean lead concentration level. We will construct a 95% upper confidence bound.
The building blocks
\(\sigma = 4.5\)
\(n=40\)
\(\alpha = 1-C = 0.05\)
Step 1: Compute the critical value
\[z_{\alpha} = z_{0.05} = 1.645\]
This result can be obtained by running qnorm(0.05, lower.tail=FALSE) in R.
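Step 2: Calculate the margin of error
\[ME = z_{\alpha} \frac{\sigma}{\sqrt{n}} = 1.645 \times \frac{4.5}{\sqrt{40}} \approx 1.17\]
Step 3: Determine the upper bound
\[\bar{x} + ME = 12.7 + 1.17 = 13.87\]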
Interpretation: We are 95% confident that the true mean lead level in the neighborhood is below 13.87 ppb, which is less than the EPA action limit of 15 ppb. This provides statistical evidence that the neighborhood’s water supply is in compliance with EPA standards.
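The three steps of this example can be reproduced in R. This is a sketch using only the values stated in the problem; the variable names, including `action_level`, are chosen here for readability.

```r
# Values given in the example
x_bar <- 12.7        # sample mean (ppb)
sigma <- 4.5         # known population standard deviation (ppb)
n <- 40              # number of taps sampled
conf_level <- 0.95
action_level <- 15   # EPA action level (ppb)

# Step 1: critical value with upper-tail area alpha
alpha <- 1 - conf_level
z_alpha <- qnorm(alpha, lower.tail = FALSE)

# Step 2: margin of error
me <- z_alpha * sigma / sqrt(n)

# Step 3: upper confidence bound
ucb <- x_bar + me

print(round(c(z_alpha = z_alpha, ME = me, UCB = ucb), 2))

# Is the bound below the action level?
print(ucb < action_level)
```

Carrying the unrounded critical value through the calculation gives the same bound, to two decimal places, as the hand computation.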
9.4.3. Comparison of Confidence Intervals and Bounds
It is important to note that a confidence bound does not coincide with either endpoint of the confidence interval at the same confidence level. The table below summarizes the key differences:
| | Confidence Interval | Confidence Bound |
|---|---|---|
| Critical value | \(z_{\alpha/2}\) | \(z_\alpha\) |
| Margin of error | \(z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\) | \(z_{\alpha}\frac{\sigma}{\sqrt{n}}\) |
| Size of ME | Always greater than CB | Always smaller than CI |
The size difference in the MEs arises naturally from the definitions of their critical values. \(z_\alpha\) must leave out an area twice as large as \(z_{\alpha/2}\) in the upper region of the standard normal pdf. Therefore, it must locate itself closer to 0, resulting in a smaller value than \(z_{\alpha/2}\) (Fig. 9.7).
Fig. 9.7 Comparison of critical values for \(\alpha\) and \(\alpha/2\)
The table below compares the computed values of \(z_\alpha\) and \(z_{\alpha/2}\) for several commonly used \(C= 1-\alpha\):
| Confidence Level | One-sided \(z_{\alpha}\) | Two-sided \(z_{\alpha/2}\) |
|---|---|---|
| 90% | 1.282 | 1.645 |
| 95% | 1.645 | 1.960 |
| 99% | 2.326 | 2.576 |
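The critical values in this comparison can be regenerated with `qnorm`. The sketch below builds them as a data frame; the column names are chosen here for illustration.

```r
# Commonly used confidence levels
conf_level <- c(0.90, 0.95, 0.99)
alpha <- 1 - conf_level

# One-sided bounds put all of alpha in one tail;
# two-sided intervals put alpha / 2 in each tail
critical_values <- data.frame(
  conf_level = conf_level,
  one_sided = round(qnorm(alpha, lower.tail = FALSE), 3),
  two_sided = round(qnorm(alpha / 2, lower.tail = FALSE), 3)
)

print(critical_values)

# The one-sided critical value is smaller at every level
print(all(critical_values$one_sided < critical_values$two_sided))
```

Since \(\sigma/\sqrt{n}\) is common to both margins of error, the ordering of the critical values carries over directly to the margins of error.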
This means that under identical experimental conditions (\(C, \sigma\), and \(n\)), a confidence bound will always provide a more precise result than a confidence interval on the side of interest. By placing the entire error probability \(\alpha\) in a single tail rather than splitting it between two tails, a confidence bound achieves greater efficiency whenever a one-sided question applies.
Choosing Between Bounds and Intervals
The choice between a one-sided bound and a two-sided interval should be made based on the scientific question, not on which approach gives more favorable results. You may use the guidelines below.
Match the research question: Use one-sided bounds when the research question naturally has a directional component.
Avoid post-hoc selection: Make the decision before collecting or analyzing any data. Choosing the most favorable confidence region after evaluating all possible candidates undermines the validity of the stated confidence level.
9.4.4. Bringing It All Together
Key Takeaways 📝
One-sided confidence bounds are appropriate when the research question involves a directional concern about a parameter.
An upper confidence bound (UCB) \(\bar{x} + z_{\alpha} \frac{\sigma}{\sqrt{n}}\) provides a value that the parameter likely falls below.
A lower confidence bound (LCB) \(\bar{x} - z_{\alpha} \frac{\sigma}{\sqrt{n}}\) provides a value that the parameter likely exceeds.
One-sided bounds use different critical values from two-sided intervals at the same confidence level.
The type of confidence region must be chosen to align with the research question, before collecting and analyzing the data.
Exercises
A battery manufacturer wants to ensure that the mean battery life exceeds \(40\) hours. A sample of \(25\) batteries has a mean life of \(42.3\) hours. Assuming \(\sigma = 5\) hours is known from extensive testing, construct an appropriate 95% confidence region for the mean battery life and formally interpret the result. Does this provide evidence that the mean exceeds \(40\) hours?
Environmental regulations require that the mean concentration of a pollutant in factory discharge not exceed \(3.5\) ppm. A random sample of \(36\) discharge measurements yields \(\bar{x} = 3.2\) ppm. If \(\sigma = 0.8\) ppm, construct an appropriate 99% confidence region and formally interpret the result. Does the factory appear to be in compliance?