11.2. Independent Two-Sample Analysis When Population Variances Are Known
We now develop the mathematical foundation for comparing the means of two independent populations, assuming that the population variances are known. This simplifying assumption will be lifted in later lessons, when we discuss more realistic scenarios.
Road Map 🧭
List the three key assumptions required for the construction of independent two-sample inference methods.
Construct hypothesis tests and confidence regions for the difference in the means of two independent populations. Identify the common underlying principles with one-sample inference.
11.2.1. The Assumptions
The validity of independent two-sample procedures rests on three fundamental assumptions that extend the single-sample framework to comparisons. These assumptions must be carefully verified before applying the methods.
Assumption 1: SRS from Each Population
The random variables \(X_{A1}, X_{A2}, \ldots, X_{An_A}\) form an independent and identically distributed (iid) sample from Population A. Similarly, the random variables \(X_{B1}, X_{B2}, \ldots, X_{Bn_B}\) constitute an iid sample from Population B.
Assumption 2: Independence Between Populations
The observations from one population are independent of those from the other population. Formally, \(X_{Ai}\) is independent of \(X_{Bj}\) for all possible pairs of indices \(i \in \{1, 2, \ldots, n_A\}\) and \(j \in \{1, 2, \ldots, n_B\}\).
Assumption 3: Normality of Sampling Distributions
For each population, either the population distribution is normal, or the sample size is large enough for the CLT to hold.
11.2.2. The Parameter of Interest and Its Point Estimator
The Target Parameter
Recall that our primary interest lies not in the individual population means \(\mu_A\) and \(\mu_B\), but rather in their difference. Our parameter of interest is:

\[\theta = \mu_A - \mu_B\]
We conceptualize this difference as a single parameter that captures the essence of the comparison we wish to make. Its sign and magnitude indicate the direction and size of any systematic difference between the populations.
The Point Estimator
Since we know that \(\bar{X}_A\) and \(\bar{X}_B\) are unbiased estimators of their respective population means \(\mu_A\) and \(\mu_B\), the natural point estimator for \(\theta\) is:

\[\hat{\theta} = \bar{X}_A - \bar{X}_B\]
11.2.3. Theoretical Properties of the Point Estimator
As a result of Assumptions 1 and 3, the sampling distributions of the two sample means are:

\[\bar{X}_A \sim N\!\left(\mu_A, \frac{\sigma^2_A}{n_A}\right), \qquad \bar{X}_B \sim N\!\left(\mu_B, \frac{\sigma^2_B}{n_B}\right)\]
Furthermore, the two random variables are independent since their building blocks are independent according to Assumption 2. We establish the theoretical properties of the difference estimator \(\bar{X}_A - \bar{X}_B\) starting from this baseline.
Unbiasedness
The difference in sample means is an unbiased estimator of the difference in population means. To establish this formally:

\[E[\bar{X}_A - \bar{X}_B] = E[\bar{X}_A] - E[\bar{X}_B] = \mu_A - \mu_B\]

Therefore, the bias of the estimator is:

\[\text{Bias} = E[\bar{X}_A - \bar{X}_B] - (\mu_A - \mu_B) = 0\]
Variance of the Estimator
The variance of \(\bar{X}_A - \bar{X}_B\) depends critically on the independence assumption between populations. Recall that for two independent random variables, the variance of their difference equals the sum of their individual variances:

\[V(\bar{X}_A - \bar{X}_B) = V(\bar{X}_A) + V(\bar{X}_B) = \frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}\]
Standard Error
The standard deviation of the estimator, or the standard error, is obtained by taking the square root of the variance:

\[SE(\bar{X}_A - \bar{X}_B) = \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}\]
The Sampling Distribution of the Difference Estimator
We now know the expected value and variance of the difference estimator \(\bar{X}_A - \bar{X}_B\). Additionally, since \(\bar{X}_A\) and \(\bar{X}_B\) are each normally distributed, their difference is also normally distributed. Combining these results, we establish the full sampling distribution of the difference estimator as:

\[\bar{X}_A - \bar{X}_B \sim N\!\left(\mu_A - \mu_B,\; \frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}\right)\]
Equivalently, its standardization follows the standard normal distribution:

\[Z = \frac{(\bar{X}_A - \bar{X}_B) - (\mu_A - \mu_B)}{\sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}} \sim N(0, 1)\]
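These distributional results are easy to check numerically. The sketch below simulates many replications of \(\bar{X}_A - \bar{X}_B\) (using the sample sizes and parameter values from the scheduling example later in this lesson) and compares the empirical mean and variance against \(\mu_A - \mu_B\) and \(\sigma^2_A/n_A + \sigma^2_B/n_B\):

```python
import random
import statistics

# Parameters borrowed from the scheduling example for illustration
mu_A, sigma_A, n_A = 50.0, 10.0, 25
mu_B, sigma_B, n_B = 45.0, 12.0, 30

random.seed(1)
diffs = []
for _ in range(20_000):
    # Draw one independent sample from each population, record xbar_A - xbar_B
    xbar_A = statistics.fmean(random.gauss(mu_A, sigma_A) for _ in range(n_A))
    xbar_B = statistics.fmean(random.gauss(mu_B, sigma_B) for _ in range(n_B))
    diffs.append(xbar_A - xbar_B)

print(statistics.fmean(diffs))      # close to mu_A - mu_B = 5
print(statistics.pvariance(diffs))  # close to 10^2/25 + 12^2/30 = 8.8
```

The simulated mean and variance should land near 5 and 8.8, matching the theory above.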
Based on these key results, we will now build inference methods for the difference in population means.
11.2.4. Hypothesis Testing for the Difference in Means
The four-step hypothesis testing framework extends naturally to the two-sample setting, with modifications to accommodate the comparative nature of the question.
Step 1: Parameter Identification
We must clearly identify both population means using contextually meaningful labels. Rather than generic labels like A and B, we encourage the use of descriptive terms that reflect the populations being studied.
For example, suppose we are comparing the systolic blood pressure of two patient groups after assigning one group a placebo and the other a newly developed treatment. We can define the relevant populations and their true means in the following manner:
Let \(\mu_{\text{treatment}}\) denote the true mean systolic blood pressure of patients who are treated with the new procedure.
Let \(\mu_{\text{control}}\) denote the true mean systolic blood pressure of patients who are not treated with the new medical procedure.
The parameter identification should also specify the units of measurement and provide sufficient context for interpreting the parameter and the target population within the scope of the research question.
Step 2: Hypothesis Formulation
Follow the template:
Fig. 11.2 Template for independent two-sample hypotheses
You may also refer to the details provided in Chapter 11.1.3.
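For readers without the figure at hand, the template can be written out symbolically (using the \(\Delta_0\) notation that appears in the exercises; \(\Delta_0 = 0\) is the most common choice):

```latex
H_0:\; \mu_A - \mu_B = \Delta_0
\qquad \text{versus one of} \qquad
H_a:\; \mu_A - \mu_B > \Delta_0, \quad
H_a:\; \mu_A - \mu_B < \Delta_0, \quad
H_a:\; \mu_A - \mu_B \neq \Delta_0
```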
Step 3: Test Statistic and \(p\)-Value
Recall the \(z\)-test statistic used in one-sample hypothesis testing:

\[z_{TS} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}\]

By providing a standardized distance between the estimator and the null value, the \(z\)-test statistic measured how far the sample data were from the null assumption.

We define a new \(z\)-test statistic that serves the same purpose by replacing each one-sample component with its independent two-sample parallel:

\[z_{TS} = \frac{(\bar{x}_A - \bar{x}_B) - \Delta_0}{\sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}}\]

where \(\Delta_0\) denotes the null value of the difference \(\mu_A - \mu_B\).
As in the one-sample case, under the null hypothesis,

\[Z_{TS} \sim N(0, 1)\]
The \(p\)-value calculation therefore follows the same principles as in single-sample \(z\)-tests:
**Revisiting \(p\)-Value Computation**

| Test Type | \(p\)-Value |
|---|---|
| Upper-tailed | \(P(Z > z_{TS})\) |
| Lower-tailed | \(P(Z < z_{TS})\) |
| Two-tailed | \(2P(Z > \lvert z_{TS}\rvert) = 2P(Z < -\lvert z_{TS}\rvert)\) |
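These tail probabilities are straightforward to compute with the standard normal CDF. The helper below is a sketch (the function name and signature are ours, not from the text) that evaluates the two-sample \(z\) statistic and its \(p\)-value when the population variances are known:

```python
from statistics import NormalDist

def two_sample_z_pvalue(xbar_a, xbar_b, sigma_a, sigma_b, n_a, n_b,
                        delta0=0.0, tail="two"):
    """z statistic and p-value for H0: mu_A - mu_B = delta0, variances known."""
    se = (sigma_a**2 / n_a + sigma_b**2 / n_b) ** 0.5
    z = (xbar_a - xbar_b - delta0) / se
    phi = NormalDist().cdf                  # standard normal CDF
    if tail == "upper":                     # Ha: mu_A - mu_B > delta0
        p = 1 - phi(z)
    elif tail == "lower":                   # Ha: mu_A - mu_B < delta0
        p = phi(z)
    else:                                   # two-tailed: 2 P(Z > |z|)
        p = 2 * (1 - phi(abs(z)))
    return z, p

# Illustrative call; these inputs match the scheduling example below
z, p = two_sample_z_pvalue(50, 45, 10, 12, 25, 30)
print(round(z, 2), round(p, 3))
```

Note that the unrounded statistic is about 1.69; rounding the standard error to 2.97 first, as the worked example does, gives 1.68.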
Step 4: Decision and Conclusion
The decision rule remains unchanged from single-sample procedures:
If \(p\)-value \(\leq \alpha\), we reject \(H_0\).
If \(p\)-value \(> \alpha\), we fail to reject \(H_0\).
The conclusion template below is adapted to address the comparative nature of two-sample procedures:
“The data [does/does not] give [some/strong] support (p-value = [value]) to the claim that [statement of \(H_a\) in context about the difference in population means].”
The strength descriptors should reflect the magnitude of the \(p\)-value relative to conventional benchmarks and the significance level used in the study.
Example 💡: Shift Scheduling and Work Efficiency
A retail chain tests two different workforce scheduling systems to see which helps cashiers process more transactions per 8-hour shift. They run independent pilots on different stores:
System A: \(n_A = 25\), \(\bar{x}_A = 50\), \(\sigma_A = 10\) (known)
System B: \(n_B = 30\), \(\bar{x}_B = 45\), \(\sigma_B = 12\) (known)
Perform a hypothesis test to determine whether the true mean numbers of transactions are different at the \(\alpha = 0.05\) significance level.
Step 1: Define the parameters and target populations
Let \(\mu_A\) denote the true mean number of transactions processed by cashiers following System A. Likewise, let \(\mu_B\) be the true mean number of transactions completed by employees following System B.
Step 2: Write the hypotheses

\[H_0: \mu_A - \mu_B = 0 \quad \text{versus} \quad H_a: \mu_A - \mu_B \neq 0\]
Step 3: Compute the test statistic and p-value

\[z_{TS} = \frac{(\bar{x}_A - \bar{x}_B) - 0}{\sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}} = \frac{50 - 45}{\sqrt{\frac{10^2}{25} + \frac{12^2}{30}}} = \frac{5}{\sqrt{8.8}} \approx \frac{5}{2.97} \approx 1.68\]
This is a two-sided test, so the \(p\)-value is \(2P(Z > 1.68) \approx 0.093\).
Step 4: Decision and Conclusion
Since \(p\)-value \(= 0.093 > 0.05\), we fail to reject the null hypothesis at the 5% significance level. We do not have enough evidence to support the claim that the true mean number of transactions processed by cashiers differs between the two scheduling systems.
11.2.5. Confidence Regions for the Difference in Means
Let us begin by constructing a \(100C\%\) confidence interval. The goal is to find the margin of error (ME) such that

\[P\left((\bar{X}_A - \bar{X}_B) - ME \;\leq\; \mu_A - \mu_B \;\leq\; (\bar{X}_A - \bar{X}_B) + ME\right) = C\]
The Pivotal Method
We begin with the known truth:

\[P(-z_{\alpha/2} \leq Z \leq z_{\alpha/2}) = 1 - \alpha = C\]

Replace \(Z\) with the standardization of the difference estimator, or the pivotal quantity:

\[P\left(-z_{\alpha/2} \leq \frac{(\bar{X}_A - \bar{X}_B) - (\mu_A - \mu_B)}{\sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}} \leq z_{\alpha/2}\right) = C\]

Through algebraic manipulation to isolate \(\mu_A - \mu_B\) in the center of the inequality, we obtain

\[P\left((\bar{X}_A - \bar{X}_B) - ME \;\leq\; \mu_A - \mu_B \;\leq\; (\bar{X}_A - \bar{X}_B) + ME\right) = C\]
with \(ME = z_{\alpha/2} \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}\).
For a single experiment, therefore, the \(100C\%\) confidence interval is:

\[(\bar{x}_A - \bar{x}_B) \pm z_{\alpha/2} \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}\]
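The interval can be packaged as a small reusable function; this is a sketch under the lesson's known-variance assumption, with names of our choosing:

```python
from statistics import NormalDist

def two_sample_ci(xbar_a, xbar_b, sigma_a, sigma_b, n_a, n_b, conf=0.95):
    """100*conf% confidence interval for mu_A - mu_B with known variances."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)                # critical value z_{alpha/2}
    se = (sigma_a**2 / n_a + sigma_b**2 / n_b) ** 0.5      # standard error
    center = xbar_a - xbar_b                               # point estimate
    me = z * se                                            # margin of error
    return center - me, center + me
```

A one-sided bound follows the same recipe with \(z_{\alpha}\) in place of \(z_{\alpha/2}\), adding or subtracting the resulting margin as appropriate.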
Complete Summary of Confidence Intervals and Bounds
The one-sided confidence bounds can be derived similarly. We leave the details as an exercise; use the confidence interval derivation above and Chapter 9.4 as a reference.
**\(100\cdot(1-\alpha)\%\) Confidence Regions for the Difference in Means**

| Region | Formula |
|---|---|
| Confidence Interval | \((\bar{x}_A - \bar{x}_B) \pm z_{\alpha/2} \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}\) |
| Upper Confidence Bound | \((\bar{x}_A - \bar{x}_B) + z_{\alpha} \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}\) |
| Lower Confidence Bound | \((\bar{x}_A - \bar{x}_B) - z_{\alpha} \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}\) |
Note the repeated core elements. In both one-sample and two-sample cases, a confidence region is centered at the point estimate and expands in the appropriate directions by ME, computed as a product of a critical value and the standard error.
Interpreting Confidence Regions for Difference in Means
The confidence regions provide a range of plausible values for the true difference in population means; each region captures the true difference \(\mu_A - \mu_B\) with \(100(1-\alpha)\%\) confidence. Their precision depends on the confidence level, the population variances, and the sample sizes.
Example 💡: Shift Scheduling and Work Efficiency, Continued
For the experiment on two workforce scheduling systems with:
System A: \(n_A = 25\), \(\bar{x}_A = 50\), \(\sigma_A = 10\) (known)
System B: \(n_B = 30\), \(\bar{x}_B = 45\), \(\sigma_B = 12\) (known)
Compute the \(95 \%\) confidence interval for the difference of the two population means. Check if the result is consistent with the hypothesis test performed in the previous example.
Identify the components
The observed sample difference:

\[\bar{x}_A - \bar{x}_B = 50 - 45 = 5\]

The standard error:

\[\sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}} = \sqrt{\frac{10^2}{25} + \frac{12^2}{30}} \approx 2.97\]

The \(z\)-critical value:

qnorm(0.025, lower.tail=FALSE) # returns 1.96
Put the parts together
The confidence interval is:

\[(\bar{x}_A - \bar{x}_B) \pm z_{\alpha/2} \cdot SE = 5 \pm 1.96 \times 2.97 = 5 \pm 5.82 = (-0.82, 10.82)\]
Is it consistent with the hypothesis test?
The interval contains zero, which is consistent with our failure to reject the null hypothesis of equal means.
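The same computation can be reproduced in Python's standard library, where `NormalDist().inv_cdf(0.975)` plays the role of the R `qnorm` call above (unrounded arithmetic gives roughly \((-0.81, 10.81)\), versus \((-0.82, 10.82)\) with the standard error rounded to 2.97):

```python
from statistics import NormalDist

z_crit = NormalDist().inv_cdf(0.975)          # ≈ 1.96, like qnorm(0.025, lower.tail=FALSE)
se = (10**2 / 25 + 12**2 / 30) ** 0.5         # ≈ 2.97
lo, hi = 5 - z_crit * se, 5 + z_crit * se
print(lo, hi)                                  # roughly (-0.81, 10.81)
assert lo < 0 < hi  # interval contains 0: consistent with failing to reject H0
```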
11.2.6. Bringing It All Together
Key Takeaways 📝
Two-sample independent procedures are designed to provide statistical answers to comparative questions. They require the key assumptions that (1) each sample is an SRS of the respective population, (2) the two samples are independent of each other, and (3) each population is normal or each sample size is large enough for the CLT to hold.
The sampling distribution of the point estimator \(\bar{X}_A - \bar{X}_B\) is normal with mean \(\mu_A - \mu_B\) and variance \(\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}\). The addition of variances follows from the independence assumption between groups.
The construction of hypothesis tests and confidence regions follows the same core principles as in the one-sample case.
Exercises
Assumption Analysis: For each of the following research scenarios, identify which assumptions might be violated and explain the potential consequences:
Comparing test scores between students in the same classroom, where some students work together.
Measuring reaction times before and after caffeine consumption using the same participants.
Comparing heights between adult males and females using a convenience sample from a shopping mall.
Hypothesis Formulation: A manufacturer claims their new battery lasts at least 2 hours longer than the competitor’s battery. Set up appropriate hypotheses for testing this claim, clearly defining your parameters and explaining your choice of \(\Delta_0\).
Standard Error Calculation: Two independent samples have \(n_A = 16\), \(\sigma_A = 8\), \(n_B = 25\), and \(\sigma_B = 10\). Calculate the standard error of \(\bar{X}_A - \bar{X}_B\) and explain what this value represents in practical terms.