11.1. Statistical Inference for Two Samples
Up to this point, our statistical inference toolkit has focused on single populations—estimating means, testing hypotheses, and quantifying uncertainty for one group at a time. However, many of the most important questions in research and decision-making involve comparisons, which require us to extend our methods to two-sample procedures.
Road Map 🧭
Apply the general logic of statistical inference learned in Chapters 9 and 10 to research questions about comparing two populations.
Recognize the characteristics of a research context that lead to an independent or a paired two-sample analysis.
11.1.1. The Comparative Mindset: From Description to Comparison
To answer comparative questions such as:
Does the new medical treatment produce better outcomes than the standard treatment?
Which of two manufacturing processes produces more consistent results?
Did a training program improve employee performance?
Do men and women differ in their response to a particular intervention?
we must focus on characterizing differences between two populations rather than describing the properties of each population in isolation. Before we dive into the relevant inference methods, let us first establish the notation.
Notation for Two-Sample Procedures
When working with two populations \(A\) and \(B\), we distinguish their components using subscripts. Numbers or letters other than \(A\) and \(B\) may be used, as long as their connection to the context is clearly defined.
| Population | A | B |
|---|---|---|
| Population Mean | \(\mu_A\) | \(\mu_B\) |
| Population Standard Deviation | \(\sigma_A\) | \(\sigma_B\) |
| Sample Size | \(n_A\) | \(n_B\) |
| Sample Mean | \(\bar{X}_A\) | \(\bar{X}_B\) |
| Sample Standard Deviation | \(S_A\) | \(S_B\) |
11.1.2. Two Fundamental Scenarios: Independent vs. Paired Samples
The inference method for a two-sample comparison depends on how the samples are associated. In this chapter, we focus on two specific types of association: independent and paired samples.
Independent Samples: Difference of Means
Independent samples require that their populations are independent and that their sampling procedures do not influence each other. As a result, the sample sizes \(n_A\) and \(n_B\) need not be equal.
Typical scenarios:
Comparing test scores between students taught with Method A vs. Method B
Measuring blood pressure in two groups of patients, each group receiving Drug A vs. Drug B
Analyzing repair costs for two different car models
Analysis Strategy:
Since there is no guarantee of shared structure between the two populations, we must first summarize each sample individually, then compare the summaries. Specifically, we will make inference on the difference of the population means, \(\mu_A-\mu_B\).
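The "summarize each sample, then compare" strategy can be sketched in a few lines of Python. This is only an illustration: the scores are made up, and the names `scores_a`, `scores_b`, and `summarize` are hypothetical. The code computes a point estimate of \(\mu_A-\mu_B\) and does not perform any inference.

```python
from statistics import mean, stdev

# Hypothetical test scores (made-up data). The sample sizes differ on purpose:
# independent samples do not need n_A == n_B.
scores_a = [78, 85, 92, 70, 88, 81]  # Method A
scores_b = [75, 80, 68, 90]          # Method B


def summarize(sample):
    """Return (n, sample mean, sample standard deviation) for one sample."""
    return len(sample), mean(sample), stdev(sample)


# Step 1: summarize each sample on its own.
n_a, xbar_a, s_a = summarize(scores_a)
n_b, xbar_b, s_b = summarize(scores_b)

# Step 2: compare the summaries. The difference of sample means
# is the point estimate of mu_A - mu_B.
diff_of_means = xbar_a - xbar_b

print(f"A: n={n_a}, mean={xbar_a:.2f}, sd={s_a:.2f}")
print(f"B: n={n_b}, mean={xbar_b:.2f}, sd={s_b:.2f}")
print(f"estimate of mu_A - mu_B: {diff_of_means:.2f}")
```

Nothing in the computation links an individual score in `scores_a` to one in `scores_b`; only the two summaries are compared.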
Paired Samples: Mean of Differences
Two samples are considered paired if there is a natural system linking subjects one-to-one across the samples. Consequently, the sample sizes are always equal \((n=n_A=n_B).\)
Typical scenarios:
Before and after measurements on the same patients
Twins receiving different treatments
Left vs. right measurements on the same subjects
Each pair consists of closely related individuals matched on similar background characteristics.
Analysis Strategy:
We usually expect considerable variation from one pair to the next due to extraneous characteristics. An important aspect of a paired two-sample analysis is therefore to minimize the influence of these characteristics on the result. This is achieved by viewing the two groups as working together to generate a single population of differences. Instead of analyzing the data points of Group A and Group B separately, we first compute the pair-wise differences \(D_1, D_2, \cdots, D_n\) and then analyze the mean of the differences, \(\mu_D\).
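The paired strategy can be sketched the same way. The before/after values below are made up, and the names `before`, `after`, and `diffs` are hypothetical. The subtraction order (before minus after) is an arbitrary choice made for this illustration.

```python
from statistics import mean, stdev

# Hypothetical before/after measurements (made-up data).
# Index i refers to the same subject in both lists.
before = [142, 155, 128, 160, 137]
after = [138, 150, 127, 152, 134]

# Paired samples must line up one-to-one.
if len(before) != len(after):
    raise ValueError("paired samples must have equal sizes")

# Pair-wise differences D_i = before_i - after_i.
diffs = [b - a for b, a in zip(before, after)]

d_bar = mean(diffs)  # point estimate of mu_D
s_d = stdev(diffs)   # sample standard deviation of the differences

print(f"differences: {diffs}")
print(f"mean of differences: {d_bar:.2f}, sd of differences: {s_d:.2f}")
print(f"sd within 'before': {stdev(before):.2f}, within 'after': {stdev(after):.2f}")
```

In this constructed data set the differences vary far less than the raw measurements do, because subject-to-subject variation cancels within each pair. Real data need not show such a clean reduction.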
Example 💡: Independent or Not?
For each case, state whether the samples should be considered paired or independent. Explain your reasoning.
Case 1
Sample 1: ears of corn from Farm A in Indiana
Sample 2: ears of corn from Farm B in Ohio
This is better suited for an independent two-sample analysis. In the given context, the numbers of ears sampled from Farm A and Farm B need not be the same, so the equal-sample-size condition of a paired analysis is not guaranteed. Moreover, there is no straightforward way to pair an individual ear from Farm A with one from Farm B.
Case 2
Sample 1: Weight of newborn babies in Jan 2023
Sample 2: Weight of the same group of newborn babies in Feb 2023
It is natural to pair each data point from Sample 1 with the data point generated by the same individual in Sample 2. A paired analysis should be used.
Case 3
Sample 1: Output from running Algorithm 1 using 10 different datasets
Sample 2: Output from running Algorithm 2 using the same 10 datasets
Each output in Sample 1 should be compared directly to the output in Sample 2 that used the same dataset, as performance can vary significantly with data quality. A paired two-sample analysis should be used.
Case 4
Sample 1: Registered cars in Northwest Lafayette BMV
Sample 2: Registered cars in Southeast Lafayette BMV
The number of registered cars in Northwest Lafayette BMV is not constrained to be equal to the number in Southeast Lafayette BMV. No clear pairing system exists. Independent two-sample analysis should be used.
11.1.3. Formulating Hypotheses for Independent Two-Sample Comparisons
In independent two-sample hypothesis testing, the parameter of interest is the difference between the two true means. With this in mind, hypothesis formulation follows a similar set of rules as the one-sample case.
Fig. 11.1 Template for independent two-sample hypotheses
We denote the null value for a difference with \(\Delta_0\), using the Greek letter “Delta”.
Listing all possible combinations of null and alternative hypotheses from the template yields three distinct test types:
Upper-Tailed Hypothesis Test
Lower-Tailed Hypothesis Test
Two-Tailed Hypothesis Test
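The three formulations for the independent case can be written out as follows. This is a sketch that assumes each null hypothesis carries the inequality complementary to its alternative; some texts write every null as the equality \(\mu_A - \mu_B = \Delta_0\), and the alternatives are the same under either convention.

\[
\begin{aligned}
\text{Upper-tailed:} \quad & H_0: \mu_A - \mu_B \le \Delta_0 \quad \text{vs.} \quad H_a: \mu_A - \mu_B > \Delta_0 \\
\text{Lower-tailed:} \quad & H_0: \mu_A - \mu_B \ge \Delta_0 \quad \text{vs.} \quad H_a: \mu_A - \mu_B < \Delta_0 \\
\text{Two-tailed:} \quad & H_0: \mu_A - \mu_B = \Delta_0 \quad \text{vs.} \quad H_a: \mu_A - \mu_B \ne \Delta_0
\end{aligned}
\]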
Special Case When the Null Value is Zero
In many cases, the test is on whether there is any difference between the two means, or whether one is larger or smaller than the other. These cases correspond to the special cases with \(\Delta_0 = 0\).
Upper-Tailed Test: “Is \(\mu_A\) greater than \(\mu_B\)?”
Lower-Tailed Test: “Is \(\mu_A\) less than \(\mu_B\)?”
Two-Tailed Test: “Is \(\mu_A\) different from \(\mu_B\)?”
Order Matters
The way we define the difference (\(\mu_A - \mu_B\) vs. \(\mu_B - \mu_A\)) determines the test type. To avoid confusion, clearly link the notation to the context and specify the order of subtraction.
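To see why the order matters, multiply both sides of an upper-tailed alternative by \(-1\), which reverses the inequality:

\[
\mu_A - \mu_B > \Delta_0 \quad \Longleftrightarrow \quad \mu_B - \mu_A < -\Delta_0 .
\]

The same claim is therefore upper-tailed with null value \(\Delta_0\) when the difference is defined as \(\mu_A - \mu_B\), and lower-tailed with null value \(-\Delta_0\) when it is defined as \(\mu_B - \mu_A\). A two-tailed test keeps its type, but its null value still changes sign.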
11.1.4. Formulating Hypotheses for Paired Two-Sample Comparisons
Denote the samples from Populations A and B as
\[
X_{A1}, X_{A2}, \cdots, X_{An} \quad \text{and} \quad X_{B1}, X_{B2}, \cdots, X_{Bn},
\]
respectively. Assume that the two members of each pair are assigned the same index. By taking the difference \(D_i = X_{Ai}-X_{Bi}\) for all \(i=1,2, \cdots, n,\) we obtain a single sample of \(n\) pair-wise differences. Hypothesis tests are then performed on the true mean of the differences, \(\mu_D,\) using one of the three possible formulations:
Upper-Tailed Hypothesis Test
Lower-Tailed Hypothesis Test
Two-Tailed Hypothesis Test
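Written out for the mean of the differences, the three formulations take the following form. As a sketch, this assumes each null hypothesis carries the inequality complementary to its alternative; under the alternative convention every null is the equality \(\mu_D = \Delta_0\).

\[
\begin{aligned}
\text{Upper-tailed:} \quad & H_0: \mu_D \le \Delta_0 \quad \text{vs.} \quad H_a: \mu_D > \Delta_0 \\
\text{Lower-tailed:} \quad & H_0: \mu_D \ge \Delta_0 \quad \text{vs.} \quad H_a: \mu_D < \Delta_0 \\
\text{Two-tailed:} \quad & H_0: \mu_D = \Delta_0 \quad \text{vs.} \quad H_a: \mu_D \ne \Delta_0
\end{aligned}
\]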
As in the independent analysis, the special cases with \(\Delta_0=0\) correspond to research questions about the direction of the difference (positive or negative) or about the existence of any difference at all.
Specifying the Order of Subtraction is Crucial for Paired Analysis ‼️
In paired two-sample analyses, clarifying the order of subtraction is especially important because the information is not apparent in the hypotheses. The related statement should be included in the first of the four steps of hypothesis testing, as part of parameter definition.
11.1.5. Bringing It All Together
Key Takeaways 📝
Two-sample procedures answer questions about differences between populations or treatments.
When the samples are independent, the two true means \(\mu_A\) and \(\mu_B\) are compared directly.
In paired samples, each observation in one sample is linked one-to-one with an observation in the other, so the two samples are not independent. For their comparative analysis, the two-sample data is transformed to a single sample of differences \(D_i = X_{Ai} - X_{Bi}.\) The central parameter is the mean of the differences, \(\mu_D\).
Exercises
Identifying Procedures: For each scenario below, determine whether an independent or paired two-sample procedure is appropriate. Explain your reasoning.
Comparing the effectiveness of two different headache medications by giving drug A to one group of patients and drug B to another group
Measuring reaction times before and after participants consume caffeine
Comparing test scores between students in two different schools
Evaluating a new teaching method by comparing pre-test and post-test scores for the same students
Comparing blood pressure medications by giving twins different drugs