11.1. Statistical Inference for Two Samples

Up to this point, our statistical inference toolkit has focused on single populations—estimating means, testing hypotheses, and quantifying uncertainty for one group at a time. However, many of the most important questions in research and decision-making involve comparisons, which require us to extend our methods to two-sample procedures.

Road Map 🧭

  • Apply the general logic of statistical inference learned in Chapters 9 and 10 to research questions about comparing two populations.

  • Recognize the characteristics of a research context that leads to an independent or paired two-sample analysis.

11.1.1. The Comparative Mindset: From Description to Comparison

To answer comparative questions such as:

  • Does the new medical treatment produce better outcomes than the standard treatment?

  • Which of two manufacturing processes produces more consistent results?

  • Did a training program improve employee performance?

  • Do men and women differ in their response to a particular intervention?

we must focus on characterizing differences between two populations rather than describing the properties of each population in isolation. Before we dive into the relevant inference methods, let us first establish the notation.

Notation for Two-Sample Procedures

When working with two populations \(A\) and \(B\), we distinguish their components using subscripts. Numbers or letters other than \(A\) and \(B\) may be used, as long as their connection to the context is clearly defined.

Population                    | A             | B
------------------------------+---------------+--------------
Population Mean               | \(\mu_A\)     | \(\mu_B\)
Population Standard Deviation | \(\sigma_A\)  | \(\sigma_B\)
Sample Size                   | \(n_A\)       | \(n_B\)
Sample Mean                   | \(\bar{X}_A\) | \(\bar{X}_B\)
Sample Standard Deviation     | \(S_A\)       | \(S_B\)

11.1.2. Two Fundamental Scenarios: Independent vs. Paired Samples

The inference method for a two-sample comparison depends on how the samples are associated. In this chapter, we focus on two specific types of association: independent and paired samples.

Independent Samples: Difference of Means

Independent samples require that the populations are independent and that the sampling procedures do not influence each other. Because no observation in one sample is tied to an observation in the other, the sample sizes \(n_A\) and \(n_B\) need not be equal.

Typical scenarios:

  • Comparing test scores between students taught with Method A vs. Method B

  • Measuring blood pressure in two groups of patients, each group receiving Drug A vs. Drug B

  • Analyzing repair costs for two different car models

Analysis Strategy:

Since there is no guarantee of shared structure between the two populations, we must first summarize each sample individually and then compare the summaries. Specifically, we will make inference on the difference of the population means, \(\mu_A-\mu_B\).
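The two-step strategy can be sketched in a few lines of Python. The test scores below are hypothetical values made up for illustration, and the variable names are our own. The point is only that each sample is summarized separately before the summaries are compared.

```python
from statistics import mean, stdev

# Hypothetical test scores (illustrative only); the groups are unrelated
# and deliberately have different sizes.
scores_a = [78, 85, 92, 70, 88, 81]   # Method A, n_A = 6
scores_b = [74, 80, 69, 77]           # Method B, n_B = 4

# Step 1: summarize each sample on its own.
n_a, n_b = len(scores_a), len(scores_b)
xbar_a, xbar_b = mean(scores_a), mean(scores_b)
s_a, s_b = stdev(scores_a), stdev(scores_b)

# Step 2: compare the summaries. This is the point estimate of mu_A - mu_B.
diff_of_means = xbar_a - xbar_b

print(f"Sample A: n = {n_a}, mean = {xbar_a:.2f}, sd = {s_a:.2f}")
print(f"Sample B: n = {n_b}, mean = {xbar_b:.2f}, sd = {s_b:.2f}")
print(f"Estimated difference of means: {diff_of_means:.2f}")
```

Nothing in the computation requires the two lists to have the same length, which mirrors the fact that \(n_A\) and \(n_B\) need not be equal.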

Paired Samples: Mean of Differences

Two samples are considered paired if there is a natural system linking subjects one-to-one across the samples. Consequently, the sample sizes are always equal \((n=n_A=n_B).\)

Typical scenarios:

  • Before and after measurements on the same patients

  • Twins receiving different treatments

  • Left vs. right measurements on the same subjects

Each pair consists of either the same subject measured twice or two closely related subjects matched on similar background characteristics.

Analysis Strategy:

We usually expect considerable variation between pairs due to extraneous characteristics. An important aspect of a paired two-sample analysis is therefore to minimize the influence of this variation on the result. This is achieved by viewing the two groups as working together to generate a single population of differences. Instead of analyzing the data points of Group A and Group B separately, we first compute the pair-wise differences \(D_1, D_2, \cdots, D_n\) and then analyze the mean of the differences, \(\mu_D\).
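As a minimal sketch of this strategy, the snippet below reduces two paired samples to one sample of differences. The before/after values are hypothetical and chosen so that subjects differ a lot from one another while each subject changes only a little.

```python
from statistics import mean, stdev

# Hypothetical before/after measurements on the same five patients
# (illustrative values only). Position i in both lists is the same patient.
before = [150, 132, 171, 118, 160]
after = [144, 129, 162, 115, 152]

# Pair-wise differences D_i = before_i - after_i
diffs = [b - a for b, a in zip(before, after)]

d_bar = mean(diffs)   # point estimate of mu_D
s_d = stdev(diffs)    # spread of the differences

print("Differences:", diffs)
print(f"Mean difference: {d_bar:.2f}")
print(f"SD of 'before' values: {stdev(before):.2f}")
print(f"SD of differences:     {s_d:.2f}")
```

In this made-up data the differences vary far less than the raw measurements, because the patient-to-patient variation cancels when we subtract within each pair.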

Example 💡: Independent or Not?

For each case, state whether the samples should be considered paired or independent. Explain your reasoning.

Case 1

  • Sample 1: ears of corn from Farm A in Indiana

  • Sample 2: ears of corn from Farm B in Ohio

This is better suited for an independent two-sample analysis. In the given context, the number of ears sampled from Farm A and from Farm B need not be the same, so the equal-sample-size condition for a paired analysis is not guaranteed. Moreover, there is no straightforward way to pair an individual ear of corn from Farm A with one from Farm B.

Case 2

  • Sample 1: Weight of newborn babies in Jan 2023

  • Sample 2: Weight of the same group of newborn babies in Feb 2023

It is natural to pair each data point in Sample 1 with the data point generated by the same individual in Sample 2. A paired analysis should be used.

Case 3

  • Sample 1: Output from running Algorithm 1 using 10 different datasets

  • Sample 2: Output from running Algorithm 2 using the same 10 datasets

Each output in Sample 1 should be compared directly to the output in Sample 2 that used the same dataset, because performance can vary substantially from one dataset to another. A paired two-sample analysis should be used.

Case 4

  • Sample 1: Registered cars in Northwest Lafayette BMV

  • Sample 2: Registered cars in Southeast Lafayette BMV

The number of registered cars in Northwest Lafayette BMV is not constrained to be equal to the number in Southeast Lafayette BMV. No clear pairing system exists. Independent two-sample analysis should be used.

11.1.3. Formulating Hypotheses for Independent Two-Sample Comparisons

In independent two-sample hypothesis testing, the parameter of interest is the difference between the two true means. With this in mind, hypothesis formulation follows a similar set of rules as the one-sample case.

Fig. 11.1 Template for independent two-sample hypotheses

We denote the null value for a difference by \(\Delta_0\), using the Greek letter “Delta”.

Listing all possible combinations of null and alternative hypotheses from the template yields three distinct test types:

Upper-Tailed Hypothesis Test

\[\begin{split}&H_0: \mu_A - \mu_B \leq \Delta_0\\ &H_a: \mu_A - \mu_B > \Delta_0\end{split}\]

Lower-Tailed Hypothesis Test

\[\begin{split}&H_0: \mu_A - \mu_B \geq \Delta_0\\ &H_a: \mu_A - \mu_B < \Delta_0\end{split}\]

Two-Tailed Hypothesis Test

\[\begin{split}&H_0: \mu_A - \mu_B = \Delta_0\\ &H_a: \mu_A - \mu_B \neq \Delta_0\end{split}\]

Special Case When the Null Value is Zero

In many cases, the question is whether there is any difference between the two means, or whether one is larger or smaller than the other. These questions correspond to the special case \(\Delta_0 = 0\).

Upper-Tailed Test: “Is \(\mu_A\) greater than \(\mu_B\)?”

\[\begin{split}&H_0: \mu_A - \mu_B \leq 0\\ &H_a: \mu_A - \mu_B > 0\end{split}\]

Lower-Tailed Test: “Is \(\mu_A\) less than \(\mu_B\)?”

\[\begin{split}&H_0: \mu_A - \mu_B \geq 0\\ &H_a: \mu_A - \mu_B < 0\end{split}\]

Two-Tailed Test: “Is \(\mu_A\) different from \(\mu_B\)?”

\[\begin{split}&H_0: \mu_A - \mu_B = 0\\ &H_a: \mu_A - \mu_B \neq 0\end{split}\]

Order Matters

The way we define the difference (\(\mu_A - \mu_B\) vs. \(\mu_B - \mu_A\)) determines the test type. To avoid confusion, clearly link the notation to the context and specify the order of subtraction.
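To see why, recall that multiplying both sides of an inequality by \(-1\) reverses its direction. An upper-tailed alternative written in the \(A - B\) order is therefore the same statement as a lower-tailed alternative written in the \(B - A\) order, with the sign of the null value flipped:

\[\mu_A - \mu_B > \Delta_0 \quad \Longleftrightarrow \quad \mu_B - \mu_A < -\Delta_0\]

When \(\Delta_0 = 0\), only the direction of the inequality changes.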

11.1.4. Formulating Hypotheses for Paired Two-Sample Comparisons

Denote the samples from Populations A and B as:

\[X_{A1}, X_{A2}, \cdots, X_{An} \quad \text{ and } \quad X_{B1}, X_{B2}, \cdots, X_{Bn},\]

respectively. Assume that the two observations in each pair are assigned the same index. By taking the difference \(D_i = X_{Ai}-X_{Bi}\) for all \(i=1,2, \cdots, n,\) we obtain a single sample of \(n\) pair-wise differences. Hypothesis tests are then performed on the true mean of the differences, \(\mu_D,\) using one of the three possible formulations:

Upper-Tailed Hypothesis Test

\[\begin{split}&H_0: \mu_D \leq \Delta_0\\ &H_a: \mu_D > \Delta_0\end{split}\]

Lower-Tailed Hypothesis Test

\[\begin{split}&H_0: \mu_D \geq \Delta_0\\ &H_a: \mu_D < \Delta_0\end{split}\]

Two-Tailed Hypothesis Test

\[\begin{split}&H_0: \mu_D = \Delta_0\\ &H_a: \mu_D \neq \Delta_0\end{split}\]

As in the independent analysis, the special cases with \(\Delta_0=0\) correspond to research questions about the direction of the difference (positive or negative) or about the existence of any difference at all.
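The hypothesis templates for both the independent and the paired case share the same shape: a parameter, a direction, and a null value. The helper below is a small illustrative sketch of that pattern. The function name `format_hypotheses` and its plain-text output format are our own choices, not part of any statistical library.

```python
def format_hypotheses(parameter: str, tail: str, delta_0: float = 0.0) -> tuple[str, str]:
    """Return (H0, Ha) as strings for an 'upper', 'lower', or 'two' tailed test."""
    # Each tail type fixes the pair of relations used in H0 and Ha.
    signs = {"upper": ("<=", ">"), "lower": (">=", "<"), "two": ("=", "!=")}
    if tail not in signs:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    null_sign, alt_sign = signs[tail]
    return (
        f"H0: {parameter} {null_sign} {delta_0:g}",
        f"Ha: {parameter} {alt_sign} {delta_0:g}",
    )


# Independent samples: the parameter is the difference of means.
print(*format_hypotheses("mu_A - mu_B", "upper", 5), sep="\n")

# Paired samples: the parameter is the mean of the differences.
print(*format_hypotheses("mu_D", "two"), sep="\n")
```

Only the parameter string changes between the two settings, which is why the order of subtraction has to be stated separately when the parameter is written as \(\mu_D\).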

Specifying the Order of Subtraction is Crucial for Paired Analysis ‼️

In paired two-sample analyses, clarifying the order of subtraction is especially important because the information is not apparent in the hypotheses. The related statement should be included in the first of the four steps of hypothesis testing, as part of parameter definition.

11.1.5. Bringing It All Together

Key Takeaways 📝

  1. Two-sample procedures answer questions about differences between populations or treatments.

  2. When the samples are independent, the two true means \(\mu_A\) and \(\mu_B\) are compared directly.

  3. When the samples are paired, each observation in one sample is linked one-to-one to an observation in the other. For the comparative analysis, the two-sample data are transformed into a single sample of differences \(D_i = X_{Ai} - X_{Bi}.\) The central parameter is the mean of the differences, \(\mu_D\).

11.1.6. Exercises

Exercise 1: Identifying Independent vs Paired Samples

For each research scenario, determine whether an independent or paired two-sample procedure is appropriate. Explain your reasoning.

  1. Comparing fuel efficiency between two different car models by testing 30 vehicles of each model.

  2. Measuring blood pressure before and after administering a new medication to the same 25 patients.

  3. Comparing algorithm runtime by running Algorithm A on 15 datasets and Algorithm B on 15 different datasets.

  4. Testing whether a tutoring program improves test scores by comparing pre-test and post-test results for 40 students.

  5. Comparing tensile strength of steel produced by two different manufacturing processes, with 20 samples from each process.

  6. Evaluating two different software optimization techniques by applying both to the same 12 programs and measuring execution time.

Solution

Part (a): Independent

The 30 vehicles of Model A and 30 vehicles of Model B are separate groups with no natural pairing. Each car in one group has no specific connection to any car in the other group.

Part (b): Paired

Each patient serves as their own control—the “before” measurement is naturally paired with the “after” measurement for the same individual. This controls for individual variation in baseline blood pressure.

Part (c): Independent

Although both algorithms are being tested, they’re run on different datasets. There’s no natural pairing between a dataset used for Algorithm A and one used for Algorithm B.

Note: If both algorithms were run on the same datasets, this would become a paired design—a valuable approach for comparing algorithms because it controls for dataset-specific variation. Pairing is often a design choice.

Part (d): Paired

Each student’s pre-test score is paired with their post-test score. The same individual generates both measurements, creating natural pairs.

Part (e): Independent

The steel samples from Process A and Process B come from different production runs with no natural pairing structure. Sample sizes need not be equal.

Part (f): Paired

Both optimization techniques are applied to the same programs. The execution time under Technique A for Program 1 should be compared directly to the time under Technique B for Program 1, controlling for program-specific complexity.


Exercise 2: Notation Practice

A quality engineer compares the precision of two measurement instruments. Instrument A was used for 18 measurements with mean 50.2 and standard deviation 2.1. Instrument B was used for 22 measurements with mean 49.8 and standard deviation 2.5.

  1. Identify and write all the relevant notation: \(n_A, n_B, \bar{x}_A, \bar{x}_B, s_A, s_B\).

  2. What is the parameter of interest if comparing independent populations?

  3. Calculate the point estimate for the difference in population means.

  4. If the true population standard deviations were known to be \(\sigma_A = 2.0\) and \(\sigma_B = 2.4\), write the formula for the standard error of \(\bar{X}_A - \bar{X}_B\).

Solution

Part (a): Notation

  • \(n_A = 18\) (sample size for Instrument A)

  • \(n_B = 22\) (sample size for Instrument B)

  • \(\bar{x}_A = 50.2\) (sample mean for Instrument A)

  • \(\bar{x}_B = 49.8\) (sample mean for Instrument B)

  • \(s_A = 2.1\) (sample standard deviation for Instrument A)

  • \(s_B = 2.5\) (sample standard deviation for Instrument B)

Part (b): Parameter of interest

\(\mu_A - \mu_B\), the difference between the true mean measurements from Instrument A and Instrument B.

Part (c): Point estimate

\[\bar{x}_A - \bar{x}_B = 50.2 - 49.8 = 0.4\]

Part (d): Standard error formula

\[\sigma_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}} = \sqrt{\frac{2.0^2}{18} + \frac{2.4^2}{22}} = \sqrt{\frac{4}{18} + \frac{5.76}{22}} = \sqrt{0.222 + 0.262} = \sqrt{0.484} \approx 0.696\]
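The arithmetic in Parts (c) and (d) can be double-checked with a short Python snippet. The variable names are our own, and the numbers are the ones given in the exercise.

```python
from math import sqrt

# Values given in Exercise 2
n_a, n_b = 18, 22
xbar_a, xbar_b = 50.2, 49.8
sigma_a, sigma_b = 2.0, 2.4

# Part (c): point estimate of mu_A - mu_B
point_estimate = xbar_a - xbar_b

# Part (d): standard error of the difference in sample means,
# using the known population standard deviations
standard_error = sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)

print(round(point_estimate, 1))   # 0.4
print(round(standard_error, 3))   # 0.696
```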

Exercise 3: Writing Hypotheses for Independent Samples

For each research question, write the appropriate null and alternative hypotheses. Define your parameters clearly.

  1. A pharmaceutical company wants to test whether their new pain reliever provides faster relief than the current market leader.

  2. An engineer wants to determine if two assembly lines produce components with different mean weights.

  3. A researcher suspects that students using a new study app score at least 5 points higher on average than students using traditional methods.

  4. A data scientist wants to test whether the mean response time of Server A is less than that of Server B.

Solution

Part (a): Pain reliever comparison

Let \(\mu_{new}\) = true mean time to pain relief (minutes) for the new drug. Let \(\mu_{current}\) = true mean time to pain relief (minutes) for the current market leader.

“Faster relief” means smaller time for the new drug:

\[\begin{split}&H_0: \mu_{new} - \mu_{current} \geq 0\\ &H_a: \mu_{new} - \mu_{current} < 0\end{split}\]

Part (b): Assembly line weights

Let \(\mu_A\) = true mean weight of components from Line A. Let \(\mu_B\) = true mean weight of components from Line B.

Testing for any difference:

\[\begin{split}&H_0: \mu_A - \mu_B = 0\\ &H_a: \mu_A - \mu_B \neq 0\end{split}\]

Part (c): Study app effectiveness

Let \(\mu_{app}\) = true mean exam score for students using the app. Let \(\mu_{trad}\) = true mean exam score for students using traditional methods.

Testing if the app improves scores by at least 5 points:

\[\begin{split}&H_0: \mu_{app} - \mu_{trad} \leq 5\\ &H_a: \mu_{app} - \mu_{trad} > 5\end{split}\]

Note: Here \(\Delta_0 = 5\), not 0.

Part (d): Server response times

Let \(\mu_A\) = true mean response time for Server A. Let \(\mu_B\) = true mean response time for Server B.

Testing if Server A is faster (lower time):

\[\begin{split}&H_0: \mu_A - \mu_B \geq 0\\ &H_a: \mu_A - \mu_B < 0\end{split}\]

Exercise 4: Writing Hypotheses for Paired Samples

For each paired scenario, define the difference appropriately and write the hypotheses.

  1. A fitness trainer measures clients’ resting heart rate before and after a 12-week exercise program. The trainer expects the program to lower heart rate.

  2. A software company tests response time of their application before and after a code optimization. They want to know if there’s any change.

  3. An agricultural researcher applies two fertilizers to adjacent plots from the same field and measures crop yield. They want to test if Fertilizer A produces higher yields than Fertilizer B.

Solution

Part (a): Exercise program and heart rate

Define: \(D_i = \text{Heart Rate}_{\text{before}} - \text{Heart Rate}_{\text{after}}\) for each client.

Let \(\mu_D\) = true mean difference in heart rate (before - after).

If the program lowers heart rate, then “before” > “after”, so \(D > 0\):

\[\begin{split}&H_0: \mu_D \leq 0\\ &H_a: \mu_D > 0\end{split}\]

Part (b): Code optimization

Define: \(D_i = \text{Response Time}_{\text{before}} - \text{Response Time}_{\text{after}}\).

Let \(\mu_D\) = true mean difference in response time.

Testing for any change (two-sided):

\[\begin{split}&H_0: \mu_D = 0\\ &H_a: \mu_D \neq 0\end{split}\]

Part (c): Fertilizer comparison

Define: \(D_i = \text{Yield}_A - \text{Yield}_B\) for each paired plot.

Let \(\mu_D\) = true mean difference in yield (A - B).

If Fertilizer A produces higher yields, then \(D > 0\):

\[\begin{split}&H_0: \mu_D \leq 0\\ &H_a: \mu_D > 0\end{split}\]

Exercise 5: Order of Subtraction Matters

A clinical trial compares a new treatment (N) versus a placebo (P) for reducing cholesterol levels.

  1. If the difference is defined as \(D = X_N - X_P\) and researchers expect the new treatment to lower cholesterol more than the placebo, write the appropriate hypotheses.

  2. If the difference is defined as \(D = X_P - X_N\) for the same research question, write the appropriate hypotheses.

  3. Suppose \(\bar{d} = -15\) mg/dL when using definition (a). What would \(\bar{d}\) be using definition (b)?

  4. Explain why it’s crucial to clearly specify the order of subtraction in Step 1 of hypothesis testing.

Solution

Part (a): D = Treatment - Placebo

If the treatment lowers cholesterol more than placebo, then patients on treatment have lower cholesterol levels, so \(X_N < X_P\), meaning \(D = X_N - X_P < 0\).

\[\begin{split}&H_0: \mu_D \geq 0\\ &H_a: \mu_D < 0\end{split}\]

Part (b): D = Placebo - Treatment

With this definition, if treatment is more effective (lower cholesterol), then \(X_P > X_N\), so \(D = X_P - X_N > 0\).

\[\begin{split}&H_0: \mu_D \leq 0\\ &H_a: \mu_D > 0\end{split}\]

Part (c): Relationship between definitions

If \(\bar{d} = -15\) using \(D = X_N - X_P\), then using \(D = X_P - X_N\):

\[\bar{d} = -(-15) = +15 \text{ mg/dL}\]

The magnitude is the same; only the sign changes.
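A quick numerical check makes the sign flip concrete. The cholesterol values below are hypothetical, chosen only so that the mean difference matches the \(-15\) mg/dL in the exercise.

```python
from statistics import mean

# Hypothetical paired cholesterol readings in mg/dL (illustrative only)
x_n = [180, 195, 170, 205]   # new treatment
x_p = [200, 205, 190, 215]   # placebo

# Definition (a): D = X_N - X_P
d_bar_np = mean(n - p for n, p in zip(x_n, x_p))

# Definition (b): D = X_P - X_N
d_bar_pn = mean(p - n for n, p in zip(x_n, x_p))

print(d_bar_np)   # -15
print(d_bar_pn)   # 15
```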

Part (d): Why specify the order

The order of subtraction determines:

  • The sign of the test statistic

  • Which tail of the distribution corresponds to the alternative hypothesis

  • How to interpret positive vs. negative differences

Without clear specification, results can be misinterpreted. A “positive difference” means opposite things depending on the definition.


Exercise 6: When is Δ₀ ≠ 0?

In most two-sample comparisons, we test whether there’s any difference (\(\Delta_0 = 0\)). However, sometimes we need a non-zero null value.

For each scenario, identify the appropriate \(\Delta_0\) and write the hypotheses.

  1. A battery manufacturer claims their premium battery lasts at least 3 hours longer than their standard battery. A consumer group wants to test this claim.

  2. A drug must reduce blood pressure by more than 10 mmHg compared to placebo to be considered clinically significant. Researchers want to test if the drug meets this threshold.

  3. Two machines are supposed to produce parts with the same mean diameter. Quality control will intervene if the means differ by more than 0.5 mm.

Solution

Part (a): Battery life claim

Let \(\mu_P\) = true mean battery life for premium battery. Let \(\mu_S\) = true mean battery life for standard battery.

The manufacturer claims \(\mu_P - \mu_S \geq 3\). The consumer group wants to disprove this claim.

\(\Delta_0 = 3\) hours

\[\begin{split}&H_0: \mu_P - \mu_S \geq 3\\ &H_a: \mu_P - \mu_S < 3\end{split}\]

Part (b): Clinical significance threshold

Define \(Y\) = blood pressure reduction (baseline minus follow-up), so larger \(Y\) means more reduction.

Let \(\mu_{drug}\) = true mean reduction for drug group. Let \(\mu_{plac}\) = true mean reduction for placebo group.

Testing if the drug exceeds the 10 mmHg clinical threshold compared to placebo:

\(\Delta_0 = 10\) mmHg

\[\begin{split}&H_0: \mu_{drug} - \mu_{plac} \leq 10\\ &H_a: \mu_{drug} - \mu_{plac} > 10\end{split}\]

Part (c): Machine calibration (tolerance problem)

Let \(\mu_1\) = true mean diameter from Machine 1. Let \(\mu_2\) = true mean diameter from Machine 2.

Quality control will intervene if \(|\mu_1 - \mu_2| > 0.5\) mm. This is a tolerance or equivalence-style problem.

Approach 1: Confidence Interval Method (Recommended for this course)

Construct a two-sided confidence interval for \(\mu_1 - \mu_2\) and check whether the entire interval falls within (−0.5, 0.5):

  • If the CI lies entirely within (−0.5, 0.5), the machines are acceptably similar.

  • If the CI extends beyond ±0.5 in either direction, intervention may be needed.

Note: This is not a standard “test at Δ₀ = 0” problem. Using Δ₀ = 0 tests whether the means are exactly equal, not whether they are “close enough.”
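The decision rule in Approach 1 amounts to a simple containment check. The sketch below assumes the confidence interval endpoints have already been computed by some two-sample procedure. The function name `within_tolerance` and the example endpoints are hypothetical.

```python
def within_tolerance(ci_lower: float, ci_upper: float, tol: float = 0.5) -> bool:
    """Return True when the whole interval (ci_lower, ci_upper) lies inside (-tol, tol)."""
    return -tol < ci_lower and ci_upper < tol


# Hypothetical confidence intervals for mu_1 - mu_2, in mm
print(within_tolerance(-0.12, 0.31))   # True: entirely inside (-0.5, 0.5)
print(within_tolerance(0.20, 0.64))    # False: upper endpoint exceeds 0.5
```

Note that the second interval excludes neither 0 nor values above 0.5, so it cannot rule out an unacceptable difference even though it also contains acceptable ones.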

Approach 2: Two One-Sided Tests (TOST) - Beyond STAT 350

For formal equivalence testing, one would test:

  • H₀: \(\mu_1 - \mu_2 \leq -0.5\) vs. Hₐ: \(\mu_1 - \mu_2 > -0.5\), AND

  • H₀: \(\mu_1 - \mu_2 \geq 0.5\) vs. Hₐ: \(\mu_1 - \mu_2 < 0.5\)

If both are rejected, conclude equivalence within ±0.5 mm. This advanced method is not covered in STAT 350.


Exercise 7: Identifying the Correct Procedure

For each scenario, identify:

  • Whether to use independent or paired samples

  • The appropriate parameter (\(\mu_A - \mu_B\) or \(\mu_D\))

  • What additional assumptions might be needed

  1. Comparing crash test ratings between SUVs and sedans (15 of each type tested).

  2. Testing whether students perform differently on paper vs. computer-based exams by having 30 students take both versions.

  3. Comparing customer satisfaction between two restaurant locations by surveying 50 customers at each location.

  4. Evaluating a new keyboard design by measuring typing speed before and after users switch to the new keyboard.

  5. Comparing the accuracy of two different machine learning models by testing both on the same 100 datasets.

Solution

Part (a): Crash test ratings

  • Procedure: Independent samples

  • Parameter: \(\mu_{SUV} - \mu_{sedan}\)

  • Assumptions: Independence between groups, approximate normality in each group (or large enough samples for CLT)

Part (b): Paper vs. computer exams

  • Procedure: Paired samples

  • Parameter: \(\mu_D\) where \(D_i = \text{Paper}_i - \text{Computer}_i\)

  • Assumptions: Independence between student pairs, normality of differences

Note: Order effects (which test taken first) should be balanced in the design.

Part (c): Restaurant satisfaction

  • Procedure: Independent samples

  • Parameter: \(\mu_A - \mu_B\) (satisfaction scores at Location A vs. B)

  • Assumptions: Independence within and between groups, approximate normality or large samples

Part (d): Keyboard typing speed

  • Procedure: Paired samples

  • Parameter: \(\mu_D\) where \(D_i = \text{New}_i - \text{Old}_i\)

  • Assumptions: Independence between users, normality of differences

Part (e): ML model accuracy

  • Procedure: Paired samples

  • Parameter: \(\mu_D\) where \(D_i = \text{Accuracy}_{Model1,i} - \text{Accuracy}_{Model2,i}\)

  • Assumptions: Independence between datasets, normality of differences

Pairing by dataset controls for variation in dataset difficulty.


Exercise 8: True/False Conceptual Questions

Determine whether each statement is True or False. Provide a brief justification.

  1. In a paired two-sample design, the sample sizes must be equal.

  2. For independent samples, \(\mu_A - \mu_B\) and \(\mu_B - \mu_A\) lead to the same hypothesis test conclusions.

  3. Paired designs always have higher power than independent designs.

  4. The order of subtraction in paired analysis affects the sign of the test statistic but not the p-value for a two-sided test.

  5. If two populations are independent, we can still use a paired analysis if we arbitrarily pair observations.

  6. The parameter \(\mu_D\) in paired analysis equals \(\mu_A - \mu_B\) in value.

Solution
  1. True — In paired samples, each observation in Sample A is matched with exactly one observation in Sample B, so \(n_A = n_B = n\) necessarily.

  2. True for two-sided tests; False for one-sided tests — For two-sided tests (Hₐ: μ_A ≠ μ_B), switching the order changes the sign of the test statistic but the p-value remains the same since we consider both tails. For one-sided tests, reversing the subtraction reverses the direction of the alternative hypothesis, which can lead to opposite conclusions if not carefully adjusted.

  3. False — Paired designs have higher power when there is substantial within-pair correlation. If pairs are essentially independent (low correlation), independent designs may have comparable or better power due to more degrees of freedom.

  4. True — For two-sided tests, we use \(2P(|T| > |t_{TS}|)\). The absolute value makes the p-value the same regardless of sign.

  5. False — Arbitrary pairing of independent observations does not create meaningful structure. It would reduce degrees of freedom (\(n-1\) instead of \(n_A + n_B - 2\)) without the benefit of controlling for variability, resulting in power loss.

  6. True: \(\mu_D = E[D_i] = E[X_{Ai} - X_{Bi}] = \mu_A - \mu_B\). The expected value of the differences equals the difference of expected values.
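Statements 4 and 6 can be checked numerically on a small example. The paired values below are hypothetical, and `t_statistic` is our own helper that computes the usual one-sample statistic on the differences. The sample-level identity \(\bar{d} = \bar{x}_A - \bar{x}_B\) is the counterpart of the population-level identity in statement 6.

```python
from math import isclose, sqrt
from statistics import mean, stdev

# Hypothetical paired measurements (illustrative only)
x_a = [12.1, 9.8, 11.4, 10.9, 13.0, 10.2]
x_b = [11.0, 9.9, 10.1, 10.4, 12.2, 9.5]


def t_statistic(diffs, delta_0=0.0):
    """One-sample t statistic computed on a list of pair-wise differences."""
    n = len(diffs)
    return (mean(diffs) - delta_0) / (stdev(diffs) / sqrt(n))


d_ab = [a - b for a, b in zip(x_a, x_b)]   # D = A - B
d_ba = [b - a for a, b in zip(x_a, x_b)]   # D = B - A

t_ab = t_statistic(d_ab)
t_ba = t_statistic(d_ba)

# Statement 4: reversing the order flips the sign but not the magnitude,
# so a two-sided p-value based on |t| is unchanged.
print(isclose(t_ab, -t_ba))                        # True

# Statement 6 at the sample level: d-bar equals xbar_A - xbar_B.
print(isclose(mean(d_ab), mean(x_a) - mean(x_b)))  # True
```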


Exercise 9: Practical Design Considerations

A biomedical engineer wants to compare the accuracy of two glucose monitoring devices. They have access to 40 patients.

  1. Describe how to implement an independent samples design.

  2. Describe how to implement a paired samples design.

  3. What are the advantages of the paired design in this context?

  4. What are potential drawbacks of the paired design?

  5. Which design would you recommend and why?

Solution

Part (a): Independent samples design

  • Randomly assign 20 patients to use Device A

  • Randomly assign 20 patients to use Device B

  • Compare mean accuracy between the two independent groups

Part (b): Paired samples design

  • Have all 40 patients use both Device A and Device B

  • For each patient, calculate \(D_i = \text{Accuracy}_A - \text{Accuracy}_B\)

  • Analyze the differences using one-sample t-procedures

Part (c): Advantages of paired design

  • Controls for patient variability: Factors like blood glucose variability, skin type, and activity level affect both measurements equally and cancel out when differencing.

  • More powerful: By eliminating between-patient variation, the paired design can detect smaller differences.

  • Requires fewer patients: With proper pairing, 40 patients in a paired design may have more power than 40 patients split into two independent groups.
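The gain in precision can be illustrated by computing the standard error two ways on the same numbers. The readings below are hypothetical and built so that patients differ widely while the two devices mostly agree on each patient. The "independent" calculation is shown only for contrast, since it would not be a valid analysis of data that are actually paired.

```python
from math import sqrt
from statistics import stdev

# Hypothetical glucose readings (mg/dL) from 8 patients, each measured
# with both devices (illustrative values only)
device_a = [92, 110, 135, 101, 148, 120, 88, 131]
device_b = [95, 108, 139, 104, 150, 125, 90, 133]

n = len(device_a)
diffs = [a - b for a, b in zip(device_a, device_b)]

# Paired view: standard error of the mean difference
se_paired = stdev(diffs) / sqrt(n)

# Contrast: standard error if the pairing were ignored and the two
# lists were treated as independent samples
se_ignoring_pairs = sqrt(stdev(device_a) ** 2 / n + stdev(device_b) ** 2 / n)

print(f"SE using differences:   {se_paired:.2f}")
print(f"SE ignoring the pairs:  {se_ignoring_pairs:.2f}")
```

With strong patient-to-patient variation, the standard error based on the differences is much smaller, which is what allows the paired design to detect smaller effects.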

Part (d): Potential drawbacks

  • Carryover effects: If using one device affects the reading of the other (e.g., skin irritation), results may be biased.

  • Order effects: The order of device use should be randomized and balanced.

  • Time/cost: Each patient must be tested with both devices, potentially doubling measurement time.

  • Dropouts: If a patient drops out after using only one device, that data point is lost entirely.

Part (e): Recommendation

The paired design is recommended because:

  • Patient-to-patient variability in glucose levels is likely substantial

  • Both devices can be tested on the same blood sample or close in time

  • This controls for a major source of variability

To address drawbacks, randomize the order of device use and balance it across patients.


11.1.7. Additional Practice Problems

True/False Questions (1 point each)

  1. Independent two-sample procedures require equal sample sizes.

    Ⓣ or Ⓕ

  2. In a paired design, observations within each pair are typically correlated.

    Ⓣ or Ⓕ

  3. The null value \(\Delta_0\) must always equal zero.

    Ⓣ or Ⓕ

  4. Before-and-after studies always require paired analysis.

    Ⓣ or Ⓕ

  5. In paired designs, \(\bar{d}\) always equals \(\bar{x}_A - \bar{x}_B\) when all pairs are complete.

    Ⓣ or Ⓕ

  6. Paired procedures have degrees of freedom \(n_A + n_B - 2\).

    Ⓣ or Ⓕ

Multiple Choice Questions (2 points each)

  1. Which scenario is best suited for a paired two-sample analysis?

    Ⓐ Comparing average height between male and female students

    Ⓑ Comparing test scores before and after a training program for the same employees

    Ⓒ Comparing customer satisfaction at two different stores

    Ⓓ Comparing defect rates between two manufacturing plants

  2. For independent samples, the parameter of interest is:

    Ⓐ \(\mu_D\)

    Ⓑ \(\mu_A - \mu_B\)

    Ⓒ \(\sigma_A - \sigma_B\)

    Ⓓ \(\bar{X}_A - \bar{X}_B\)

  3. When should \(\Delta_0 \neq 0\) be used in the hypotheses?

    Ⓐ When sample sizes are unequal

    Ⓑ When testing for a specific minimum difference

    Ⓒ When using paired samples

    Ⓓ When population variances are unknown

  4. A key advantage of paired designs over independent designs is:

    Ⓐ Larger degrees of freedom

    Ⓑ Control of extraneous variability

    Ⓒ Unequal sample sizes are allowed

    Ⓓ No assumptions are required

  5. In a paired analysis where \(D_i = X_{Ai} - X_{Bi}\), if \(\mu_D > 0\), then:

    Ⓐ Population A has a smaller mean than Population B

    Ⓑ Population A has a larger mean than Population B

    Ⓒ The populations have equal means

    Ⓓ Cannot be determined without more information

  6. Which assumption is required for independent two-sample procedures but NOT for paired procedures?

    Ⓐ Normality of the sampling distribution

    Ⓑ Independence between the two samples

    Ⓒ Random sampling

    Ⓓ Known population means

Answers to Practice Problems

True/False Answers:

  1. False — Independent samples can have different sizes (\(n_A \neq n_B\)).

  2. True — The whole point of pairing is that observations are linked, typically creating positive correlation.

  3. False: \(\Delta_0\) can be any value based on the research question (e.g., testing if the difference exceeds 5 points).

  4. False — “Before and after” studies only require paired analysis when the same individuals are measured at both times. If different individuals are measured before and after an intervention period (e.g., repeated cross-sections), independent samples methods apply.

  5. True — When all pairs are complete, \(\bar{d} = \bar{x}_A - \bar{x}_B\) algebraically. However, in paired designs the primary parameter is \(\mu_D\), and the standard error is computed from the differences, not from the two samples separately.

  6. False — Paired procedures have \(df = n - 1\) where n is the number of pairs.

Multiple Choice Answers:

  1. Ⓑ — Measuring the same employees twice creates a natural pairing.

  2. Ⓑ — The difference in population means is the parameter; \(\bar{X}_A - \bar{X}_B\) is the estimator.

  3. Ⓑ — Non-zero null values test for specific minimum or maximum differences.

  4. Ⓑ — Pairing controls for individual/subject variability.

  5. Ⓑ — If \(\mu_D = \mu_A - \mu_B > 0\), then \(\mu_A > \mu_B\).

  6. Ⓑ — Independent procedures require the two samples to be independent; paired procedures require independence between pairs, not between the two measurements within a pair.