8.5. Examples of Experimental Design

Understanding experimental design principles and frameworks provides the theoretical foundation for conducting rigorous research, but translating these concepts into practice requires experience with concrete examples. Real-world research scenarios rarely fit perfectly into textbook categories, and learning to recognize which design is most appropriate for a given situation—and how to implement it effectively—requires working through detailed examples that illustrate the decision-making process.

The examples we’ll explore in this section demonstrate how researchers identify key features of their research context, match those features to appropriate design frameworks, and implement the designs while addressing practical constraints. Each example builds on the principles and design types we’ve studied, showing how they work together in realistic research scenarios.

Road Map 🧭

  • Problem: How do we apply experimental design concepts to real research scenarios where multiple factors and constraints must be balanced?

  • Tool: Detailed worked examples that demonstrate the complete process from research question to design implementation

  • Pipeline: These examples provide templates for recognizing design types and implementing them effectively in practice

8.5.1. The Framework for Design Analysis

Before examining specific examples, it’s helpful to establish a systematic approach for analyzing experimental designs. Whether you’re designing a new study or evaluating an existing one, the same set of questions provides a complete picture of the design structure and implementation.

Essential Design Questions

For any experimental study, we need to identify:

  1. What type of experimental design is used? (CRD, RBD, Matched Pairs)

  2. What are the experimental units? (The subjects or objects receiving treatments)

  3. What are the treatments? (The specific interventions being applied)

  4. What are the factors of interest? (The variables we want to study)

  5. How many levels do the factors have? (The different values or categories for each factor)

  6. What is the response variable? (The outcome we’re measuring)

The Logic Behind These Questions

These questions follow a logical sequence that mirrors the experimental design process:

Design Type determines the overall framework and tells us how treatments are assigned and controls are implemented.

Experimental Units identify who or what is being studied and help us understand the population our results might generalize to.

Treatments specify exactly what interventions are being compared, which is essential for interpreting any differences we observe.

Factors identify the variables we’re studying and whether we’re examining single factors or multiple factors simultaneously.

Levels tell us how many different conditions are being compared and help us understand the scope of the investigation.

Response Variable clarifies what outcomes we’re measuring and how success or failure will be determined.

Working through these questions systematically ensures we fully understand the study structure before attempting to interpret results or draw conclusions.

8.5.2. Example 1: Instructional Methods Study (Completely Randomized Design)

Research Context

Educational researchers want to understand how different instructional methods affect student learning outcomes. This represents a classic question in educational research: which teaching approach is most effective for helping students master new material?

Study Description

The researchers identify three instructional approaches they want to compare:

  • Self-study from textbook: Students learn independently using written materials

  • Traditional classroom instruction: Students attend lectures and participate in class discussions

  • Online instruction: Students complete computer-based modules with interactive content

They recruit a sample of 90 students who will all be learning the same material (let’s say introductory statistics concepts). The students are randomly assigned to the three groups, with 30 students in each group. After a specified learning period, all students take the same comprehensive exam, and the researchers compare average scores across the three instructional methods.

Design Analysis

1. What type of experimental design is used?

This is a Completely Randomized Design (CRD). The key evidence is that students are simply randomly assigned to one of the three treatment groups without any prior grouping or blocking. No student characteristics are used to form blocks, and there’s no pairing or matching process. The randomization is straightforward: each student has an equal chance of being assigned to any of the three groups.

2. What are the experimental units?

The 90 students serve as the experimental units. Since we’re dealing with human participants, they’re more specifically called subjects. These students represent the units to which treatments are applied and from which responses are measured.

3. What are the treatments?

The treatments are the application of the three different instructional methods:

  • Application of self-study learning approach

  • Application of traditional classroom instruction

  • Application of online instruction methodology

Note that the traditional classroom instruction could be considered a control group since it represents the standard or conventional approach against which the other methods are being compared.

4. What are the factors of interest?

There is one factor of interest: the type of instructional method. This is the variable the researchers want to study to see how it affects learning outcomes. While students likely differ on many characteristics (prior knowledge, motivation, learning style), the design doesn’t explicitly control for these—instead, randomization is used to distribute these characteristics roughly equally across groups.

5. How many levels do the factors have?

The instructional method factor has three levels:

  • Self-study from textbook

  • Traditional classroom instruction

  • Online instruction

6. What is the response variable?

The response variable is scores on the final exam. This provides a quantitative measure of how well students learned the material under each instructional approach.

https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter8/instructional_method_diagram.png

Fig. 8.4 Completely Randomized Design for Instructional Methods Study

Implementation Details

Randomization Process: The researchers would need to use a proper randomization procedure (a code sketch follows this list). They might:

  • Assign each student a number from 1 to 90

  • Use a random number generator to determine group assignments

  • Ensure equal group sizes (30 per group) through restricted randomization
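The sketch below shows one way to carry out this restricted randomization in Python. The seed, group labels, and printing are illustrative assumptions rather than details from the study itself.

```python
import random

# Hypothetical sketch of the restricted randomization described above:
# 90 students, exactly 30 assigned to each instructional method.
students = list(range(1, 91))    # students labeled 1 through 90
random.seed(350)                 # illustrative seed so the assignment is reproducible
random.shuffle(students)         # random permutation of all 90 labels

groups = {
    "self-study": sorted(students[0:30]),
    "classroom": sorted(students[30:60]),
    "online": sorted(students[60:90]),
}

for method, members in groups.items():
    print(f"{method}: n = {len(members)}")
```

Shuffling the full roster and then slicing it into thirds guarantees equal group sizes (the "restricted" part of the randomization), while the shuffle itself gives every student the same chance of landing in any group.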

Controlling Other Factors: Since this is a CRD, the researchers rely on randomization to balance other important variables like:

  • Students’ prior knowledge of statistics

  • General academic ability

  • Motivation and study habits

  • Learning preferences

Measuring the Response: All students would take the same exam under the same conditions to ensure fair comparison. The exam should be designed to measure the specific learning objectives covered in all three instructional approaches.

Advantages of CRD for This Study

  • Simplicity: Easy to implement and understand

  • Flexibility: Can accommodate different group sizes if needed

  • Broad applicability: Results can generalize to students similar to those in the study

  • Statistical analysis: Straightforward to analyze using standard methods

Potential Limitations

  • Baseline differences: Random assignment might not perfectly balance student characteristics, especially with moderate sample sizes

  • Unknown confounders: Important variables affecting learning might not be evenly distributed across groups

  • External factors: Differences in instructor quality, classroom environment, or timing might affect results

8.5.3. Example 2: Hand Cream Comparison (Matched Pairs Type 2)

Research Context

A cosmetics company wants to compare the effectiveness of two different hand cream formulations. Consumer satisfaction is highly subjective and varies greatly between individuals based on skin type, preferences, and expectations. This variability makes it difficult to detect treatment differences using simple comparisons between different groups of people.

Study Description

The researchers recruit 40 women to participate in a hand cream comparison study. The study uses a crossover design where each participant tries both hand creams in a randomized order:

Phase 1: Half the women (20) use Hand Cream Type 1 for one month, while the other half (20) use Hand Cream Type 2 for one month.

Washout Period: All participants stop using any hand cream for two weeks to allow the effects of the first cream to dissipate.

Phase 2: The groups switch—those who used Type 1 now use Type 2 for one month, and those who used Type 2 now use Type 1 for one month.

Measurement: At the end of each phase, participants complete satisfaction surveys. The researchers can then compare each woman’s satisfaction with Type 1 versus Type 2.

Design Analysis

1. What type of experimental design is used?

This is a Matched Pairs Design, Type 2 (also called a crossover design). The key identifying feature is that each participant serves as her own control by trying both treatments. Rather than comparing different groups of people, we’re comparing each person’s response to Treatment A versus Treatment B.

2. What are the experimental units?

The 40 women are the experimental units. Since they’re human participants, they’re specifically called subjects.

3. What are the treatments?

The treatments are the application of the two different hand cream formulations:

  • Application of Hand Cream Type 1

  • Application of Hand Cream Type 2

Each woman receives both treatments, just in different time periods and in randomized order.

4. What are the factors of interest?

There is one factor of interest: the type of hand cream. This is what the researchers want to study to understand its effect on user satisfaction.

5. How many levels do the factors have?

The hand cream factor has two levels:

  • Hand Cream Type 1 (Brand 1)

  • Hand Cream Type 2 (Brand 2)

6. What is the response variable?

The response variable is satisfaction score. This is likely measured through a standardized survey that produces a numerical score representing each participant’s satisfaction with the hand cream they used during that phase.

https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter8/hand_cream_crossover_diagram.png

Fig. 8.5 Matched Pairs Type 2 (Crossover) Design for Hand Cream Study

Implementation Details

Randomization of Order: The randomization determines which treatment each participant receives first (a code sketch follows this list):

  • Random assignment of 20 women to receive Type 1 first, then Type 2

  • The other 20 women receive Type 2 first, then Type 1

  • This controls for order effects that might influence satisfaction
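A minimal sketch of the order randomization, assuming the participants are simply labeled 1 through 40 (the labels and seed are invented for illustration):

```python
import random

# Hypothetical sketch: randomize which cream each of 40 participants uses first.
participants = list(range(1, 41))
random.seed(8)                            # illustrative seed for reproducibility
random.shuffle(participants)

type1_first = sorted(participants[:20])   # Phase 1: Type 1, Phase 2: Type 2
type2_first = sorted(participants[20:])   # Phase 1: Type 2, Phase 2: Type 1

print("Type 1 then Type 2:", type1_first)
print("Type 2 then Type 1:", type2_first)
```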

Washout Period: The two-week break between treatments is crucial for this design:

  • Allows skin to return to baseline condition

  • Prevents carry-over effects where the first cream influences response to the second

  • Must be long enough for complete elimination of the first treatment’s effects

Measurement Timing: Satisfaction could be measured:

  • At the end of each treatment phase

  • At the end of the entire study with retrospective comparison

  • Multiple times during each phase to track changes over time

Advantages of Matched Pairs for This Study

Perfect Matching: Each woman is perfectly matched with herself, controlling for:

  • Skin type and condition

  • Personal preferences and expectations

  • Usage patterns and application technique

  • Environmental factors (weather, hand washing frequency)

High Statistical Power: Since we eliminate between-person variability, we can detect smaller differences between treatments with fewer participants.

Individual Comparisons: We get a direct comparison for each person, showing not just average differences but how consistently one treatment is preferred.
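A small simulation makes the power advantage concrete. All of the parameters below (baseline spread, treatment effect, noise level) are made-up values chosen only to illustrate the idea:

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative parameters: large person-to-person spread (sd = 2.0), a small
# true treatment difference (0.5), and modest measurement noise (sd = 0.5).
baseline = rng.normal(5.0, 2.0, 40)                 # each woman's underlying level
type1 = baseline + rng.normal(0.0, 0.5, 40)         # satisfaction with Type 1
type2 = baseline + 0.5 + rng.normal(0.0, 0.5, 40)   # Type 2 shifts scores by 0.5

# Between-person variability dominates the raw scores in either group...
print("sd within one treatment group:", round(type1.std(ddof=1), 2))
# ...but differencing within each woman removes the baseline entirely.
print("sd of within-person differences:", round((type2 - type1).std(ddof=1), 2))
```

The within-person differences are far less variable than the raw scores, which is exactly why the paired comparison can detect a small effect that a two-group comparison of the same size would likely miss.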

Critical Requirements for Success

Adequate Washout: If two weeks isn’t sufficient for the first cream’s effects to disappear, the design fails because treatments contaminate each other.

Stable Preferences: Participants’ underlying preferences and skin condition must remain relatively stable across the study period.

No Period Effects: The time of year, weather conditions, or other temporal factors shouldn’t systematically favor one treatment period over another.

Potential Limitations

Limited Generalizability: Results might not apply to:

  • People with very different skin types than study participants

  • Different usage patterns (frequency, amount applied)

  • Different environmental conditions

Dropout Problems: If participants leave the study after the first phase, we lose their data entirely since each person must complete both treatments.

Complex Analysis: Analyzing crossover data requires methods that account for the paired nature of observations and potential period effects.

8.5.4. Example 3: Dairy Cow Nutrition Study (Randomized Block Design)

Research Context

Agricultural researchers want to investigate whether the type of grain mix affects milk production in dairy cows. This research has clear practical importance for dairy farmers who want to optimize their feeding programs to maximize milk yield while controlling costs.

Study Description

The researchers plan to compare two different brands of grain mix to see which produces higher milk yields. They have access to 16 dairy cows from different farms, but these cows represent four different breeds. The breed of cow is known to significantly affect baseline milk production, with some breeds naturally producing much more milk than others.

Given this situation, the researchers implement a randomized block design:

Blocking: The 16 cows are first separated into four blocks based on their breed, so that each block consists of the 4 cows of a single breed.

Randomization Within Blocks: Within each breed (block), 2 cows are randomly assigned to receive Grain Mix Brand 1, and the other 2 cows receive Grain Mix Brand 2.

Measurement: After a specified feeding period, milk production is measured for each cow, and the researchers compare average production between the two grain mix brands while accounting for breed differences.

Design Analysis

1. What type of experimental design is used?

This is a Randomized Block Design (RBD). The key identifying features are:

  • Experimental units are first grouped into blocks based on an important characteristic (breed)

  • Random assignment occurs within each block rather than across all units

  • The blocking variable (breed) is not the primary interest but is controlled to improve precision

2. What are the experimental units?

The 16 dairy cows serve as the experimental units. These are the subjects receiving the treatments and from whom responses are measured.

3. What are the treatments?

The treatments are feeding the cows different grain mix brands:

  • Feeding Grain Mix Brand 1

  • Feeding Grain Mix Brand 2

4. What are the factors of interest?

There are two factors in this study, but they serve different purposes:

Primary Factor of Interest: Type of grain mix - This is what the researchers want to study to understand its effect on milk production.

Blocking Factor: Breed of cow - This is an extraneous variable that affects milk production but is not the focus of the study. It’s controlled through blocking rather than being ignored or left to randomization alone.

5. How many levels do the factors have?

Grain Mix Factor: Two levels

  • Brand 1

  • Brand 2

Breed Factor: Four levels

  • Breed 1

  • Breed 2

  • Breed 3

  • Breed 4

6. What is the response variable?

The response variable is milk production, likely measured as the total volume of milk produced per cow over a specified time period (such as gallons per day or pounds per week).

https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter8/dairy_cow_rbd_diagram.png

Fig. 8.6 Randomized Block Design for Dairy Cow Nutrition Study

Implementation Details

Block Formation: The blocking process comes first and is not randomized:

  • Cows are grouped by breed based on their known characteristics

  • Each block contains cows that are similar with respect to the blocking variable

  • Block assignment is deterministic, not random

Randomization Within Blocks: Within each breed block (see the code sketch after this list):

  • 2 cows are randomly assigned to Grain Mix Brand 1

  • 2 cows are randomly assigned to Grain Mix Brand 2

  • This process is repeated independently for each of the 4 breed blocks
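The two-step procedure can be sketched in Python as follows; the cow IDs, breed labels, and seed are invented for illustration:

```python
import random

# Hypothetical sketch: block 16 cows by breed, then randomize within each block.
cows_by_breed = {
    "Breed 1": ["A1", "A2", "A3", "A4"],
    "Breed 2": ["B1", "B2", "B3", "B4"],
    "Breed 3": ["C1", "C2", "C3", "C4"],
    "Breed 4": ["D1", "D2", "D3", "D4"],
}

random.seed(16)                      # illustrative seed for reproducibility
assignment = {}
for breed, cows in cows_by_breed.items():
    shuffled = cows[:]               # copy so the original roster is untouched
    random.shuffle(shuffled)         # independent randomization within this block
    assignment[breed] = {
        "Brand 1": shuffled[:2],     # 2 cows per brand within every breed
        "Brand 2": shuffled[2:],
    }

for breed, brands in assignment.items():
    print(breed, brands)
```

Notice that forming the blocks (the dictionary keys) is deterministic; only the assignment within each block involves chance, mirroring the description above.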

Why Blocking Is Essential Here

Large Breed Differences: Different cow breeds can have dramatically different milk production capabilities:

  • Holstein cows might average 6-7 gallons per day

  • Jersey cows might average 4-5 gallons per day

  • Other breeds might have different production levels

Small Sample Size: With only 16 cows total, random assignment alone might not balance breed distribution between treatment groups.

Improved Precision: By ensuring each treatment appears in each breed, we can separate grain mix effects from breed effects, making our comparison more precise.
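A quick simulation, using an invented seed and counting a randomization as "uneven" whenever some breed fails to split 2 and 2 between the brands, shows how rarely unrestricted randomization balances breed at this sample size:

```python
import random
from collections import Counter

# Illustrative simulation: randomize all 16 cows at once (ignoring breed) and
# check how often some breed fails to split evenly between the two brands.
breeds = ["Breed 1"] * 4 + ["Breed 2"] * 4 + ["Breed 3"] * 4 + ["Breed 4"] * 4

random.seed(1)
trials, uneven = 10_000, 0
for _ in range(trials):
    cows = breeds[:]
    random.shuffle(cows)
    counts = Counter(cows[:8])       # breed makeup of the Brand 1 group
    if any(counts.get(b, 0) != 2 for b in set(breeds)):
        uneven += 1                  # some breed did not split 2 and 2

print(f"Share of randomizations with an uneven breed split: {uneven / trials:.2f}")
```

Exact counting agrees: only 1,296 of the 12,870 possible splits (about 10 percent) divide every breed evenly, so roughly 90 percent of unrestricted randomizations leave at least one breed unbalanced between the brands.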

Advantages of RBD for This Study

Controls Known Confounder: Breed effects are explicitly controlled rather than left to chance.

Increased Statistical Power: By removing breed variability from the error term, we can detect smaller grain mix effects.

Broader Generalizability: Results apply across all four breeds rather than being confounded with breed distribution.

Efficient Use of Resources: Achieves better precision than CRD with the same number of experimental units.

Analysis Implications

Two-Way Analysis: The statistical analysis will examine:

  • Main effect of grain mix (primary interest)

  • Main effect of breed (controlled factor)

  • Possible interaction between grain mix and breed

Within-Block Comparisons: The key comparisons are between grain mix brands within each breed, then averaged across breeds.

Increased Complexity: Analysis is more complex than CRD but provides more information and better control.

Critical Assumptions

Homogeneity Within Blocks: Cows within each breed should be as similar as possible on other factors affecting milk production.

No Interaction: Ideally, the grain mix effect should be similar across all breeds (though this can be tested).

Representative Breeds: The four breeds should represent the broader population of dairy cows to which results will be generalized.

8.5.5. Comparing the Three Designs

These three examples illustrate how different research contexts call for different design approaches, each with distinct advantages and limitations.

When to Use Each Design

Completely Randomized Design works best when:

  • Experimental units are relatively homogeneous

  • Sample size is large enough for randomization to balance unknown factors

  • No obvious blocking variables suggest themselves

  • Simplicity of implementation is important

Matched Pairs Design is optimal when:

  • Individual differences are very large relative to expected treatment effects

  • Treatments are reversible (for Type 2) or natural pairs exist (for Type 1)

  • Sample size is necessarily small

  • Maximum control is essential

Randomized Block Design is most appropriate when:

  • A known factor strongly influences the response but isn’t the primary interest

  • Sample size is moderate (too small for randomization alone to balance a known factor, while matching every unit into pairs is impractical)

  • Multiple subgroups need representation in the results

  • Improved precision justifies increased complexity

Design Efficiency Comparison

For detecting the same treatment effect:

  • Matched Pairs typically requires the fewest experimental units

  • Randomized Block Design requires fewer units than CRD when blocking is effective

  • Completely Randomized Design might require the most units but is most flexible

Analysis Complexity

  • CRD: Simple one-way analysis comparing group means

  • Matched Pairs: Paired analysis focusing on within-unit differences

  • RBD: Two-way analysis accounting for both treatments and blocks
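The sketch below carries out the first two analyses on simulated data using NumPy and SciPy; all numbers are fabricated for illustration, and the RBD's two-way analysis is only noted in a comment because it typically requires a model-fitting library:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# CRD: one-way ANOVA comparing three independent groups (simulated exam scores).
g1 = rng.normal(75, 10, 30)
g2 = rng.normal(78, 10, 30)
g3 = rng.normal(80, 10, 30)
print(stats.f_oneway(g1, g2, g3))

# Matched pairs: paired t-test on within-subject differences (simulated scores).
score_type1 = rng.normal(7.0, 1.5, 40)
score_type2 = score_type1 + rng.normal(0.4, 0.8, 40)   # same subjects, second cream
print(stats.ttest_rel(score_type2, score_type1))

# RBD: a two-way ANOVA with treatment and block terms, typically fit with a
# modeling library such as statsmodels (anova_lm); omitted here for brevity.
```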

8.5.6. Practical Considerations in Design Selection

Real research situations rarely fit perfectly into textbook categories. Successful experimental design requires balancing statistical ideals with practical constraints.

Resource Constraints

Budget Limitations: More complex designs might require more resources for implementation and analysis.

Time Constraints: Crossover designs take longer than parallel group designs.

Personnel Requirements: Some designs require more training or specialized expertise.

Ethical Considerations

Risk-Benefit Analysis: Matched pairs designs might expose participants to more treatments.

Informed Consent: Participants must understand the full scope of their involvement.

Vulnerable Populations: Special protections might limit design options.

Logistical Factors

Recruitment Challenges: Some designs require specific types of participants (natural pairs, specific characteristics for blocking).

Retention Concerns: Longer studies face higher dropout rates.

Measurement Feasibility: Some response variables are easier to measure than others.

Balancing Competing Goals

Successful experimental design requires balancing:

  • Internal validity (strong causal inference) vs. external validity (generalizability)

  • Statistical power (ability to detect effects) vs. practical feasibility (resource constraints)

  • Precision (controlling variability) vs. simplicity (ease of implementation and analysis)

8.5.7. Bringing It All Together

The examples in this section demonstrate that there’s rarely a single “correct” design for any research question. Instead, good design involves carefully considering the research context, identifying the most important threats to validity, and choosing an approach that addresses those threats while remaining feasible to implement.

Key Takeaways 📝

  1. Systematic analysis is essential: Using the six key questions (design type, experimental units, treatments, factors, levels, response) provides a complete understanding of any experimental design.

  2. Context determines design choice: The instructional methods study used CRD because students were relatively homogeneous; the hand cream study used matched pairs because individual differences were large; the dairy study used RBD because breed was a known important factor.

  3. Each design addresses different challenges: CRD handles general variability through randomization; matched pairs controls individual differences through pairing; RBD controls known sources of variability through blocking.

  4. Trade-offs are unavoidable: Every design involves compromises between statistical power, practical feasibility, and experimental control.

  5. Implementation details matter: Proper randomization procedures, adequate washout periods, and appropriate blocking variables are crucial for design success.

  6. Analysis follows design: The experimental design directly determines which statistical methods are appropriate and how results should be interpreted.

  7. Real studies are messier than textbook examples: Practical constraints, ethical considerations, and resource limitations often require modifications to ideal designs.

Understanding experimental design through concrete examples provides the foundation for both conducting your own research and critically evaluating studies conducted by others. The goal is not to memorize design templates but to develop the judgment needed to recognize which approach is most appropriate for specific research contexts and how to implement chosen designs effectively.

As we move forward to examine sampling designs and potential biases, remember that experimental design and sampling design work together to determine the overall quality and interpretability of research findings. Good experimental design creates the framework for valid causal inference, while good sampling design ensures that results can be generalized to the populations we care about.

Exercises

  1. Design Identification: For each scenario below, identify which experimental design would be most appropriate and explain your reasoning:

    1. Testing whether three different fertilizers affect plant growth using 60 plants of the same species.

    2. Comparing two pain medications using 40 patients with chronic pain, where patients try both medications separated by a washout period.

    3. Studying the effect of teaching method on learning outcomes using students from three different grade levels.

    4. Comparing the effectiveness of two workout routines using identical twins as participants.

  2. Complete Design Analysis: A researcher wants to test whether background music affects concentration during studying. The study involves 48 college students who are randomly assigned to study in either silence, with classical music, or with pop music. After 2 hours of studying, all students take the same comprehension test.

    Answer all six design questions for this study and draw a diagram showing the design structure.

  3. Matched Pairs Feasibility: For the hand cream example, explain why a two-week washout period is crucial. What problems could arise if:

    1. The washout period were only 3 days?

    2. The washout period were 2 months?

    3. There were no washout period at all?

  4. Blocking Variable Selection: In the dairy cow study, breed was used as the blocking variable. Identify three other variables that might be good blocking variables for this study and explain why each would be appropriate.

  5. Design Modification: Suppose the instructional methods study (Example 1) could be modified to use a randomized block design.

    1. What variable would you use for blocking and why?

    2. How would this change the implementation of the study?

    3. What advantages and disadvantages would this create compared to the original CRD?

  6. Sample Size and Design: Explain why the dairy cow study needed to use RBD rather than matched pairs, even though individual differences between cows are probably quite large.

  7. Real-World Constraints: Choose one of the three examples from this section and identify three practical constraints that might make the design difficult to implement in practice. For each constraint, suggest a realistic modification that maintains the essential features of the design.