8.4. Addressing Potential Flaws in Experimental Design
Understanding the three fundamental principles of experimental design provides the foundation for conducting rigorous research, but implementing these principles perfectly in real-world settings is often impossible. The difference between a good experiment and a great one often lies not in eliminating all potential problems but in recognizing where problems might arise and taking steps to minimize their impact.
Road Map 🧭
Understand that the three principles of experimental design provide guidelines for ideal experimental settings, but it is usually impossible to uphold them perfectly.
Learn different types of flaws that can arise in an experiment and how to address each one.
8.4.1. The Nature of Bias in Experimental Design
Bias in experimental design refers to systematic errors that cause the results to deviate from the truth in a consistent direction. Unlike random error, which varies unpredictably around the true value, bias pushes our results away from the truth in the same direction every time. This distinction is crucial because while we can reduce random error through replication, bias cannot be eliminated simply by collecting more data.
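The contrast between random error and bias can be illustrated with a small simulation. This is only a sketch under assumed numbers: the true value of 50, the bias of 3, and the noise standard deviation of 10 are all hypothetical, and `mean_estimate` is a helper invented for this illustration.

```python
import random
import statistics

def mean_estimate(n, true_value, bias, noise_sd, rng):
    """Average n simulated measurements; `bias` shifts every one by the same amount."""
    return statistics.fmean(
        true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)
    )

TRUE_VALUE = 50.0   # hypothetical true value
BIAS = 3.0          # hypothetical systematic error
NOISE_SD = 10.0     # hypothetical spread of the random error

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    unbiased = mean_estimate(n, TRUE_VALUE, 0.0, NOISE_SD, rng)
    biased = mean_estimate(n, TRUE_VALUE, BIAS, NOISE_SD, rng)
    print(f"n={n:>7}: error without bias {unbiased - TRUE_VALUE:+.2f}, "
          f"error with bias {biased - TRUE_VALUE:+.2f}")
```

As the sample size grows, the error of the unbiased estimate should shrink toward zero, while the error of the biased estimate should settle near the assumed bias rather than disappear.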
Bias is dangerous because it often goes undetected. Random variation is visible in our data—we can see that individual observations vary around some central tendency. But systematic bias can masquerade as real effects, leading us to conclude that treatments work when they don’t, or that they don’t work when they actually do.
Let us examine three types of bias that can occur during an experiment: selection bias, measurement bias, and confounding bias.
A. Selection Bias: When Groups Are Not Comparable
Selection bias occurs when experimental units are assigned to treatment groups in ways that create fundamental differences between groups that are not due to the treatments themselves.
Example 💡: Medical Study with Flawed Assignment
Imagine a study testing a new medication where researchers unconsciously assign sicker patients to the treatment group, hoping to help them, while healthier patients end up in the control group. Even if the medication has no effect, the treatment group might show more improvement simply because sicker patients have more room for improvement. Conversely, a truly effective medication might appear ineffective if the treatment group starts out much sicker than the control group.
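The first half of this scenario can be sketched as a simulation in which the medication has no effect at all. Everything numeric here is an assumption made for illustration: severity is drawn on a hypothetical 0 to 100 scale, natural improvement is taken to be 20% of baseline severity plus noise, and the flawed assignment is exaggerated so that the sicker half of the patients all go to the treatment group.

```python
import random
import statistics

def apparent_effect(sickest_to_treatment, n=2000, seed=1):
    """Treatment-minus-control difference in mean improvement for a drug that does nothing."""
    rng = random.Random(seed)
    # Hypothetical baseline severity (higher = sicker).
    severity = [rng.uniform(0, 100) for _ in range(n)]
    if sickest_to_treatment:
        # Flawed assignment: the sicker half ends up in the treatment group.
        order = sorted(range(n), key=lambda i: severity[i], reverse=True)
    else:
        # Proper assignment: a random half ends up in the treatment group.
        order = list(range(n))
        rng.shuffle(order)
    treated = set(order[: n // 2])
    # Assumed model: sicker patients improve more on their own; the drug adds nothing.
    improvement = [0.2 * s + rng.gauss(0, 5) for s in severity]
    treated_mean = statistics.fmean(improvement[i] for i in range(n) if i in treated)
    control_mean = statistics.fmean(improvement[i] for i in range(n) if i not in treated)
    return treated_mean - control_mean

effect_flawed = apparent_effect(sickest_to_treatment=True)
effect_random = apparent_effect(sickest_to_treatment=False)
print(f"Apparent effect with flawed assignment: {effect_flawed:+.2f}")
print(f"Apparent effect with random assignment: {effect_random:+.2f}")
```

Under these assumptions the flawed assignment should produce a sizable apparent benefit from an inert drug, while random assignment should produce a difference close to zero.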
B. Measurement Bias: When Observations Are Systematically Distorted
Measurement bias refers to systematic errors in how we collect, record, or process our data.
Example 💡: The Interns
Consider a medical experiment studying the effects of different treatments on blood chemistry. The study design calls for blood samples to be drawn from participants at regular intervals and analyzed in the laboratory. To ensure consistent procedures, the research team assigns specific personnel to work with specific treatment groups:
Intern A always draws blood from Group 1 participants
Intern B works with Group 2
Intern C handles Group 3
Suppose Intern A is new and occasionally contaminates blood samples through improper technique. This contamination might systematically alter the laboratory results for Group 1, making their blood chemistry appear different from the other groups.
C. Confounding Bias: The Problem of Unmeasured Influences
Confounding bias occurs when an extraneous variable is related to both our treatment factors and our response variable, but we fail to control for or block this variable in our design.
A confounding variable has three key characteristics:
It influences the response variable we’re trying to measure.
It’s associated with treatment assignment or treatment groups.
The variation it causes is not addressed by the design, either through randomization or blocking.
Example 💡: Exercise and Heart Health
Suppose we want to study whether a new exercise program reduces heart disease risk. We recruit volunteers and let each one choose whether to participate in the exercise program or continue their normal routine. After six months, we find that the exercise group has better cardiovascular health markers.
However, imagine that we failed to account for dietary habits. If people who choose the exercise program also tend to eat healthier diets, then diet becomes a confounding variable. The improved cardiovascular health might be due to:
The exercise program (what we want to conclude)
The healthier diets (confounding)
Both exercise and diet together
Neither (other unmeasured factors)
Without controlling for diet, we cannot determine which explanation is correct.
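A small simulation can make the diet explanation concrete. It is a sketch under stated assumptions, not a model of any real study: people with a healthy diet are assumed to end up in the exercise group 80% of the time versus 20% for everyone else, a healthy diet is assumed to add 10 points to a hypothetical health score, and the exercise program itself is assumed to have no effect.

```python
import random
import statistics

def simulate_participants(n=20000, seed=2):
    """Return (healthy_diet, exercises, score) rows under the assumptions above."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        healthy_diet = rng.random() < 0.5
        # Assumed link between the confounder and the treatment group.
        exercises = rng.random() < (0.8 if healthy_diet else 0.2)
        # Assumed link between the confounder and the response; exercise adds nothing.
        score = 60 + (10 if healthy_diet else 0) + rng.gauss(0, 5)
        rows.append((healthy_diet, exercises, score))
    return rows

def mean_difference(rows):
    """Mean score of exercisers minus mean score of non-exercisers."""
    exercisers = [score for _, ex, score in rows if ex]
    others = [score for _, ex, score in rows if not ex]
    return statistics.fmean(exercisers) - statistics.fmean(others)

rows = simulate_participants()
naive = mean_difference(rows)
within_healthy = mean_difference([r for r in rows if r[0]])
within_unhealthy = mean_difference([r for r in rows if not r[0]])
print(f"Naive comparison:          {naive:+.2f}")
print(f"Among healthy eaters only: {within_healthy:+.2f}")
print(f"Among everyone else only:  {within_unhealthy:+.2f}")
```

In this setup the naive comparison should credit the exercise program with a benefit that comes entirely from diet, and the benefit should vanish once exercisers are compared with non-exercisers who share the same diet. With real data the diet variable would have to be measured before any such comparison is possible.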
How to Minimize Bias
The primary defense against bias is rigorous randomization with proper concealment of assignment sequences. Additionally, baseline characteristics should be carefully monitored to verify that randomization has achieved its intended goal. Input from domain experts is essential for identifying procedural flaws and recognizing well-known confounding variables.
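The randomization and baseline check described here might look like the following sketch. The participant IDs, the baseline ages, and the helper names `randomize` and `standardized_difference` are hypothetical. The standardized difference (difference in means divided by a pooled standard deviation) is one common way to summarize baseline balance, used here as an assumption rather than as the method this section prescribes.

```python
import random
import statistics

def randomize(unit_ids, seed):
    """Completely randomized assignment into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def standardized_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_sd

# Hypothetical baseline data: ages for 200 participants.
data_rng = random.Random(3)
baseline_age = {pid: data_rng.gauss(50, 12) for pid in range(200)}

# In practice the sequence would be generated and held by someone who does not
# enroll participants, so that upcoming assignments stay concealed.
treatment, control = randomize(baseline_age.keys(), seed=42)

smd = standardized_difference(
    [baseline_age[pid] for pid in treatment],
    [baseline_age[pid] for pid in control],
)
print(f"Group sizes: {len(treatment)} and {len(control)}")
print(f"Standardized difference in baseline age: {smd:+.3f}")
```

A standardized difference near zero suggests the groups are comparable on that characteristic. A large value does not prove the randomization was flawed, but it is a signal worth investigating.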
8.4.2. Lack of Realism: The Challenge of External Validity
Lack of realism represents a fundamental tension in experimental design between internal validity (our ability to draw valid causal conclusions within our study) and external validity (our ability to generalize those conclusions to real-world settings). This issue arises when our experimental units, treatments, or study settings fail to adequately represent the conditions we ultimately want to understand.
To understand how lack of realism can compromise experimental conclusions, consider a classic example from psychological research.
Example 💡: The Workplace Layoff Study
Suppose researchers want to study how layoffs at a workplace affect the morale of workers who remain on the job—a question with obvious practical importance for understanding organizational behavior and employee well-being.
The Ethical Constraint
The most direct approach would be to conduct a true experiment: approach various employers and ask them to randomly lay off some employees so researchers can observe the effects on remaining workers.
However, this approach is completely unethical. Deliberately causing people to lose their jobs for research purposes would cause real harm to participants and their families. No institutional review board would approve such a study, and no ethical researcher would propose it.
The Compromised Solution
Faced with this ethical constraint, researchers might design an alternative study using college students as experimental units. The study design might work as follows:
Recruit college students to participate in a temporary job proofreading textbooks.
Create a realistic work environment with multiple students working together.
Randomly assign some students to be “laid off” during the study (with their knowledge and consent).
Monitor the remaining students and measure their morale and productivity.
Administer psychological surveys to assess the impact of witnessing their colleagues being dismissed.
This design maintains the three fundamental principles—it includes control groups (students who don’t witness layoffs), uses randomization (to determine who gets “laid off”), and incorporates replication (multiple students in each condition).
Why This Study Lacks Realism
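Despite adhering to sound experimental principles, this study suffers from serious limitations in realism, compromising its external validity. The job in this experimental setting carries different stakes and social dynamics than those in a real workplace. A student in a temporary proofreading job does not depend on it for a livelihood, and a staged “layoff” of consenting participants is unlikely to provoke the same reactions as a real one.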
Strategies for Maximizing Realism
While perfect realism is often impossible, researchers can take steps to maximize external validity. For example, they can:
Allocate sufficient resources to collect a sample representative of the target population and to create realistic experimental settings.
Conduct multiple studies addressing different aspects of potential realism limitations.
Interpret the results carefully to clearly acknowledge the limitations in generalizing the findings.
8.4.3. The Significance of Generalization
Our efforts to minimize the flaws mentioned above ultimately lead to one key goal: making the experiment’s results generalizable. Generalization refers to the ability to apply research findings to broader populations, environments, or contexts that were not directly studied. Although many details were discussed, remember that good experimental practice boils down to this simple rule:
Control What You Can
Block What You Can’t Control
Randomize to Create Comparable Groups
Ensure Sufficient Replication
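The last three parts of this rule can be combined in a randomized block assignment, sketched below. The participant IDs, the age-group blocking variable, the treatment labels, and the helper `randomized_block_assignment` are all hypothetical choices made for illustration.

```python
import random
from collections import Counter, defaultdict

def randomized_block_assignment(units, block_of, treatments, seed=0):
    """Shuffle the units within each block, then deal the treatments out in turn."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of[unit]].append(unit)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomize within the block
        for position, unit in enumerate(members):
            assignment[unit] = treatments[position % len(treatments)]
    return assignment

# Hypothetical units: 12 participants blocked by age group.
block_of = {
    f"P{i:02d}": ("under 40" if i < 6 else "40 and over") for i in range(12)
}
assignment = randomized_block_assignment(
    block_of.keys(), block_of, ["program", "control"], seed=7
)

# Replication check: count the units per treatment within each block.
counts = Counter((block_of[unit], treatment) for unit, treatment in assignment.items())
for (block, treatment), n in sorted(counts.items()):
    print(f"{block:>12} / {treatment:<8}: {n} units")
```

Each block contributes the same number of units to every treatment, so the blocking variable cannot differ between treatment groups, and having several units per cell provides the replication needed to estimate random variation.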
8.4.4. Bringing It All Together
Key Takeaways 📝
Bias systematically distorts results in consistent directions, making it more dangerous than random error because it cannot be reduced through larger sample sizes. There are three major types of experimental bias: selection bias, measurement bias, and confounding bias.
Lack of realism represents the trade-off between internal validity (experimental control) and external validity (generalizability to real-world conditions).
Perfect studies are impossible: The goal is not to eliminate all potential problems but to minimize them systematically while maintaining study feasibility.
The simple rule provides practical guidance: Control what you can, block what you can’t control, randomize to create comparable groups, and ensure sufficient replication.
Exercises
1. Identifying Types of Bias: For each scenario below, identify the primary type of bias present and explain how it could affect the study conclusions:
A study of a new teaching method where more motivated teachers volunteer to use the new method.
A medical trial where nurses measuring patient recovery know which patients received the experimental treatment.
A study of exercise and mental health that fails to account for participants’ baseline fitness levels.
A psychology experiment using only college students to study workplace stress management.
2. The Intern Example: Redesign the blood chemistry study to minimize measurement bias while maintaining practical feasibility. Explain your approach and any remaining limitations.
3. Confounding Variables: A researcher finds that students who take music lessons have higher math scores. Identify three potential confounding variables that could explain this association and explain how each one might work.
4. Lack of Realism Analysis: Consider the workplace layoff study discussed in this chapter. Suggest three specific modifications to the study design that could increase realism while maintaining ethical standards. Discuss the trade-offs involved in each modification.
5. Bias Prevention Strategy: You’re designing a study to test whether a new app helps people stick to exercise routines. For each type of bias discussed in this chapter, explain:
How it might manifest in your study.
What specific steps you would take to minimize it.
Any remaining limitations you couldn’t completely eliminate.