8.1. Experimental and Sampling Designs

We’ve explored data through descriptive methods, built probability models to understand uncertainty, and learned how sample statistics behave through sampling distributions and the Central Limit Theorem. These tools have equipped us to understand how information flows from populations to samples. But before we can confidently make the reverse journey—inferring from samples back to populations—we need one final, essential piece: understanding how to collect data properly in the first place. This chapter focuses on thoughtful, principled approaches to study design and data collection. Statistical inference is powerful, but it’s only as reliable as the data it’s based on.

Road Map 🧭

Identify the characteristics of a statistical question.
Understand that data contains statistical variation arising from different sources, and that each source is treated differently in an experiment.
Recognize the different sources of data, along with their respective advantages and disadvantages.
Distinguish between observational and experimental studies.

8.1.1. Statistical Questions and the Need for Data

Our journey toward statistical inference begins with recognizing what makes a question statistical in nature.

Statistical vs. Deterministic Questions

Not every question involving data is statistical. If we can predict an outcome with certainty given the inputs—like calculating the area of a rectangle from its length and width—we’re dealing with a deterministic relationship. Statistical questions, by contrast, involve relationships where perfect prediction is impossible, but where we can still identify meaningful patterns and quantify uncertainty by analyzing data.

Data always comes with inherent variability. This variability isn’t always a flaw to be eliminated; it’s often the fundamental characteristic that makes statistical methods both necessary and powerful.
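The contrast can be sketched in a few lines of Python. This is only an illustration: the commute-time function and its numbers (a 30-minute center with a 5-minute spread) are made-up assumptions, not values from this chapter.

```python
import random


def rectangle_area(length, width):
    """Deterministic: the same inputs always give the same output."""
    return length * width


def commute_minutes(rng):
    """Statistical: a hypothetical commute time that varies from day to day.

    The center (30) and spread (5) are illustration values only.
    """
    return rng.gauss(30, 5)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

areas = [rectangle_area(3, 4) for _ in range(5)]
commutes = [commute_minutes(rng) for _ in range(5)]

print("areas:   ", areas)                            # five identical values
print("commutes:", [round(c, 1) for c in commutes])  # five different values
```

Repeating the area calculation adds no information, while repeated commute measurements reveal a pattern (a typical value) together with variability around it.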

The Nature of Statistical Variation

Statistical variation arises from multiple sources.

Subject differences: Individual units in a study naturally vary in their characteristics, responses, and behaviors.
Measurement errors: Even the most precise instruments introduce some degree of measurement uncertainty.
Random chance: Natural processes and human behavior carry inherent randomness that no design can remove.

Ideally, we want the primary source of variation in our data to be random chance, with the influence from other sources minimized through careful design. We aim to reduce measurement errors through precise instruments and standardized procedures, and we handle subject differences through proper randomization and control strategies.
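A small simulation can make this idea concrete. The sketch below assumes the three sources act independently and add together, in which case their variances add; all standard deviations are hypothetical values chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed for reproducibility

CHANCE_SD = 3.0  # random chance: assumed to be irreducible


def simulate(n, subject_sd, measurement_sd, chance_sd=CHANCE_SD):
    """Return n observations around a true mean of 100.

    Each observation adds three independent sources of variation.
    """
    return [
        100
        + rng.gauss(0, subject_sd)      # subject differences
        + rng.gauss(0, measurement_sd)  # measurement error
        + rng.gauss(0, chance_sd)       # random chance
        for _ in range(n)
    ]


# Without careful design: large subject differences and a crude instrument.
raw_sd = statistics.stdev(simulate(5000, subject_sd=8.0, measurement_sd=2.0))

# With careful design (assumed improvements): similar subjects are compared
# and a more precise instrument is used.
designed_sd = statistics.stdev(simulate(5000, subject_sd=2.0, measurement_sd=1.0))

print("raw SD:     ", round(raw_sd, 2), "theory:", round(math.sqrt(64 + 4 + 9), 2))
print("designed SD:", round(designed_sd, 2), "theory:", round(math.sqrt(4 + 1 + 9), 2))
```

In the second scenario most of the remaining spread comes from random chance, which is the situation a good design aims for.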

8.1.2. The Spectrum of Data Sources

Before conducting any study, researchers face a fundamental decision: where will their data come from? This choice shapes everything that follows—the types of conclusions that can be drawn, the statistical methods that are appropriate, and the confidence they can place in their results. Understanding the characteristics, advantages, and limitations of different data sources is essential for making informed research decisions.

Anecdotal Data

Anecdotal data represents the most basic form of information—observations from personal experiences, casual reports, or informal accounts shared through news media and social networks. While lacking scientific rigor, anecdotal evidence plays a surprisingly important role in the research ecosystem by providing insights or raising hypotheses to be studied further.

Available Data

The modern research landscape is dominated by an unprecedented availability of existing data. Available data includes any information that has already been collected and can be accessed for research purposes, ranging from government statistics and published study datasets to corporate databases and social media archives.

However, we do not have control over the quality, accuracy, and completeness of available data, which can affect the reliability of the results and insights derived from it. It is important to assess the quality of available data carefully by considering the data sources, data collection methods, and data processing techniques used.

Collecting New Data

When available data is insufficient or inappropriate for answering our research question, new data must be collected. Studies involving new data collection can be classified into two major branches:

Observational studies
Experimental studies

In the remainder of this section, we will briefly describe the characteristics of each and highlight their differences. Because experimental studies require greater and more deliberate researcher intervention, their details will be discussed further in Sections 8.2 through 8.5.

8.1.3. Observational Studies

In an observational study, researchers act as careful observers rather than active manipulators. They identify subjects of interest, contact them, and collect measurements, but they do not impose treatments or attempt to influence the study environment. This approach is particularly valuable when interventions would be unethical, impractical, or impossible, or when the research goal is to understand naturally occurring phenomena.

The observational approach follows a systematic process:

Define the research question.
Identify the target population.
Specify variables of interest, including both the primary variables of focus and potential confounding variables that might influence the results.
Design and implement random sampling procedures to obtain a representative sample from the target population.
Observe and measure the variables of interest without intervention.
Apply statistical inference methods to draw conclusions about the broader population.
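The sampling and estimation steps of this process can be sketched as follows. This is a minimal illustration using a simple random sample, which is only one of several possible random sampling procedures; the population and its measurements are simulated, hypothetical values.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed for reproducibility

# Hypothetical target population: 10,000 units, each with one numeric measurement.
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Random sampling: every unit has the same chance of being selected.
sample = rng.sample(population, k=200)

# Observation without intervention: we only record the values already there.
sample_mean = statistics.mean(sample)

# In practice the population mean is unknown; it is computed here only to
# show how close the sample estimate lands.
population_mean = statistics.mean(population)

print("sample mean:    ", round(sample_mean, 2))
print("population mean:", round(population_mean, 2))
```

Because the sample is drawn at random, the sample mean differs from the population mean only by chance, which is what makes formal inference about the population possible.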

Strengths and Limitations of Observational Studies

Observational studies excel at documenting naturally occurring relationships and patterns. They allow researchers to study phenomena in realistic settings where all the complex factors that influence outcomes in the real world remain present. This ecological validity makes observational studies particularly valuable for understanding how variables relate in natural environments.

A key limitation, however, is the lack of control over variables and treatment assignments. Without the ability to regulate which factors are present or how they vary, researchers cannot isolate the effects of specific conditions. As a result, the influence of other, uncontrolled factors may be difficult to separate from the patterns of interest.

This limitation does not reduce the value of observational studies; it simply calls for careful interpretation. Patterns observed consistently across multiple well-designed observational studies can still provide strong insights into how phenomena unfold in realistic contexts.
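The difficulty of separating uncontrolled factors from the pattern of interest can be illustrated with a simulation. In the sketch below, a hypothetical lurking factor makes subjects more likely to be exposed to some condition and also raises their outcome, while the exposure itself has no effect at all. All probabilities and effect sizes are invented for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for reproducibility


def mean_outcome_gap(n=20_000):
    """Return mean(outcome | exposed) - mean(outcome | unexposed).

    The exposure has NO effect on the outcome in this simulation; any gap
    comes entirely from the uncontrolled lurking factor.
    """
    exposed, unexposed = [], []
    for _ in range(n):
        lurking = rng.random() < 0.5                  # hypothetical uncontrolled factor
        p_exposed = 0.8 if lurking else 0.2           # factor makes exposure more likely
        is_exposed = rng.random() < p_exposed
        outcome = rng.gauss(10 if lurking else 0, 5)  # factor also shifts the outcome
        (exposed if is_exposed else unexposed).append(outcome)
    return statistics.mean(exposed) - statistics.mean(unexposed)


gap = mean_outcome_gap()
print("observed gap between groups:", round(gap, 2))
```

Under these assumptions the expected gap is 10 × 0.8 − 10 × 0.2 = 6, even though exposure does nothing. An observer who could not see the lurking factor might mistakenly attribute the gap to the exposure.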

Example 💡: The Case of “Feline High-rise Syndrome” 🐈

Consider the study of “feline high-rise syndrome” by Whitney and Mehlhaff, published in the Journal of the American Veterinary Medical Association in 1987. The researchers wondered: when cats fall from buildings, how does the height of the fall relate to the severity of their injuries? To investigate this question, they identified 132 cats that had been brought to the Animal Medical Center in New York City after falling from multi-story buildings between June and November 1984. For each case, they carefully documented the cat’s injuries, the height from which it fell, and the outcome of treatment.

Their findings were surprising. About 90% of the cats survived their falls with appropriate veterinary care. More intriguingly, cats falling from seven stories or higher didn’t sustain significantly more injuries than those falling from lower heights. In fact, cats falling from nine or more stories showed remarkably few limb fractures compared to those falling from intermediate heights.

The researchers proposed what they called the “terminal velocity hypothesis” to explain this pattern. They theorized that cats reach their maximum falling speed after about five stories. Once they achieve this terminal velocity and realize they’re in for a long fall, cats may relax into a “flying squirrel” posture that distributes impact forces more evenly across their body, reducing the likelihood of concentrated injuries.

Why This Had to Be Observational

This study illustrates why observational research is sometimes the only ethical option. Testing the terminal velocity hypothesis experimentally would require deliberately dropping cats from various heights—an approach that would be both unethical and illegal. Even if researchers could design some sort of controlled falling scenario with safety nets or other protections, such an artificial setup would fundamentally change the phenomenon being studied. Instead, the researchers had to rely on cats’ own decisions to fall from high ledges, windowsills, and fire escapes.

8.1.4. Experimental Studies

When ethical and practical constraints allow, experimental studies offer the strongest framework for statistical investigation. In contrast to observational studies, experiments involve deliberate control over one or more variables, including the ability to assign treatments or conditions to subjects according to a planned design. This control enables researchers to minimize the influence of extraneous factors and ensure that differences in outcomes can be more confidently attributed to the conditions under study.
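As a minimal sketch of what assigning treatments "according to a planned design" can look like, the code below randomly splits subjects into two equal groups. This completely randomized assignment is only one possible design, used here as an assumption for illustration; the subject IDs are hypothetical.

```python
import random

rng = random.Random(2024)  # fixed seed so the assignment can be reproduced

# Hypothetical subject IDs; in practice these would be the recruited units.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]

# Shuffle a copy, then split it in half: chance alone decides who gets
# which condition, not the researcher and not the subjects.
shuffled = subjects[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment_group = shuffled[:half]
control_group = shuffled[half:]

print("treatment:", sorted(treatment_group))
print("control:  ", sorted(control_group))
```

Because the split depends only on the shuffle, subject characteristics are not systematically tied to the condition received.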

The following sections will expand on this topic in greater detail.

8.1.5. Bringing It All Together

Key Takeaways 📝

Statistical questions require data with inherent variability and seek to quantify relationships among variables.
Data sources vary in quality and appropriateness. Anecdotal data provides inspiration but not evidence; available data offers efficiency but requires careful quality assessment; new data collection provides control but demands resources.
Observational studies are valuable for studying naturally occurring phenomena where intervention is impossible or unethical.
Experimental studies are appropriate when variables and environmental factors can be actively controlled.
The choice between observational and experimental approaches depends on research goals, ethical considerations, and practical constraints.
Study design determines the scope of valid conclusions.

8.1.6. Exercises

These exercises develop your understanding of statistical questions, data sources, and the distinction between observational and experimental studies.

Key Concepts

Statistical vs. Deterministic Questions

Deterministic: Outcomes can be predicted with certainty given inputs (e.g., calculating area from length × width)
Statistical: Perfect prediction is impossible; we identify patterns and quantify uncertainty using data

Sources of Statistical Variation

Subject differences (natural variation among individuals)
Measurement errors (instrument precision)
Random chance (inherent randomness)

Data Sources (from weakest to strongest evidence)

Anecdotal data: Personal experiences, informal accounts — useful for generating hypotheses
Available data: Pre-existing datasets — efficient but quality not controlled
New data collection: Observational or experimental studies

Study Types

Observational study: Researchers observe without intervention; cannot establish causation
Experimental study: Researchers actively assign treatments; can establish causation

Common Student Error ⚠️

A statistical question is NOT defined by “containing probability”; it is defined by variability in the answer across units or repetitions. The question “What is the probability of heads for a fair coin?” is deterministic (answer: 0.5). The question “How many heads will I get in 10 flips?” is statistical (the answer varies each time you repeat the 10 flips).
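A short simulation, assuming a fair coin, shows the difference: the probability of heads is a fixed number, while the count of heads in 10 flips changes from one repetition to the next.

```python
import random

rng = random.Random(3)  # fixed seed for reproducibility

P_HEADS = 0.5  # fair coin assumed; this value is fixed, not random


def heads_in_ten_flips():
    """Count heads in 10 simulated flips of a fair coin."""
    return sum(rng.random() < P_HEADS for _ in range(10))


# Repeat the 10-flip experiment 20 times and record each count.
counts = [heads_in_ten_flips() for _ in range(20)]
print("probability of heads:", P_HEADS)
print("heads in 10 flips, repeated 20 times:", counts)
```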

Exercise 1: Statistical vs. Deterministic Questions

Classify each question as statistical or deterministic, and explain your reasoning.

What is the fuel efficiency (mpg) of a 2024 Toyota Camry traveling at 60 mph?
How many lines of code are in the file main.py if it contains 45 functions averaging 12 lines each?
Does caffeine consumption affect reaction time in software developers?
What is the probability that a randomly selected engineering student has an internship?
How long will it take to transfer a 500 MB file over a 100 Mbps connection?
Do mechanical engineering students score higher on the FE exam than civil engineering students?

Exercise 2: Sources of Statistical Variation

For each scenario, identify which source(s) of variation are most prominent: subject differences, measurement error, or random chance.

Blood pressure readings taken from the same patient at different times of day show different values.
Two engineers using the same stress testing equipment on identical steel samples get slightly different yield strength measurements.
Customer arrival times at a help desk vary unpredictably throughout the day.
Students taking the same exam under identical conditions receive widely different scores.
A quality control sensor occasionally misreads the diameter of manufactured parts.

Exercise 3: Evaluating Data Sources

A tech company wants to understand whether their new IDE (integrated development environment) improves programmer productivity.

For each proposed data source below, identify the type of data source and discuss its strengths and limitations.

A senior developer shares that “the new IDE feels much faster” and mentions a colleague who “finished a project ahead of schedule after switching.”
The company analyzes Git commit data from the past year, comparing commits per developer before and after the IDE was released.
The company recruits 100 programmers and randomly assigns 50 to use the new IDE and 50 to continue with the old IDE for a month, then compares lines of code produced.

Exercise 4: Observational vs. Experimental Studies

For each research question, determine whether an observational or experimental study would be more appropriate. Explain your reasoning, considering ethical and practical constraints.

Does wearing a motorcycle helmet reduce the severity of head injuries in accidents?
Does a new compiler optimization flag improve code execution speed?
Are children who play video games more likely to have attention problems?
Does a new drug reduce blood pressure more effectively than the current standard treatment?
Do employees who work remotely report higher job satisfaction?

Exercise 5: The Feline High-Rise Study Revisited

The chapter describes a study of “feline high-rise syndrome” where researchers examined cats brought to a veterinary hospital after falling from buildings.

Why was this necessarily an observational study?
The researchers found that cats falling from 7+ stories didn’t sustain more injuries than those falling from lower heights. What alternative explanations (besides the “terminal velocity hypothesis”) might account for this finding?
What is a key limitation of using only cats brought to the veterinary hospital as the data source?
If you wanted to gather additional evidence about the relationship between fall height and injury severity, what other data sources might you consider? What would be their advantages and limitations?

Exercise 6: Study Type Identification

Classify each study as observational or experimental, and identify a potential confounding factor that could affect the conclusions.

Researchers compare the GPA of students who use the campus tutoring center versus those who don’t.
A pharmaceutical company randomly assigns patients to receive either a new antidepressant or a placebo, then measures symptom improvement after 8 weeks.
A tech company compares bug rates between teams that use agile methodology versus waterfall methodology.
Agronomists plant corn seeds in 50 plots, randomly assigning each plot to receive either a new fertilizer or no fertilizer, then measure yields.
Epidemiologists track a cohort of 10,000 adults over 20 years, recording their exercise habits and eventual health outcomes.

8.1.7. Additional Practice Problems

True/False Questions

A question about the average starting salary of computer science graduates is a deterministic question because salaries are fixed numbers.
Measurement error can be completely eliminated by using high-quality instruments.
Anecdotal evidence is useful for generating hypotheses but should not be used as the primary basis for important decisions.
In an observational study, researchers can establish cause-and-effect relationships by controlling for all confounding variables.
Available data is always inferior to newly collected data for research purposes.
An experimental study requires that researchers actively manipulate at least one variable.

Multiple Choice Questions

Which of the following is a statistical question?

Ⓐ What is the boiling point of water at sea level?

Ⓑ How many credits are required to graduate with a BS in Engineering?

Ⓒ Do students who sit in the front of the classroom earn higher grades?

Ⓓ What is the sum of the first 100 positive integers?

A researcher studies whether coffee consumption is associated with heart disease by surveying 5,000 adults about their coffee habits and medical history. This is:

Ⓐ An experimental study because coffee consumption is the treatment

Ⓑ An observational study because participants choose their own coffee consumption

Ⓒ An experimental study because the researcher is collecting new data

Ⓓ Neither; this is anecdotal data

Which data source provides the strongest evidence for establishing causation?

Ⓐ A large database of historical records

Ⓑ Expert opinions and professional experience

Ⓒ A randomized controlled experiment

Ⓓ A carefully designed observational study

The main reason observational studies cannot establish causation is:

Ⓐ They typically have smaller sample sizes

Ⓑ Unmeasured confounding variables may explain observed associations

Ⓒ The data quality is usually poor

Ⓓ Researchers are unable to measure the variables accurately