1.2. Probability & Statistical Inference: How Are They Associated?
This course can be split into two main parts. The first half considers probability, and the second half focuses on statistical inference. In this section, we discuss how these two broad concepts are different yet closely associated. We achieve this goal by expressing each with a one-sided arrow between population and sample. We begin by defining what population and sample are.
Road Map 🧭
Define population and sample, and use their concepts to distinguish between probability and statistical inference.
1.2.1. Population and Sample
Population represents the entire collection of individuals or objects considered in our study. We condense information about the population into a probabilistic model that describes the behavior of attributes with some kind of smooth functional relationship.
Sample is just a subset of the entire population—a small selection of individuals or objects taken from the complete collection.
1.2.2. The “Two Arrows” Diagram

Fig. 1.5 The “two arrows” diagram
We express probability and statistical inference using one-sided arrows between sample and population. In each case, we begin with the complete knowledge of the side from which the arrow originates. Using this knowledge, we try to make an informed guess about the side the arrow points to.
Probability: Population → Sample
What does it mean to have complete knowledge of the population? It means that we are aware of the rule by which the population generates random samples. We call this rule the probability distribution.
With respect to Fig. 1.5, for example, suppose we know
the set of possible outcomes (cat, dog, and ghost),
the count of each category in the population, and
that each individual has an equal chance of being selected into a sample and are independent of the anothers.
Then we can calculate the probability of observing any specific outcome in a sample from this population. Examples of questions we can answer include:
Suppose a single individual is randomly sampled. What is the probability that this is a cat?
Suppose a random sample of size 10 is taken. What is the proability that this sample consists of nine dogs and a cat?
…
In the case of Fig. 1.5, the probability distribution is described through a combination of graphics and words. In other cases, a probability distribution can be expressed as a table of possible values with their associated probabilities or, in more complex scenarios, as a function.
By the end of Chapter 6, we will have built enough probabilistic language to address these different types.
Example 💡: Population → Sample

Suppose we know that 36% of all Americans have a passport.
Q1. What is the probability that a randomly sampled American does not hold a passport?
Probabilities are always expressed in a 0-1 scale. The probability would be \(1-0.36 = 0.64\).
Q2. If we take a random sample of 20 Americans, what’s the probability that exactly 10 of them will have a passport?
Under the assumption that all Americans have an equal chance of being sampled and that their inclusion in the sample does not impact the status of others, the probability is
\({{20}\choose{10}} (0.36)^{10}(0.64)^{10} = 0.0779\)
‼️ At this stage, you do not need to fully follow how these numbers are computed. The takeaway from this example should be the understanding that knowing the probability distribution allows us to compute the likelihood of various events that can happen in a sample. If you want to understand the computational steps, revisit after Chapter 5.
Statistical Inference: Population ← Sample
Statistical inference begins with full access to a dataset – a sinlge random sample from the population – but limited knowledge of the population’s probability distribution. It aims to understand the population through the sample while accounting for its limitations through quantifying the uncertainty.
In Fig. 1.5, suppose we can only see the sample. From this, we would guess
that the population contains cats, dogs, and ghosts,
that there are fewer ghosts than cats or dogs, and
that cats and dogs take up similar proportions
However, we would not be able to rule out the possibility that the population contains a wider range of categories, and we would not be strongly convinced about the true proportions of the observed categories considering how small the sample is.
Example 💡: Population ← Sample

Suppose we interview 20 Americans at random and find that 8 of them (40%) hold a passport.
Q1. What can we conclude about the percentage of all Americans who hold a passport?
If we were to make a point estimation, the best we could say is that around 40% of all Americans are estimated to hold a passport. In most cases, we would continue by numerically expresses the degree of uncertainty in this quantity through further inference methods.
We are not yet ready to formally perform uncertainty quantification, but we can begin to think about how different components might affect our confidence.
Q2. If all participants were interviewed at the entrance of Indianapolis International Airport, does the representativeness of the sample for all Americans increase or decrease? How does this impact the credibility of the 40% estimate?
It is reasonable to believe that individuals with passports are over-represented at this location. This would reduce the credibility that of the 40% estimate as representative of all Americans.
Q3. If a sample was larger than 20, while the percentage remained 40%, would our uncertainty increase or decrease?
By taking a larger sample, the sample becomes more representative of the population (imagine taking a “sample” as large as the entire population of the United States!). As a result, the uncertainty in the estimate would decrease.
Q4. Can you think of any other aspects of the problem that might affect the degree of uncertainty?
This quetsion is left for you to explore on your own.
1.2.3. Looking Ahead
Throughout this course, you are encouraged to come back to these two fundamental directions of reasoning. Understanding when you’re doing probability calculations versus statistical inference will clarify the methods and implications of each scenario.
Quick Check ✔
When would a sample ever be preferable to a census of the entire population?
Can you describe the different between probability and statistics in a few sentences?
Why can two different random samples yield different conclusions even when drawn from the same population?