5.1. Discrete Random Variables and Probability Mass Distributions
In previous chapters, we used set theory to describe events and their probabilities. While this approach provides a rigorous foundation, it can become cumbersome when dealing with complex scenarios. Random variables offer a more elegant solution by mapping outcomes directly to numbers.
Road Map 🧭
Define random variables as functions that map real-life events of arbitrary complexity to their numerical representations.
Distinguish between discrete and continuous random variables.
Formalize probability mass functions (PMFs) for discrete random variables.
Apply PMFs to calculate probabilities for complex events.
5.1.1. Random Variables: From Sets to Numbers
Definition
A random variable (RV) \(X\) is a function that maps each outcome \(\omega\) in the sample space \(\Omega\) to a numerical value. Formally, \(X: \Omega \to \mathbb{R}\).
Why is a random variable needed?
Outcomes of random experiments are often multi-faceted and tend to introduce more complexity than necessary. For example, suppose we flip a coin 10 times and count how many heads appear in the sequence. The complete sample space of ten coin flips contains \(2^{10} = 1,024\) different possible sequences.
However, if we’re only interested in the total number of heads, we do not need to examine each sequence individually. For instance, instead of interpreting ‘HHHHHHHHTH’ as a unique sequence, we can view it simply as an outcome that yields the numerical value 9.
This is where a random variable becomes useful. We can define a random variable, say \(X\), to map the outcome ‘HHHHHHHHTH’ to a numerical value that reflects the focus of our interest:
\[X(\text{HHHHHHHHTH}) = 9,\]
and all other 1,023 outcomes in a similar manner. By using a random variable, we reduce our focus from 1,024 sequences to just 11 possible values (0 through 10).
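The mapping from sequences to head counts can be sketched in a few lines of Python. This is only an illustration: the names `sample_space` and `X` are chosen here for convenience and are not part of the chapter's notation beyond \(X\) itself.

```python
from itertools import product

# Enumerate the full sample space: every sequence of 10 coin flips.
sample_space = ["".join(seq) for seq in product("HT", repeat=10)]

def X(outcome: str) -> int:
    """Random variable: the number of heads in a flip sequence."""
    return outcome.count("H")

print(len(sample_space))                     # number of outcomes in the sample space
print(X("HHHHHHHHTH"))                       # value assigned to one particular outcome
print(sorted({X(w) for w in sample_space}))  # distinct values that X can take
```

The function `X` plays the role of the random variable: it discards the order of the flips and keeps only the count.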
Expressing events with random variables
One of the key advantages of introducing a random variable is conciseness. Once an appropriate random variable is defined, most events can be expressed as equalities or inequalities involving the variable. See the table below for some examples:
Description | Using set notations | Using random variable \(X\) |
---|---|---|
Event that there are three heads in the sequence | Define \(A_3\) as the name of the event. List all sequences with three heads in \(A_3 = \{\cdots\}\). | \(X=3\) |
Event that there are more than 7 heads in the sequence | Define \(A_8, A_9, A_{10}\) as the events of sequences with 8, 9, and 10 heads, respectively. The event of interest is \(A_8 \cup A_9 \cup A_{10}\). | \(X > 7\) |
We no longer need to define a new event for every new question. Instead, we can express various situations compactly using the random variable \(X\).
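As a rough check that both descriptions pick out the same outcomes, the sketch below builds the events once by listing sequences and once through a condition on \(X\). The helper `A` and the variable names are hypothetical, introduced only for this illustration.

```python
from itertools import product

sample_space = ["".join(seq) for seq in product("HT", repeat=10)]

def X(outcome: str) -> int:
    """Random variable: the number of heads in a flip sequence."""
    return outcome.count("H")

def A(k: int) -> set:
    """Set-notation style: the event A_k, listed as explicit sequences."""
    return {w for w in sample_space if w.count("H") == k}

# Random-variable style: one condition on X describes each event.
event_X_eq_3 = {w for w in sample_space if X(w) == 3}
event_X_gt_7 = {w for w in sample_space if X(w) > 7}

print(event_X_eq_3 == A(3))                  # same event, two descriptions
print(event_X_gt_7 == A(8) | A(9) | A(10))   # the union collapses to X > 7
print(len(event_X_eq_3), len(event_X_gt_7))  # sizes of the two events
```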
5.1.2. Types of Random Variables
Random variables fall into two main categories based on the nature of their possible values:
Discrete Random Variables
A random variable is discrete if it can take on only a countable number of possible values. Discrete random variables typically arise when counting things, such as:
The number of heads in coin flips
The number of times someone swipes right out of 100 profile views
The number of website hits during a specific time period
The number of customers until the first big-ticket item is sold
Continuous Random Variables
A random variable is continuous if it can take on any value within a continuous range or interval. Continuous random variables typically arise when measuring quantities, such as:
Height, weight, or other physical measurements
Time until a particular event occurs
Temperature, pressure, or other environmental measurements
5.1.3. Probability Distributions
To describe the probabilistic behavior of a random variable, we must specify the probabilities associated with all its possible values. This complete description is called a probability distribution.
Discrete and continuous random variables have different types of probability distributions. A discrete random variable is described by a probability mass function (PMF), while a continuous random variable is described by a probability density function (PDF).
In this chapter, we will focus on discrete random variables and PMFs. As we progress through the course, we will see how PMFs and PDFs share some foundational ideas, while differing in important ways.
5.1.4. Probability Mass Functions
Definition
The probability mass function of a discrete random variable \(X\) is denoted by \(p_X\). For each possible value \(x\) that \(X\) can take, it gives the probability that \(X\) equals \(x\):
\[p_X(x) = P(X = x)\]
Different forms of a PMF
A PMF can be represented in several different forms:
A PMF can be organized into a table by listing the possible values with their corresponding probabilities.
Fig. 5.1 Example of a PMF in table form
Dot plots or bar graphs that display the probability of each possible value can serve as visual representations of a PMF. However, they are not typically used on their own, as it can be difficult to determine exact probabilities unless the plot is very simple.
For some special random variables, a mathematical formula is used to describe the PMF. For example,
\[p_X(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \text{ for } x = 0, 1, 2, \ldots\] is a PMF for any \(\lambda > 0\).
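The formula above is the PMF of the Poisson distribution. As a minimal sketch, it can be evaluated directly with the standard library; the function name `poisson_pmf` and the rate `lam = 2.0` are assumptions chosen for illustration. Because the support is infinite, the code sums only the first 50 terms, which already come very close to 1 for this rate.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Evaluate p_X(x) = e^{-lam} * lam^x / x! for a non-negative integer x."""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 2.0  # assumed rate parameter, for illustration only

# Truncated sum over x = 0, 1, ..., 49; the remaining tail is negligible here.
partial_sum = sum(poisson_pmf(x, lam) for x in range(50))

print([round(poisson_pmf(x, lam), 4) for x in range(5)])
print(partial_sum)
```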
Support
The support of a discrete random variable is the set of all possible values that have a positive probability:
\[\text{supp}(X) = \{x : p_X(x) > 0\}\]
Validity of a PMF
For a probability mass function to be valid, the following conditions must be satisfied:
Non-negativity
For all \(x\), \(0 \leq p_X(x) \leq 1.\)
Total probability of 1
The sum of probabilities over all values in the support must equal 1:
\[\sum_{x \in \text{supp}(X)} p_X(x) = 1\]
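The two conditions translate into a short check. The sketch below assumes the PMF is stored as a Python dictionary mapping each value to its probability; the function name `is_valid_pmf`, the tolerance, and the three example tables are all hypothetical.

```python
import math

def is_valid_pmf(pmf: dict, tol: float = 1e-9) -> bool:
    """Check both validity conditions for a PMF given as {value: probability}."""
    probs = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probs)                       # condition 1
    sums_to_one = math.isclose(float(sum(probs)), 1.0, abs_tol=tol)  # condition 2
    return in_range and sums_to_one

good = {0: 0.5, 1: 0.3, 2: 0.2}        # satisfies both conditions
negative = {0: 0.6, 1: 0.6, 2: -0.2}   # sums to 1 but has a negative entry
too_small = {1: 0.5, 2: 0.4}           # all entries in range but the sum is 0.9

print(is_valid_pmf(good), is_valid_pmf(negative), is_valid_pmf(too_small))
```

A tolerance is used for the sum because floating-point addition is not exact; with exact fractions the comparison could be made strict.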
5.1.5. Important Types of Problems Involving PMFs
A. Constructing a PMF from Scratch
It is an important skill for statisticians to be able to “translate” descriptions of a random experiment in plain language to mathematical language involving a random variable and its PMF.
Example💡: Flipping a Biased Coin
Let us try constructing a PMF from scratch, only using descriptions of the experimental setting.
Suppose we flip a biased coin four times, where the probability of heads on each flip is 0.7 (and tails is 0.3). We define a random variable H to count the number of heads in the four flips. Find the complete PMF for H. Verify that the PMF is valid.
First, let’s identify the sample space. There are \(2^4 = 16\) possible sequences of heads and tails over four flips. However, rather than working with all 16 sequences individually, we can group them based on the number of heads:
H = 0: Only one sequence has zero heads (all tails: TTTT)
H = 1: Four sequences have exactly one head (HTTT, THTT, TTHT, TTTH)
H = 2: Six sequences have exactly two heads
H = 3: Four sequences have exactly three heads
H = 4: Only one sequence has all four heads (HHHH)
Using the independence of the coin flips and the given probabilities,
P(H = 0) = P(TTTT) = (0.3)⁴ = 0.0081
P(H = 1) = 4 × (0.3)³ × (0.7) = 0.0756
P(H = 2) = 6 × (0.3)² × (0.7)² = 0.2646
P(H = 3) = 4 × (0.3) × (0.7)³ = 0.4116
P(H = 4) = (0.7)⁴ = 0.2401
All probabilities are between 0 and 1, satisfying the first condition for validity. The probabilities also sum to 1: 0.0081 + 0.0756 + 0.2646 + 0.4116 + 0.2401 = 1. This gives us the complete PMF for our random variable H.

Fig. 5.2 Probability mass function for the number of heads in four flips

Fig. 5.3 Visualization of the PMF for the number of heads in four flips
The PMF reveals that getting three heads is the most likely outcome, with a probability of approximately 0.41, while getting zero heads is very unlikely, with a probability of only about 0.008.
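The same PMF can be reproduced numerically in two ways: by brute force over all 16 sequences, and by the counting shortcut used in the example. The variable names below are illustrative assumptions.

```python
from itertools import product
from math import comb, isclose

p_heads = 0.7
n_flips = 4

# Brute force: add up the probability of every sequence, grouped by head count.
pmf_enum = {h: 0.0 for h in range(n_flips + 1)}
for seq in product("HT", repeat=n_flips):
    h = seq.count("H")
    pmf_enum[h] += p_heads**h * (1 - p_heads)**(n_flips - h)

# Counting shortcut: (number of sequences with h heads) x (probability of one of them).
pmf_count = {
    h: comb(n_flips, h) * p_heads**h * (1 - p_heads)**(n_flips - h)
    for h in range(n_flips + 1)
}

for h in range(n_flips + 1):
    print(h, round(pmf_count[h], 4))
print(round(sum(pmf_count.values()), 10))  # total probability
```

Both dictionaries agree up to floating-point rounding, which confirms that grouping the sequences by head count loses no probability.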
B. Completing a Partially Known PMF
Completing a partially specified PMF is a common task in statistics. Typical scenarios include:
The probability is unknown for one value in the support.
Multiple probabilities are unknown, with additional constraints provided.
The coefficient \(k\) that turns a non-negative function \(f(x)\) into a valid PMF \(p_X(x) = kf(x)\) is unknown. This constant \(k\) is called the normalization constant.
In all such cases, we must “fill in the blanks” by applying the conditions of a valid PMF.
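For the third scenario, the normalization constant is the reciprocal of the sum of \(f(x)\) over the support. The sketch below uses a hypothetical function \(f(x) = x^2\) on the support \(\{1, 2, 3, 4\}\), which is not taken from the chapter, and exact fractions to avoid rounding.

```python
from fractions import Fraction

def normalize(f, support):
    """Return (k, pmf) such that pmf[x] = k * f(x) sums to 1 over the support."""
    total = sum(Fraction(f(x)) for x in support)
    k = 1 / total
    return k, {x: k * f(x) for x in support}

# Hypothetical example: f(x) = x^2 on the support {1, 2, 3, 4}.
support = [1, 2, 3, 4]
k, pmf = normalize(lambda x: x**2, support)

print(k)                   # normalization constant
print(pmf)                 # resulting PMF as exact fractions
print(sum(pmf.values()))   # total probability
```

This only works when \(f\) is non-negative and its sum over the support is positive and finite; otherwise no valid \(k\) exists.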
Example💡: Finding the normalization constant
Consider a potential PMF:
To make this a valid PMF, we need to find the value of k that ensures the probabilities sum to 1:
Multiplying both sides by 64 and solving for \(k\),
Therefore, the valid PMF is:
C. Calculating Probabilities with PMFs
Once we have a complete PMF, we can calculate probabilities for various events related to the random variable.
Viewing events as equalities and inequalities involving a random variable, we can express probabilities of unions, intersections, and complements concisely in terms of \(p_X(x)\). Let us first get some practice writing probability statements correctly in terms of \(X\).
Example: Consider a random variable \(X\) which has the set of positive integers as its support.
Probability statements for discrete RVs
Description | Expression in terms of \(p_X(x)\) | Comment |
---|---|---|
Probability that X is less than 4 | \[\begin{split}&P(X < 4) \\ &= P(X=1 \text{ OR } X=2 \text{ OR } X=3) \\ &= P(X=1 \cup X=2 \cup X=3)\\ &= P(X=1) + P(X=2) + P(X=3)\\ &= p_X(1) + p_X(2) + p_X(3)\end{split}\] | The transition from the third to the fourth line works because the events \(\{X=x\}\) are disjoint for different values of \(x.\) |
Probability that X is less than 4 and at least 2 | \[\begin{split}&P(X < 4 \cap X \geq 2)\\ &= P(2 \leq X < 4)\\ & = p_X(2) + p_X(3)\end{split}\] | For intersections and unions of non-disjoint events, think of ways to combine the two separate (in)equalities into one. |
Probability that X is at least 4 or greater than 6 | \[\begin{split}&P(X \geq 4 \cup X>6) \\ &= P(X \geq 4) \\ &= 1 - P(X < 4)\end{split}\] | To compute \(P(X \geq 4)\) directly, we would have to sum infinitely many terms. Using the complement rule simplifies computation. |
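The three rows of the table can be tried out on a concrete PMF. The sketch below assumes a hypothetical PMF \(p_X(x) = (1/2)^x\) on the positive integers, chosen only because it sums to 1 over that support; it is not a PMF used elsewhere in this chapter.

```python
from fractions import Fraction

def p_X(x: int) -> Fraction:
    """Hypothetical PMF on the positive integers: p_X(x) = (1/2)^x."""
    return Fraction(1, 2**x) if x >= 1 else Fraction(0)

# P(X < 4): a finite sum over the disjoint events {X=1}, {X=2}, {X=3}.
p_less_than_4 = p_X(1) + p_X(2) + p_X(3)

# P(X < 4 and X >= 2): combine the two inequalities into 2 <= X < 4.
p_between_2_and_3 = p_X(2) + p_X(3)

# P(X >= 4 or X > 6) = P(X >= 4): use the complement instead of an infinite sum.
p_at_least_4 = 1 - p_less_than_4

print(p_less_than_4, p_between_2_and_3, p_at_least_4)
```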
Now, let us apply these skills to solve a problem.
Example💡: Computing probabilities using PMF
Using the PMF we just derived, let’s calculate some probabilities.
The probability that X is even:
\[\begin{split}P(X \text{ is even}) &= P(X = 0) + P(X = 2) + P(X = 4) + P(X = 6) \\ &= 1/4 + 1/8 + 1/8 + 1/16 = 9/16\end{split}\]
The probability that X is greater than 3:
\[\begin{split}P(X > 3) &= P(X = 4) + P(X = 5) + P(X = 6) \\ &= 1/8 + 1/16 + 1/16 = 1/4\end{split}\]
Are the events “X = 5 or X = 6” and “X > 3” independent?
To decide whether two events \(A\) and \(B\) are independent, we check whether they meet the definition of independence. That is, we check whether \(P(A|B) = P(A)\) or, equivalently, \(P(B|A) = P(B).\)
\[\begin{split}P(X = 5 \text{ or } X = 6 | X > 3) &= \frac{P((X = 5 \cup X = 6) \cap (X > 3))}{P(X > 3)}\\ &= \frac{P(X = 5 \cup X = 6)}{P(X > 3)}\\ &= (1/16 + 1/16)/(1/4) = 1/2\\ P(X = 5 \cup X = 6) &= 1/16 + 1/16 = 1/8\end{split}\]
Since 1/2 ≠ 1/8, these events are not independent.
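The independence check can be repeated with exact arithmetic. The sketch below uses only the three probabilities quoted in the calculation above, \(p_X(4) = 1/8\) and \(p_X(5) = p_X(6) = 1/16\); the dictionary `p` and the helper `prob` are names introduced here for illustration.

```python
from fractions import Fraction

# Only the probabilities that appear in the worked calculation are needed.
p = {4: Fraction(1, 8), 5: Fraction(1, 16), 6: Fraction(1, 16)}

A = {5, 6}      # the event "X = 5 or X = 6"
B = {4, 5, 6}   # the event "X > 3"

def prob(event: set) -> Fraction:
    """Probability of an event given as a set of values of X."""
    return sum(p[x] for x in event)

p_A = prob(A)
p_B = prob(B)
p_A_and_B = prob(A & B)        # A is contained in B, so this equals P(A)
p_A_given_B = p_A_and_B / p_B

print(p_A_given_B, p_A)        # conditional vs. unconditional probability
independent = p_A_and_B == p_A * p_B
print(independent)
```

Comparing \(P(A \cap B)\) with \(P(A)P(B)\) is equivalent to comparing \(P(A|B)\) with \(P(A)\) whenever \(P(B) > 0\), and it avoids a division.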
5.1.6. Bringing It All Together
Key Takeaways 📝
Random variables map outcomes from the sample space to numerical values, allowing us to focus on quantities of interest rather than complex sets.
Discrete random variables take on countable values and are typically used when counting things, while continuous random variables can take any value in a continuum and are used for measurements.
A probability mass function (PMF) specifies the probability that a discrete random variable equals each possible value in its support.
Valid PMFs must satisfy two conditions:
all probabilities are between 0 and 1, and
the sum of all probabilities equals 1.
We can calculate probabilities for various events by rewriting the probability statements in terms of the PMF.
Exercises
Terminology Check: Explain the difference between a discrete and a continuous random variable. Give two examples of each that were not mentioned in the chapter.
Dice Sum: Two fair dice are rolled. Let X be the random variable that represents the sum of the two values.
What is the support of X?
Construct the PMF for X.
Find P(X is odd).
Find P(X > 8).
Card Draw: A card is drawn randomly from a standard deck. Define the random variable X as follows:
X = 1 if the card is an ace
X = 11 if the card is a face card (jack, queen, or king)
X = the number on the card for all other cards (2 through 10)
Construct the PMF for X.
What is P(X ≥ 5)?
Find P(X = 11 | X > 5).
PMF Validation: Determine if the following functions are valid PMFs for a discrete random variable X. If not, explain why.
p_X(x) = 0.2 for x = 1, 2, 3, 4, 5
p_X(x) = x/15 for x = 1, 2, 3, 4, 5
p_X(x) = 1/(x+1) for x = 1, 2, 3, 4
Independence Check: For the biased coin example in the chapter (with P(Heads) = 0.7), let H be the random variable counting the number of heads in four flips.
Are the events “H is odd” and “H > 2” independent? Show your work.
Find two other events defined in terms of H that are independent.