.. _5-1-discrete-rvs-and-pmfs:

Discrete Random Variables and Probability Mass Distributions
============================================================================

In previous chapters, we used set theory to describe events and their probabilities. While this approach provides a rigorous foundation, it can become cumbersome when dealing with complex scenarios. Random variables offer a more elegant solution by mapping outcomes directly to numbers.

.. admonition:: Road Map 🧭
   :class: important

   * Define **random variables** as functions that map real-life events of arbitrary complexity to their numerical representations.
   * Distinguish between **discrete** and **continuous** random variables.
   * Formalize **probability mass functions** (PMFs) for discrete random variables.
   * Apply PMFs to calculate probabilities for complex events.

Random Variables: From Sets to Numbers
-----------------------------------------

Definition
~~~~~~~~~~~~~

A random variable (RV) :math:`X` is a function that maps each outcome :math:`\omega` in the sample space :math:`\Omega` to a numerical value. Formally, :math:`X: \Omega \to \mathbb{R}`.

Why is a random variable needed?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Outcomes of random experiments are often multi-faceted and tend to introduce more complexity than necessary. For example, suppose we flip a coin 10 times and count how many heads appear in the sequence. The complete sample space of ten coin flips contains :math:`2^{10} = 1{,}024` different possible sequences. However, if we're only interested in the total number of heads, we do not need to examine each sequence individually. For instance, instead of interpreting 'HHHHHHHHTH' as a unique sequence, we can view it simply as an outcome that yields the numerical value 9.

This is where a random variable becomes useful. We can define a random variable, say :math:`X`, to map the outcome 'HHHHHHHHTH' to a numerical value that reflects the focus of our interest:

.. math::

   X(\text{HHHHHHHHTH}) = 9,

and all other 1,023 outcomes in a similar manner. By using a random variable, we reduced our focus from 1,024 sequences to just 11 possible values (0 through 10).

Expressing events with random variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One of the key advantages of introducing a random variable is **conciseness**. Once an appropriate random variable is defined, most events can be expressed as equalities or inequalities involving the variable. See the table below for some examples:

.. flat-table::
   :align: center
   :header-rows: 1
   :widths: 2 2 1

   * - Description
     - Using set notations
     - Using random variable :math:`X`

   * - Event that there are three heads in the sequence
     - Define :math:`A_3` as the name of the event. List all sequences with three heads in :math:`A_3 = \{\cdots\}`.
     - :math:`X=3`

   * - Event that there are more than 7 heads in the sequence
     - Define :math:`A_8, A_9, A_{10}` as the events of sequences with 8, 9, and 10 heads, respectively. The event of interest is :math:`A_8 \cup A_9 \cup A_{10}`.
     - :math:`X > 7`

We no longer need to define a new event for every new question. Instead, we can express various situations compactly using the random variable :math:`X`.

.. Formally, this notation represents:

   .. math::

      P(X = x) = P(\{\omega \in \Omega \mid X(\omega) = x\})

   This means the probability that X equals some value x is the probability of the set of all outcomes ω in the sample space Ω such that when we apply our random variable function X to ω, we get the value x.

Types of Random Variables
-----------------------------

Random variables fall into two main categories based on the nature of their possible values:

Discrete Random Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A random variable is discrete if it can take on only a countable number of possible values, that is, a set of values that is finite or can be listed one by one.

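To make this concrete, the ten-flip example above can be sketched in a few lines of Python. This is only an illustration: the function name ``X`` mirrors the random variable in the text, while ``sample_space``, ``values``, and the two event lists are names chosen here for the sketch. It enumerates all 1,024 outcomes, applies the random variable to each one, and shows that only 11 distinct values result.

```python
from itertools import product


def X(outcome: str) -> int:
    """Random variable from the text: number of heads in a flip sequence."""
    return outcome.count("H")


# Sample space: every sequence of H/T of length 10 (2**10 outcomes).
sample_space = ["".join(flips) for flips in product("HT", repeat=10)]

# The distinct values X can take: a finite, hence countable, set.
values = sorted({X(w) for w in sample_space})

# Events written as conditions on X rather than as hand-listed sets.
event_three_heads = [w for w in sample_space if X(w) == 3]
event_more_than_7 = [w for w in sample_space if X(w) > 7]

print(len(sample_space))        # 1024
print(X("HHHHHHHHTH"))          # 9
print(values)                   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(len(event_three_heads))   # 120 sequences make up the event X = 3
print(len(event_more_than_7))   # 56 sequences make up the event X > 7
```

The events :math:`X = 3` and :math:`X > 7` are still sets of outcomes underneath; the random variable simply gives a compact way to name them.
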
Discrete random variables typically arise when **counting** things, such as:

* The number of heads in coin flips
* The number of times someone swipes right out of 100 profile views
* The number of website hits during a specific time period
* The number of customers until the first big-ticket item is sold

Continuous Random Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A random variable is continuous if it can take on any value within a continuous range or interval. Continuous random variables typically arise when **measuring** quantities, such as:

* Height, weight, or other physical measurements
* Time until a particular event occurs
* Temperature, pressure, or other environmental measurements

Probability Distributions
----------------------------------------

To describe the probabilistic behavior of a random variable, we must specify the probabilities associated with all its possible values. This complete description is called a probability distribution.

Discrete and continuous random variables have different types of probability distributions. A discrete random variable is described by a **probability mass function** (PMF), while a continuous random variable is described by a **probability density function** (PDF).

In this chapter, we will focus on discrete random variables and PMFs. As we progress through the course, we will see how PMFs and PDFs share some foundational ideas, while differing in important ways.

Probability Mass Functions
------------------------------

Definition
~~~~~~~~~~~~~

The probability mass function of a discrete random variable :math:`X` is denoted by :math:`p_X`. For each possible value :math:`x` that :math:`X` can take, it gives

.. math::

   p_X(x) = P(X = x).

Different forms of a PMF
~~~~~~~~~~~~~~~~~~~~~~~~~

A PMF can be represented in several different forms:

1. A PMF can be organized into a **table** by listing the possible values with their corresponding probabilities.

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter5/table-pmf-general.png
      :alt: A table form of PMF
      :align: center
      :figwidth: 60%

      Example of a PMF in table form

2. Dot plots or bar graphs that display the probability of each possible value can serve as visual representations of a PMF. However, they are not typically used on their own, as it can be difficult to determine exact probabilities unless the plot is very simple.

3. For some special random variables, a mathematical formula is used to describe the PMF. For example,

   .. math::

      p_X(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \text{ for } x = 0, 1, 2, \ldots

   is a PMF for any fixed :math:`\lambda > 0`.

Support
~~~~~~~~~~~~~

The **support** of a discrete random variable is the set of all possible values that have a positive probability:

.. math::

   \text{supp}(X) = \{x \in \mathbb{R} \mid p_X(x) > 0\}.

Validity of a PMF
~~~~~~~~~~~~~~~~~~~~~

For a probability mass function to be **valid**, the following conditions must be satisfied:

1. **Non-negativity**

   For all :math:`x`, :math:`0 \leq p_X(x) \leq 1.`

2. **Total probability of 1**

   The sum of probabilities over all values in the support must equal 1:

   .. math::

      \sum_{x \in \text{supp}(X)} p_X(x) = 1

Important Types of Problems Involving PMFs
------------------------------------------------

A. Constructing a PMF from Scratch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is an important skill for statisticians to be able to "translate" descriptions of a random experiment in plain language to mathematical language involving a random variable and its PMF.

.. admonition:: Example💡: Flipping a Biased Coin
   :class: note

   Let us try constructing a PMF from scratch, only using descriptions of the experimental setting.

   Suppose we flip a biased coin four times, where the probability of heads on each flip is 0.7 (and tails is 0.3). We define a random variable **H to count the number of heads in the four flips**. Find the complete PMF for H. Verify that the PMF is valid.

   First, let's identify the sample space. There are :math:`2^4 = 16` possible sequences of heads and tails over four flips. However, rather than working with all 16 sequences individually, we can group them based on the number of heads:

   * H = 0: Only one sequence has zero heads (all tails: TTTT)
   * H = 1: Four sequences have exactly one head (HTTT, THTT, TTHT, TTTH)
   * H = 2: Six sequences have exactly two heads
   * H = 3: Four sequences have exactly three heads
   * H = 4: Only one sequence has all four heads (HHHH)

   Because the flips are independent, every sequence with the same number of heads has the same probability. Multiplying that probability by the number of such sequences gives:

   * P(H = 0) = P(TTTT) = (0.3)⁴ = 0.0081
   * P(H = 1) = 4 × (0.3)³ × (0.7) = 0.0756
   * P(H = 2) = 6 × (0.3)² × (0.7)² = 0.2646
   * P(H = 3) = 4 × (0.3) × (0.7)³ = 0.4116
   * P(H = 4) = (0.7)⁴ = 0.2401

   All probabilities are between 0 and 1, satisfying the first condition for validity. The probabilities also sum to 1: 0.0081 + 0.0756 + 0.2646 + 0.4116 + 0.2401 = 1. This gives us the complete PMF for our random variable H.

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter5/coin-table-pmf.png
      :alt: PMF for biased coin example in table format
      :align: center
      :width: 70%

      Probability mass function for the number of heads in four flips

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter5/coin-graph-pmf.png
      :alt: PMF for biased coin example in graph format
      :align: center
      :width: 90%

      Visualization of the PMF for the number of heads in four flips

   The PMF reveals that getting three heads is the most likely outcome, with a probability of approximately 0.41, while getting zero heads is very unlikely, with a probability of only about 0.008.

B. Completing a Partially Known PMF
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Completing a partially specified PMF is a common task in statistics. Typical scenarios include:

* The probability is unknown for one value in the support.
* Multiple probabilities are unknown, with additional constraints provided.
* The coefficient :math:`k` that turns a non-negative function :math:`f(x)` into a valid PMF :math:`p_X(x) = kf(x)` is unknown. This constant :math:`k` is called the **normalization constant**.

In all such cases, we must "fill in the blanks" by applying the conditions of a valid PMF.

.. admonition:: Example💡: Finding the normalization constant
   :class: note

   Consider a potential PMF:

   .. math::

      p_X(x) = \begin{cases}
      \frac{k}{16} & \text{for } x = 0, 1 \\
      \frac{k}{32} & \text{for } x = 2, 3, 4 \\
      \frac{k}{64} & \text{for } x = 5, 6\\
      0 & \text{for all other } x
      \end{cases}

   To make this a valid PMF, we need to find the value of :math:`k` that ensures the probabilities sum to 1:

   .. math::

      \frac{k}{16} + \frac{k}{16} + \frac{k}{32} + \frac{k}{32} + \frac{k}{32} + \frac{k}{64} + \frac{k}{64} = 1

   Multiplying both sides by 64 and solving for :math:`k`,

   .. math::

      &4k + 4k + 2k + 2k + 2k + k + k = 64\\
      &16k = 64 \\
      &k = 4

   Therefore, the valid PMF is:

   .. math::

      p_X(x) = \begin{cases}
      \frac{1}{4} & \text{for } x = 0, 1 \\
      \frac{1}{8} & \text{for } x = 2, 3, 4 \\
      \frac{1}{16} & \text{for } x = 5, 6\\
      0 & \text{for all other } x
      \end{cases}

C. Calculating Probabilities with PMFs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once we have a complete PMF, we can calculate probabilities for various events related to the random variable. Viewing events as equalities and inequalities involving a random variable, we can express probabilities of unions, intersections, and complements concisely in terms of :math:`p_X(x)`.

Let us first get some practice writing probability statements correctly in terms of :math:`X`.

**Example**: Consider a random variable :math:`X` which has the set of **positive integers as its support**.

.. flat-table::
   :header-rows: 2
   :align: center
   :widths: 1 2 2

   * - :cspan:`2` Probability statements for discrete RVs

   * - Description
     - Expression in terms of :math:`p_X(x)`
     - Comment

   * - Probability that X is less than 4
     - .. math::

          &P(X < 4) \\
          &= P(X=1 \text{ OR } X=2 \text{ OR } X=3) \\
          &= P(X=1 \cup X=2 \cup X=3)\\
          &= P(X=1) + P(X=2) + P(X=3)\\
          &= p_X(1) + p_X(2) + p_X(3)
     - The transition from the third to the fourth line works because the events :math:`\{X=x\}` are **disjoint** for different values of :math:`x.`

   * - Probability that X is less than 4 **and** at least 2
     - .. math::

          &P(X < 4 \cap X \geq 2)\\
          &= P(2 \leq X < 4)\\
          &= p_X(2) + p_X(3)
     - For intersections and unions of non-disjoint events, think of ways to combine the two separate (in)equalities into one.

   * - Probability that X is at least 4 **or** greater than 6
     - .. math::

          &P(X \geq 4 \cup X > 6) \\
          &= P(X \geq 4) \\
          &= 1 - P(X < 4)
     - To compute :math:`P(X \geq 4)` directly, we would have to sum infinitely many terms. Using the complement rule simplifies computation.

Now, let us apply these skills to solve a problem.

.. admonition:: Example💡: Computing probabilities using PMF
   :class: note

   Using the PMF we just derived, let's calculate some probabilities.

   .. math::

      p_X(x) = \begin{cases}
      \frac{1}{4} & \text{for } x = 0, 1 \\
      \frac{1}{8} & \text{for } x = 2, 3, 4 \\
      \frac{1}{16} & \text{for } x = 5, 6\\
      0 & \text{for all other } x
      \end{cases}

   1. The probability that X is even:

      .. math::

         P(X \text{ is even}) &= P(X = 0) + P(X = 2) + P(X = 4) + P(X = 6) \\
         &= 1/4 + 1/8 + 1/8 + 1/16 = 9/16

   2. The probability that X is greater than 3:

      .. math::

         P(X > 3) &= P(X = 4) + P(X = 5) + P(X = 6) \\
         &= 1/8 + 1/16 + 1/16 = 1/4

   3. Are the events "X = 5 or X = 6" and "X > 3" **independent**?

      To show independence between two events :math:`A` and :math:`B`, we must show that they meet the definition of independence. That is, we must show :math:`P(A|B) = P(A)` or :math:`P(B|A) = P(B).`

      .. math::

         P(X = 5 \text{ or } X = 6 | X > 3) &= \frac{P((X = 5 \cup X = 6) \cap (X > 3))}{P(X > 3)}\\
         &= \frac{P(X = 5 \cup X = 6)}{P(X > 3)}\\
         &= (1/16 + 1/16)/(1/4) = 1/2\\
         P(X = 5 \cup X = 6) &= 1/16 + 1/16 = 1/8

      Since 1/2 ≠ 1/8, these events are not independent.

Bringing It All Together
----------------------------

.. admonition:: Key Takeaways 📝
   :class: important

   1. **Random variables** map outcomes from the sample space to numerical values, allowing us to focus on quantities of interest rather than complex sets.
   2. **Discrete random variables** take on countable values and are typically used when counting things, while **continuous random variables** can take any value in a continuum and are used for measurements.
   3. A **probability mass function (PMF)** specifies the probability that a discrete random variable equals each possible value in its support.
   4. Valid PMFs must satisfy two conditions: (1) all probabilities are between 0 and 1, and (2) the sum of all probabilities equals 1.
   5. We can calculate probabilities for various events by rewriting the probability statements in terms of the PMF.

Exercises
~~~~~~~~~~~~~~~~

1. **Terminology Check**: Explain the difference between a discrete and a continuous random variable. Give two examples of each that were not mentioned in the chapter.

2. **Dice Sum**: Two fair dice are rolled. Let X be the random variable that represents the sum of the two values.

   a) What is the support of X?
   b) Construct the PMF for X.
   c) Find P(X is odd).
   d) Find P(X > 8).

3. **Card Draw**: A card is drawn randomly from a standard deck. Define the random variable X as follows:

   * X = 1 if the card is an ace
   * X = 11 if the card is a face card (jack, queen, or king)
   * X = the number on the card for all other cards (2 through 10)

   a) Construct the PMF for X.
   b) What is P(X ≥ 5)?
   c) Find P(X = 11 | X > 5).

4. **PMF Validation**: Determine if the following functions are valid PMFs for a discrete random variable X. If not, explain why.


   a) p_X(x) = 0.2 for x = 1, 2, 3, 4, 5
   b) p_X(x) = x/15 for x = 1, 2, 3, 4, 5
   c) p_X(x) = 1/(x+1) for x = 1, 2, 3, 4

5. **Independence Check**: For the biased coin example in the chapter (with P(Heads) = 0.7), let H be the random variable counting the number of heads in four flips.

   a) Are the events "H is odd" and "H > 2" independent? Show your work.
   b) Find two other events defined in terms of H that are independent.
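
The worked examples in this chapter can also be checked numerically. The sketch below is one possible way to do so, not a prescribed method: the helper names ``is_valid_pmf`` and ``prob`` and the dictionary representation of a PMF are assumptions made for illustration. It rebuilds the biased coin PMF by enumerating all 16 sequences, recovers the normalization constant :math:`k = 4`, and recomputes the probabilities from the last example. Independence is tested with the product rule :math:`P(A \cap B) = P(A)P(B)`, which is equivalent to :math:`P(A|B) = P(A)` whenever :math:`P(B) > 0`.

```python
from fractions import Fraction
from itertools import product
import math


def is_valid_pmf(pmf, tol=1e-12):
    """Check both validity conditions for a PMF stored as {value: probability}."""
    in_range = all(0 <= p <= 1 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1, abs_tol=tol)
    return in_range and sums_to_one


def prob(pmf, event):
    """P(event), where event is a true/false condition on the value x."""
    return sum(p for x, p in pmf.items() if event(x))


# A. Biased coin: add up the probability of every sequence, grouped by heads.
p_heads = Fraction(7, 10)
coin_pmf = {}
for flips in product("HT", repeat=4):
    h = flips.count("H")
    seq_prob = p_heads**h * (1 - p_heads) ** (4 - h)
    coin_pmf[h] = coin_pmf.get(h, 0) + seq_prob

# B. Normalization constant: the unnormalized weights are the k-free parts.
weights = {0: Fraction(1, 16), 1: Fraction(1, 16),
           2: Fraction(1, 32), 3: Fraction(1, 32), 4: Fraction(1, 32),
           5: Fraction(1, 64), 6: Fraction(1, 64)}
k = 1 / sum(weights.values())
pmf_x = {x: k * w for x, w in weights.items()}

# C. Probabilities of events expressed through X.
p_even = prob(pmf_x, lambda x: x % 2 == 0)
p_gt3 = prob(pmf_x, lambda x: x > 3)
p_a = prob(pmf_x, lambda x: x in (5, 6))
p_a_and_b = prob(pmf_x, lambda x: x in (5, 6) and x > 3)
independent = p_a_and_b == p_a * p_gt3

print(is_valid_pmf(coin_pmf), float(coin_pmf[3]))  # True 0.4116
print(k)                                           # 4
print(p_even, p_gt3)                               # 9/16 1/4
print(independent)                                 # False
```

Exact fractions are used so that the sums match the hand calculations without floating-point rounding.
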