.. _6-1-continuous-rvs-and-pdfs:
Continuous Random Variables and Probability Density Functions
==================================================================================

What happens when we want to model **measurements** rather than counts? How do we handle quantities like height, weight, temperature, or time—variables that can take on any value within a continuous range? This section explores the shift in mathematical framework that occurs when we move from the discrete world of "**how many**" to the continuous world of "**how much**."

.. admonition:: Road Map 🧭
   :class: important

   • Understand why **continuous random variables** require a different mathematical approach than discrete ones.
   • Define **probability density functions (PDFs)** as the continuous analog to probability mass functions.
   • Master the crucial concept that **probabilities are areas** under the PDF.
   • Learn the essential **properties** that make a function a valid PDF.
   • Find probabilities for continuous random variables by integrating the PDF.

Discrete vs. Continuous: The Key Distinction
-----------------------------------------------------------

For random variables with a discrete support, we could assign **positive probabilities to individual outcomes**. It made perfect sense to say "the probability of getting exactly 3 heads in 10 coin flips is some specific value" because 3 was one of only eleven distinct outcomes (0 through 10 heads). Even when the support is **countably infinite**—as in the Poisson distribution—we could still assign probabilities so that each value in the support has a positive probability (however small) while the total sum remains 1.

But many real-world phenomena involve measurements along a continuous scale, which has a vastly larger support than that of any discrete random variable. While we might record a person's height as "5 feet 8 inches," the actual height could be 5.75000... feet or 5.750001... feet.
Between any two measurements, no matter how close, there are **uncountably many** possible intermediate values. If we tried to assign positive probabilities to each possible height—no matter how cleverly—we would end up with an infinite total, violating the fundamental requirement that all probabilities sum (or integrate) to 1.

The Resolution: Zero Probability for Any Single Point
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We resolve this paradox by accepting that **any single exact value has probability zero** for continuous random variables. The probability that someone's height is exactly 5.750000000... feet, with infinite precision, is zero, even though this height is perfectly possible.

This might seem counterintuitive at first. How can something be possible but have zero probability? Recall that probability can be seen as the relative size of an event compared to the whole. In the continuous case, we are dealing with uncountably many possible values packed into any interval—so many that any single point is negligible in comparison. This makes the *relative size*, and thus the probability, of any one value equal to zero.

Then what has a positive probability?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For continuous random variables, we discuss probabilities of *intervals* of values rather than single points. This aligns naturally with the graphical interpretation of probabilities as areas under a curve—regions with non-zero width have a positive area, while a single point always has zero area.

Probability Density Functions: The Continuous Analog of Probability Mass Functions
-------------------------------------------------------------------------------------------------

From Histograms to Curves
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since we cannot assign probabilities to individual points, we need a different approach from the PMF to describe the distribution of a continuous random variable.
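Before turning to density curves, the point-versus-interval distinction can be made concrete with a quick simulation. The sketch below is an illustration written for this section, not part of the original text: drawing repeatedly from a continuous Uniform(0, 1) distribution, we essentially never observe one exact value, while an interval of positive width is hit with a frequency matching its width.

```python
import random

# Illustration (not from the text): simulate a continuous Uniform(0, 1)
# random variable and compare a single point with an interval.
random.seed(42)  # fixed seed for reproducible draws
n = 100_000
samples = [random.random() for _ in range(n)]

# P(X = 0.5) = 0: an exact value is essentially never observed.
exact_hits = sum(1 for x in samples if x == 0.5)

# P(0.4 <= X <= 0.6) = 0.2: an interval of width 0.2 is hit about 20% of the time.
interval_hits = sum(1 for x in samples if 0.4 <= x <= 0.6)

print(exact_hits)         # 0
print(interval_hits / n)  # close to 0.2
```

The simulated frequencies mirror the theory: the single point contributes nothing, while the interval's relative frequency approximates its probability.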
We can think of a continuous probability distribution as a curve that represents the **limiting behavior of increasingly fine histograms for an increasingly large dataset**. In :numref:`hist-to-pdf`, the jagged histogram begins to approximate a curve as we collect more data and make the bins narrower. In the limit—with infinite data and infinitesimally narrow bins—we get a **probability density function**.

.. _hist-to-pdf:

.. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/hist-to-pdf.png
   :alt: Evolution from histogram to probability density function
   :align: center
   :figwidth: 90%

   Evolution from a histogram to a probability density function

Mathematical Definition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A **probability density function (PDF)** for a continuous random variable :math:`X`, denoted :math:`f_X(x)`, specifies how "dense" the probability is around each point. Mathematically,

.. math:: f_X(x) = \lim_{\Delta \to 0^+} \frac{P(x < X \leq x + \Delta)}{\Delta}.

Interpreting a PDF
~~~~~~~~~~~~~~~~~~~~~

It is important to note that :math:`f_X` evaluated at a point :math:`x` tells us about the **relative** likelihood of values in that neighborhood. Suppose a random variable :math:`X` gives :math:`f_X(5.8) = 3` and :math:`f_X(6.2) = 1`. We observe that:

* Values in a small neighborhood around 5.8 are three times more likely to occur than values in a neighborhood of 6.2.
* :math:`f_X` does NOT give probabilities directly. :math:`f_X(6.2) = 1` does NOT mean that the exact value 6.2 occurs with probability 1.
* Evaluations of :math:`f_X` are not restricted to be at most 1. The full set of rules for a valid PDF is discussed below.

Support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/support.png
   :alt: Support shown on a PDF graph
   :align: center
   :figwidth: 60%

The **support** of a continuous random variable :math:`X` is the set of all possible values that :math:`X` can take, or equivalently, the set of values where its PDF is strictly positive:

.. math:: \text{supp}(X) = \{x \in \mathbb{R} \mid f_X(x) > 0\}.

Computing Probabilities: Areas Under the PDF
-----------------------------------------------------

.. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/prob-a-b.png
   :alt: Probability between a and b on a PDF
   :align: center
   :figwidth: 60%

For a continuous random variable :math:`X` with PDF :math:`f_X(x)`, the probability that :math:`X` takes a value between :math:`a` and :math:`b` is the area under :math:`f_X(x)` between the points :math:`a` and :math:`b`. Mathematically,

.. math:: P(a \leq X \leq b) = \int_a^b f_X(x) \, dx.

Special Case: Probability at a Single Point
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/fx-not-prob.png
   :alt: Probability of a single point
   :align: center
   :figwidth: 60%

For any specific value :math:`a`,

.. math:: P(X = a) = P(a \leq X \leq a) = \int_a^a f_X(x) \, dx = 0.

Any integral from a point to itself is zero because the interval has zero width. Note that :math:`f_X(a)` is positive at any :math:`a` in the support. This again highlights the fact that evaluating :math:`f_X` at a point does **not** directly give a probability.

An Important Consequence: Equality Doesn't Matter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Because any single point has probability zero, it does not matter whether we use strict inequalities or include equality:
.. math:: P(a \leq X \leq b) = P(a \leq X < b) = P(a < X \leq b) = P(a < X < b)

This is a major difference from discrete random variables, where :math:`P(X = k)` could be positive, making the choice between :math:`<` and :math:`\leq` crucial.

Properties of a Valid Probability Density Function
---------------------------------------------------------

Not every function can serve as a PDF. A valid PDF must satisfy two essential properties that parallel those required of discrete probability mass functions.

Property 1: Non-Negativity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PDF must be non-negative everywhere. That is,

.. math:: f_X(x) \geq 0 \text{ for all } x.

This makes intuitive sense—there cannot be a likelihood smaller than none (zero). However, unlike discrete PMFs, PDFs are not constrained to values less than or equal to 1. They can take arbitrarily large values at some points, as long as they satisfy the next property.

Property 2: Total Area Equals One
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The total area under the PDF must equal 1, or equivalently,

.. math:: \int_{-\infty}^{\infty} f_X(x) \, dx = 1.

This implies that the total probability of observing any value within the support is equal to 1.

.. admonition:: Example💡: Validating and Working with a Simple PDF
   :class: note

   Suppose **the maximum diameter of a potato chip** (:math:`X`) produced at Factory A, in inches, follows the probability density

   .. math::

      f_X(x) = \begin{cases} 12(x-0.5)^2(1.5-x), & 0.5 \leq x \leq 1.5\\ 0, & \text{otherwise} \end{cases}

   1. Identify the support of :math:`X`, sketch :math:`f_X(x)`, then verify that it is a legitimate PDF.

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/potato-chip-pdf.png
      :alt: Sketch of the PDF of the maximum diameter of a potato chip
      :align: center
      :width: 60%

      Sketch of the PDF of the maximum diameter of potato chips

   :math:`\text{supp}(X) = [0.5, 1.5]`.
   :math:`f_X(x)` takes a non-negative value at every :math:`x \in \mathbb{R}`, as is evident from the sketch. We now verify that the integral of the PDF from :math:`-\infty` to :math:`\infty` equals 1:

   .. math::

      \int_{-\infty}^\infty f_X(x)\,dx &= \int_{-\infty}^{0.5} f_X(x)\,dx + \int_{0.5}^{1.5}f_X(x)\,dx + \int_{1.5}^{\infty} f_X(x)\,dx\\
      &= 0 + \int_{0.5}^{1.5} 12(x-0.5)^2(1.5-x)\, dx + 0

   Above, we first split the integral into a sum of integrals over three intervals. This step makes it evident that the regions below :math:`0.5` and above :math:`1.5` are not arbitrarily omitted from the computation—they simply contribute zero area to the integral. Continuing,

   .. math::

      \int_{0.5}^{1.5} 12(x-0.5)^2(1.5-x)\, dx &= \int_{0.5}^{1.5} (-12x^3 + 30x^2 - 21x + 4.5)\,dx\\
      &= \frac{-12x^4}{4} + \frac{30x^3}{3} - \frac{21x^2}{2} + 4.5x \Bigg\rvert_{0.5}^{1.5}\\
      &= -3x^4 + 10x^3 - 10.5x^2 + 4.5x \Bigg\rvert_{0.5}^{1.5}\\
      &= 1.6875 - 0.6875 = 1.

   Therefore, :math:`f_X` satisfies both requirements for a valid PDF.

   2. For a quality control procedure, managers of the factory have collected all the potato chips whose maximum diameter is **smaller than 1"**. What is the probability that a randomly selected potato chip from this pool has a maximum diameter **greater than 0.8"**?

   The first task is to express the goal of the problem as a correct probability statement. Since the chips in this pool always have a maximum diameter less than 1,

   .. math:: P(X > 0.8 \mid X < 1) = \frac{P(\{X > 0.8\} \cap \{X < 1 \})}{P(X < 1)}.

   The diameter can be greater than 0.8 **and** less than 1 only if it is between the two values. We simplify the numerator accordingly:

   .. math:: P(X > 0.8 \mid X < 1) = \frac{P(0.8 < X < 1)}{P(X < 1)}.
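The areas in the example above can also be double-checked numerically. The following Python sketch is an illustration added here, not part of the original example; the midpoint-rule helper `integrate` is a hypothetical name written for this purpose. It approximates the total area under :math:`f_X` (which should be 1) and the two areas needed for the conditional probability :math:`P(X > 0.8 \mid X < 1)`.

```python
def f(x):
    """PDF of the maximum chip diameter from the example above."""
    if 0.5 <= x <= 1.5:
        return 12.0 * (x - 0.5) ** 2 * (1.5 - x)
    return 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

# Property 2: the total area over the support should be 1.
total_area = integrate(f, 0.5, 1.5)
print(round(total_area, 4))  # 1.0

# Conditional probability: P(X > 0.8 | X < 1) = P(0.8 < X < 1) / P(X < 1)
numerator = integrate(f, 0.8, 1.0)    # area between 0.8 and 1, about 0.2288
denominator = integrate(f, 0.5, 1.0)  # area below 1, about 0.3125
print(round(numerator / denominator, 4))  # 0.7322
```

The numeric answers agree with the exact ones obtained from the antiderivative :math:`-3x^4 + 10x^3 - 10.5x^2 + 4.5x` used in the example.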