6.1. Continuous Random Variables and Probability Density Functions
What happens when we want to model measurements rather than counts? How do we handle quantities like height, weight, temperature, or time—variables that can take on any value within a continuous range? This section explores the shift in mathematical framework that occurs when we move from the discrete world of “how many” to the continuous world of “how much.”
Road Map 🧭
Understand why continuous random variables require a different mathematical approach than discrete ones.
Define probability density functions (PDFs) as the continuous analog to probability mass functions.
Master the crucial concept that probabilities are areas under the PDF.
Learn the essential properties that make a function a valid PDF.
Find probabilities for continuous random variables by integrating the PDF.
6.1.1. Discrete vs. Continuous: The Key Distinction
For random variables with a discrete support, we could assign positive probabilities to individual outcomes. It made perfect sense to say “the probability of getting exactly 3 heads in 10 coin flips is some specific value” because 3 was one of only eleven distinct outcomes (0 through 10 heads). Even when the support is countably infinite—as in the Poisson distribution—we could still assign probabilities in such a way that each value in the support had a positive probability (however small) while the total sum remained 1.
But many real-world phenomena involve measurements along a continuous scale, which has a vastly larger support than that of any discrete random variable. While we might record a person’s height as “5 feet 8 inches,” the actual height could be 5.75000… feet or 5.750001… feet. Between any two measurements, no matter how close, there are uncountably many possible intermediate values. If we tried to assign positive probabilities to each possible height—no matter how cleverly—we would end up with an infinite total, violating the fundamental requirement that all probabilities sum (or integrate) to 1.
The Resolution: Zero Probability for Any Single Point
We resolve this paradox by accepting that any single exact value has probability zero for continuous random variables. The probability that someone’s height is exactly 5.750000000… feet with infinite precision is zero, even though this height is perfectly possible.
This might seem counterintuitive at first. How can something be possible but have zero probability? Recall that probability can be seen as the relative size of an event compared to the whole. In the continuous case, we’re dealing with uncountably many possible values packed into any interval, so many that any single point is negligible in comparison. This makes the relative size, and thus the probability, of any one value equal to zero.
Then what has a positive probability?
For continuous random variables, we discuss probabilities of intervals of values rather than single points. This aligns naturally with the graphical interpretation of probabilities as areas under a curve: regions with non-zero width have a positive area, while a single point always has zero area.
6.1.2. Probability Density Functions: The Continuous Analog of Probability Mass Functions
From Histograms to Curves
Since we can’t assign probabilities to individual points, we need a different approach from the PMF to describe the distribution of a continuous random variable. We can think of a continuous probability distribution as a curve that represents the limiting behavior of increasingly fine histograms built from increasingly large datasets.
In Fig. 6.1, the jagged histogram begins to approximate a curve as we collect more data and make the bins narrower. In the limit—with infinite data and infinitesimally narrow bins—we get a probability density function.

Fig. 6.1 Evolution from histogram to a probability density function
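The limiting process in Fig. 6.1 can be simulated directly. The short Python sketch below is an optional illustration, not part of the text; it assumes numpy, matplotlib, and scipy are available, and uses a standard normal distribution purely as an example. It draws increasingly large samples and overlays the true density on progressively finer histograms.

```python
# Optional simulation sketch: histograms of larger samples with narrower bins
# approach the smooth probability density curve (cf. Fig. 6.1).
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
x_grid = np.linspace(-4, 4, 400)

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, (n, bins) in zip(axes, [(100, 10), (10_000, 40), (1_000_000, 200)]):
    sample = rng.standard_normal(n)
    # density=True rescales bar heights so the histogram's total area is 1,
    # making it directly comparable to a PDF
    ax.hist(sample, bins=bins, density=True, alpha=0.6)
    ax.plot(x_grid, stats.norm.pdf(x_grid), lw=2)
    ax.set_title(f"n = {n:,}, {bins} bins")
plt.tight_layout()
plt.show()
```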
Mathematical Definition
A probability density function (PDF) for a continuous random variable \(X\), denoted \(f_X(x)\), specifies how “dense” the probability is around each point.
Mathematically,
\[f_X(x) = \lim_{\Delta \to 0^+} \frac{P(x \leq X \leq x + \Delta)}{\Delta}.\]
In other words, \(f_X(x)\) measures probability per unit length near \(x\).
Interpreting a PDF
It is important to note that \(f_X\) evaluated at a point \(x\) tells us about the relative likelihood of values in the neighborhood of \(x\). Suppose a random variable \(X\) has \(f_X(5.8) = 3\) and \(f_X(6.2) = 1\). We observe that:
Values in a small neighborhood around 5.8 are three times as likely to occur as values in a comparable neighborhood around 6.2.
\(f_X\) does NOT give probabilities directly. \(f_X(6.2) = 1\) does NOT mean that the exact value of 6.2 occurs with probability 1.
Evaluations of \(f_X\) are not restricted to be at most 1; the rules for a valid PDF are discussed below. (A short numerical illustration follows this list.)
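To make these points concrete, here is a small numerical sketch (my own illustration using scipy.stats; the narrow normal distribution is a hypothetical choice, not from the text). Its PDF exceeds 1 at the peak, yet every interval probability remains between 0 and 1.

```python
# Illustrative sketch: a density value is not a probability.
from scipy import stats

narrow = stats.norm(loc=0, scale=0.1)  # hypothetical distribution with a tall, narrow PDF
print(narrow.pdf(0))                   # ~3.99: a density value, perfectly allowed to exceed 1
print(narrow.cdf(0.05) - narrow.cdf(-0.05))  # P(-0.05 < X < 0.05) ~ 0.38: an actual probability
```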
Support

The support of a continuous random variable \(X\) is the set of all possible values that \(X\) can take, or equivalently, the set of values where its PDF is strictly positive:
\[\text{supp}(X) = \{x \in \mathbb{R} : f_X(x) > 0\}.\]
6.1.3. Computing Probabilities: Areas Under the PDF

For a continuous random variable \(X\) with PDF \(f_X(x)\), the probability that \(X\) takes a value between \(a\) and \(b\) is the area under \(f_X(x)\) between points \(a\) and \(b\). Mathematically,
\[P(a \leq X \leq b) = \int_a^b f_X(x)\,dx.\]
Special Case: Probability at a Single Point

For any specific value \(a\),
\[P(X = a) = \int_a^a f_X(x)\,dx = 0.\]
Any integral from a point to itself is zero because the interval has zero width. Note, however, that \(f_X(a)\) is positive for any \(a\) in the support. This again highlights the fact that evaluating \(f_X\) at a point does not directly give a probability.
An Important Consequence: Equality Doesn’t Matter
Because any single point has probability zero, it doesn’t matter whether we use strict inequalities or include equality:
\[P(a < X < b) = P(a \leq X < b) = P(a < X \leq b) = P(a \leq X \leq b).\]
This is a major difference from discrete random variables, where \(P(X = k)\) could be positive, making the choice between < and ≤ crucial.
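As an optional check of these facts, the sketch below numerically integrates a PDF (the standard normal, chosen only for illustration; scipy is assumed to be available): an interval of positive width yields a positive area, while a zero-width interval yields exactly zero.

```python
# Sketch: probabilities as areas under a PDF, computed by numerical integration.
from scipy import stats
from scipy.integrate import quad

f = stats.norm.pdf                 # stand-in PDF for illustration
area, _ = quad(f, -1, 1)           # P(-1 < X < 1): area over an interval
point, _ = quad(f, 0.5, 0.5)       # P(X = 0.5): an interval of zero width
print(area)    # ~0.6827
print(point)   # 0.0 -- a single point carries no probability
```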
6.1.4. Properties of a Valid Probability Density Function
Not every function can serve as a PDF. A valid PDF must satisfy two essential properties that parallel those required of discrete probability mass functions.
Property 1: Non-Negativity
The PDF must be non-negative everywhere. That is,
\[f_X(x) \geq 0 \quad \text{for all } x \in \mathbb{R}.\]
This makes intuitive sense—there cannot be a likelihood smaller than none (zero). However, unlike discrete PMFs, PDFs are not constrained to values less than or equal to 1. They can take arbitrarily large values at some points, as long as they satisfy the next property.
Property 2: Total Area Equals One
The total area under the PDF must equal 1, or equivalently,
\[\int_{-\infty}^{\infty} f_X(x)\,dx = 1.\]
This implies that the total probability of observing any value within the support is equal to 1.
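A quick numerical sanity check of Property 2 (again an optional sketch, assuming scipy is available): integrating a known PDF over the entire real line, or over its support, returns 1 up to numerical error.

```python
# Sketch: the total area under a valid PDF is 1.
import numpy as np
from scipy import stats
from scipy.integrate import quad

total_norm, _ = quad(stats.norm.pdf, -np.inf, np.inf)  # integrate over the whole real line
total_expon, _ = quad(stats.expon.pdf, 0, np.inf)      # integrate over the support [0, inf)
print(total_norm, total_expon)  # both ~1.0
```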
Example💡: Validating and Working with a Simple PDF
Suppose the maximum diameter of a potato chip (\(X\)) produced at Factory A, in inches, has the following probability density function:
\[\begin{split}f_X(x) = \begin{cases} 12(x-0.5)^2(1.5-x) & \text{for } 0.5 \leq x \leq 1.5 \\ 0 & \text{elsewhere} \end{cases}\end{split}\]
Identify the support of \(X\), sketch \(f_X(x)\), then verify that it is a legitimate PDF.
Fig. 6.2 Sketch of the PDF of maximum diameters of potato chips
\(\text{supp}(X) = [0.5, 1.5]\). As is evident from the sketch, \(f_X(x)\) takes a non-negative value at every \(x \in \mathbb{R}\). We now verify that the integral of the PDF from \(-\infty\) to \(\infty\) equals 1:
\[\begin{split}\int_{-\infty}^\infty f_X(x)dx &= \int_{-\infty}^{0.5} f_X(x)dx + \int_{0.5}^{1.5}f_X(x)dx + \int_{1.5}^{\infty} f_X(x)dx\\ &= 0 + \int_{0.5}^{1.5} 12(x-0.5)^2(1.5-x) dx + 0\\\end{split}\]Above, we first split the integral into a sum of integrals over three intervals. This step makes it evident that the integral below \(0.5\) and above \(1.5\) is not arbitrarily omitted from the computation; those regions simply contribute zero area. Continuing,
\[\begin{split}&\int_{0.5}^{1.5} 12(x-0.5)^2(1.5-x) dx\\ &= \int_{0.5}^{1.5} (-12x^3 + 30x^2-21x + 4.5)dx\\ &=\frac{-12x^4}{4} + \frac{30x^3}{3} - \frac{21x^2}{2} + 4.5x \Bigg\rvert_{0.5}^{1.5}\\ &=-3x^4 + 10x^3 - 10.5x^2 + 4.5x \Bigg\rvert_{0.5}^{1.5}\\ &= 1.6875-0.6875 = 1.\end{split}\]Therefore, \(f_X\) satisfies both requirements for a valid PDF.
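If you want to double-check the calculus, the brief sketch below (not part of the original example; it assumes scipy is available) integrates the chip-diameter PDF numerically and confirms that the total area is 1.

```python
# Numerical cross-check of the potato-chip example: total area under the PDF.
from scipy.integrate import quad

def f_X(x):
    # PDF from the example: 12(x - 0.5)^2 (1.5 - x) on [0.5, 1.5], 0 elsewhere
    return 12 * (x - 0.5)**2 * (1.5 - x) if 0.5 <= x <= 1.5 else 0.0

total, _ = quad(f_X, 0.5, 1.5)
print(total)  # 1.0 (up to numerical error)
```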
For a quality control procedure, managers of the factory have collected all the potato chips whose maximum diameter is smaller than 1”. What is the probability that a randomly selected potato chip in this pool has a maximum diameter greater than 0.8”?
The first task is to express the goal of the problem as a correct probability statement. Since we know that every chip in this pool has a maximum diameter less than 1,
\[P(X > 0.8 | X < 1) = \frac{P(\{X > 0.8\} \cap \{X < 1 \})}{P(X < 1)}.\]The diameter can be greater than 0.8 and less than 1 only if it is between the two values. We simplify the numerator accordingly:
\[P(X > 0.8 | X < 1) = \frac{P(0.8 < X < 1)}{P(X < 1)}.\]Now integrate for each probability:
Numerator
\[\begin{split}P(0.8 < X < 1) &= \int_{0.8}^{1} f_X(x)dx\\ &=-3x^4 + 10x^3 - 10.5x^2 + 4.5x \Bigg\rvert_{0.8}^{1}\\ &=0.2288\end{split}\]
Denominator
The denominator only has an upper boundary. This is equivalent to having a lower boundary of negative infinity, which gives us
\[\begin{split}P(X < 1) &= \int_{-\infty}^{1} f_X(x)dx = \int_{0.5}^1 f_X(x)dx\\\end{split}\]The last equality holds because the integral of \(f_X(x)\) over any region below \(0.5\) is 0. Continuing,
\[P(X < 1) =-3x^4 + 10x^3 - 10.5x^2 + 4.5x \Bigg\rvert_{0.5}^{1} = 0.3125\]
Finally,
\[P(X > 0.8 | X < 1) = \frac{0.2288}{0.3125} = 0.73216\]
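The same numerical approach verifies the conditional probability (an optional sketch, again assuming scipy and reusing the PDF from the example).

```python
# Numerical cross-check of the conditional probability in the example.
from scipy.integrate import quad

def f_X(x):
    return 12 * (x - 0.5)**2 * (1.5 - x) if 0.5 <= x <= 1.5 else 0.0

numerator, _ = quad(f_X, 0.8, 1.0)    # P(0.8 < X < 1)
denominator, _ = quad(f_X, 0.5, 1.0)  # P(X < 1); the PDF is zero below 0.5
print(numerator)                 # ~0.2288
print(denominator)               # ~0.3125
print(numerator / denominator)   # ~0.7322
```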
6.1.5. Important Types of Problems Involving PDFs
A. Handling Complex Distributions
Real-world continuous distributions aren’t always described by simple functions. Common examples of more complex distributions are:

Fig. 6.3 A legitimate PDF can have many different functional forms
PDFs with one or both ends of the support extending to infinity (left two on Fig. 6.3)
These types usually have an exponential function as part of the PDF. Make sure to review integration of simple exponential functions.
Piecewise-defined PDFs taking different functional forms over different regions of the support (right on Fig. 6.3)
When dealing with piecewise PDFs,
Identify all boundaries where the function forms change.
Break down any integrals spanning multiple regions into a sum of integrals, each spanning only a single region (a short numerical sketch follows this list).
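As a concrete sketch of this split-at-the-boundary strategy, the code below uses a made-up piecewise PDF (flat on \([0, 1]\), with an exponential tail beyond 1; this density is my own illustration, not one from the text) and computes a probability whose interval crosses the boundary.

```python
# Sketch: integrating a piecewise PDF by splitting at the boundary x = 1,
# where the functional form changes.
import math
from scipy.integrate import quad

def f_X(x):
    # Hypothetical piecewise PDF: 0.5 on [0, 1], 0.5*exp(-(x-1)) for x > 1, 0 elsewhere.
    # Total area: 0.5*1 + 0.5*1 = 1, so it is a valid PDF.
    if 0 <= x <= 1:
        return 0.5
    if x > 1:
        return 0.5 * math.exp(-(x - 1))
    return 0.0

piece1, _ = quad(f_X, 0.5, 1.0)  # region where f_X(x) = 0.5
piece2, _ = quad(f_X, 1.0, 2.0)  # region where f_X(x) = 0.5 * exp(-(x - 1))
print(piece1 + piece2)           # P(0.5 < X < 2) ~ 0.566
```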
Example💡: A Piecewise-defined PDF
A professor of STAT 1234 drops by a coffee shop on campus before or after his 50-minute lecture. Suppose the time he enters the coffee shop has a PDF shown below. The time \(x\) is expressed in minutes relative to the start time of his lecture.
What is the probability that the professor enters the coffee shop within 3 minutes of his lecture (that is, no more than 3 minutes before it begins or after it ends)?
We are looking for
\[P(-3 \leq X \leq 53).\]
This interval spans three regions:
part of the first non-trivial region where \(-3 \leq x\leq 0\),
the middle region \(0 \leq x\leq 50\) where the PDF is zero, then
part of the second non-trivial region where \(50\leq x\leq 53\).
We must break up the integral into three pieces accordingly:
\[P(-3 \leq X \leq 53) = \int_{-3}^{0} f_X(x)\,dx + \int_{0}^{50} f_X(x)\,dx + \int_{50}^{53} f_X(x)\,dx,\]
where the middle integral equals zero because the PDF is zero on \([0, 50]\).
B. Completing a Partially Known PDF
Sometimes we encounter functions that have the right shape to be PDFs but don’t integrate to 1. We must fix this by finding an appropriate normalization constant.
Example💡: Finding the normalization constant
Suppose we have a function:
\[\begin{split}g(x) = \begin{cases} x^2 & \text{for } 0 \leq x \leq 2 \\ 0 & \text{elsewhere} \end{cases}\end{split}\]
Q: What is the constant \(k\) which makes \(kg(x)\) a valid PDF?
We need to find \(k\) such that \(\int_{-\infty}^{\infty} kg(x) \, dx = 1\).
First, we compute the integral of \(g(x)\).
\[\int_{-\infty}^{\infty}g(x)dx = \int_{0}^{2} x^2 \, dx = \left[\frac{x^3}{3}\right]_0^2 = \frac{8}{3}.\]To make the total area equal 1, we need
\[k \frac{8}{3} = 1 \implies k = \frac{3}{8}.\]Therefore, the valid PDF is:
\[\begin{split}f_X(x) = \begin{cases} \frac{3}{8}x^2 & \text{for } 0 \leq x \leq 2 \\ 0 & \text{elsewhere} \end{cases}.\end{split}\]
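The normalization step can also be checked numerically (an optional sketch assuming scipy is available): integrate \(g\) over its support, then take the reciprocal of the area to recover \(k\).

```python
# Sketch: finding the normalization constant k numerically.
from scipy.integrate import quad

g = lambda x: x**2          # the un-normalized function on [0, 2]
area, _ = quad(g, 0, 2)
k = 1 / area
print(area)  # ~2.6667 = 8/3
print(k)     # 0.375   = 3/8
```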
6.1.6. Bringing It All Together
The transition from discrete to continuous random variables represents a fundamental shift in thinking. Instead of asking “what’s the probability of exactly this value?” we ask “what’s the probability of values in this range?” This change enables us to model the rich variety of measurement data we encounter.
Key Takeaways 📝
Continuous random variables model measurable quantities that can take any value within an interval, unlike discrete variables that count distinct outcomes.
Probability density functions (PDFs) describe how probability is distributed across the continuous range, with higher values indicating more likely regions.
Probabilities are areas under the PDF, computed using integration: \(P(a < X < b) = \int_a^b f_X(x) \, dx\).
Individual points have probability zero for continuous variables—only intervals have positive probability.
Valid PDFs must be non-negative everywhere and integrate to 1 over their entire support.
In the next section, we’ll explore how to calculate expected values and variances for continuous random variables, extending the concepts we mastered in the discrete case.
Exercises
Conceptual Understanding: Explain why \(P(X = 5.5) = 0\) for a continuous random variable, even when \(X = 5.5\) is a perfectly reasonable outcome.
Basic Probability Calculation: For the PDF \(f_X(x) = 3x^2\) for \(0 \leq x \leq 1\), 0 elsewhere:
Verify this is a legitimate PDF.
Find \(P(X < 0.5)\).
Find \(P(0.2 < X < 0.8)\).
Find \(P(X = 0.3)\).
Piecewise PDF: For the following PDF,
\[\begin{split}f_X(x) = \begin{cases} kx & \text{for } 0 \leq x \leq 2 \\ k(4-x) & \text{for } 2 < x \leq 4 \\ 0 & \text{elsewhere} \end{cases}\end{split}\]
Find the value of \(k\) that makes this a valid PDF.
Sketch the PDF.
Find \(P(1 < X < 3)\).