Exponential Distribution
=======================================
We now encounter a distribution with new patterns of behavior: the exponential distribution.
Where uniform distributions model equal likelihood, exponential distributions capture the essence of
**decay and waiting times**.
.. admonition:: Road Map 🧭
:class: important
• Understand the **exponential distribution** as a model for waiting times between events.
• Distinguish between **Poisson** (counting events) and **exponential** (timing between events) distributions.
• Master the **PDF and CDF** formulas and their mathematical relationship.
• Explore the **memoryless property** of exponential random variables.
• Learn about **two parameterizations**: rate parameter :math:`\lambda` and mean parameter :math:`\mu`.
From Event Counting to Waiting Times
---------------------------------------
Consider a hospital emergency room where patients arrive according to a Poisson process
with an average rate of 3 patients per hour. The Poisson distribution tells us the probabilities
of seeing 0, 1, 2, or more patients in any given hour.
But what if we want to know: "If a patient just arrived, how long until the next patient arrives?"
This waiting time follows an exponential distribution. The connection is profound:
if events occur according to a Poisson process with rate :math:`\lambda`, then
the **time between consecutive events follows an exponential distribution** with the same rate parameter :math:`\lambda`.
Why "Exponential"?
~~~~~~~~~~~~~~~~~~~~~
The distribution gets its name from the exponential decay pattern in its probability density function.
It assigns highest probability density to small waiting times,
with probability density decreasing exponentially as waiting times increase.
This reflects an intuitive property of many real-world processes—short waiting times are much more
likely than long waiting times, but extremely long waiting times, while rare, remain possible.
Mathematical Definition and Properties
-----------------------------------------
The Exponential PDF
~~~~~~~~~~~~~~~~~~~~
A continuous random variable :math:`X` follows an exponential distribution if its probability density function is:
.. math::
f_X(x) = \begin{cases}
\lambda e^{-\lambda x} & \text{for } x \geq 0 \\
0 & \text{for } x < 0
\end{cases}
We write :math:`X \sim \text{Exp}(\lambda)` or :math:`X \sim \text{Exponential}(\lambda)`.
**Understanding the Components**
- **Rate parameter** :math:`\lambda`:
Just like in Poisson distributions, :math:`\lambda` represents the average number of events per unit time
and is therefore always positive. Higher values of :math:`\lambda` indicate more frequent events which lead to
shorter expected waiting times.
- **Exponential decay** :math:`e^{-\lambda x}` creates the characteristic decreasing curve.
- :math:`\text{supp}(X) =[0, \infty)` because waiting times cannot be negative.
- The PDF starts at its maximum value :math:`\lambda` when :math:`x = 0` and **decreases exponentially**,
approaching but **never reaching zero** as :math:`x \to \infty`.
.. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/exponential-pdf.png
:alt: Exponential PDF showing decay pattern
:align: center
:figwidth: 40%
An exponential PDF
The Expoenential CDF
~~~~~~~~~~~~~~~~~~~~~~~
The cumulative distribution function requires integrating the PDF. For any :math:`x \geq 0`,
.. math::
F_X(x) &= P(X \leq x) = \int_0^x \lambda e^{-\lambda s} \, ds
= \lambda \left[ \frac{e^{-\lambda s}}{-\lambda} \right]_0^x\\
&= \lambda \left[ \frac{e^{-\lambda x} - e^0}{-\lambda} \right]
= \lambda \left[ \frac{e^{-\lambda x} - 1}{-\lambda} \right] = 1 - e^{-\lambda x}
Therefore, the exponential CDF is:
.. math::
F_X(x) = \begin{cases}
1 - e^{-\lambda x} & \text{for } x \geq 0 \\
0 & \text{for } x < 0
\end{cases}
.. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/exponential-cdf.png
:alt: Exponential CDF
:align: center
:figwidth: 40%
An exponential CDF
**A Notable Property**
The exponential CDF approaches 1 as :math:`x \to \infty`, but technically only equals 1 in the limit.
This occurs because the PDF never actually reaches zero on the right tail—it only approaches zero asymptotically.
Expected Value and Variance
------------------------------
Expected Value
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To find :math:`E[X]`, we compute:
.. math::
E[X] = \int_0^{\infty} x \cdot \lambda e^{-\lambda x} \, dx
This requires integration by parts with :math:`u = x` and :math:`dv = \lambda e^{-\lambda x} dx`:
- :math:`du = dx`
- :math:`v = -e^{-\lambda x}`
.. math::
E[X] = \left[ x \cdot (-e^{-\lambda x}) \right]_0^{\infty} - \int_0^{\infty} (-e^{-\lambda x}) \, dx
The boundary term equals zero (since :math:`xe^{-\lambda x} \to 0` as :math:`x \to \infty` and equals 0 at :math:`x = 0`), leaving:
.. math::
E[X] = \int_0^{\infty} e^{-\lambda x} \, dx = \left[ \frac{e^{-\lambda x}}{-\lambda} \right]_0^{\infty} = \frac{1}{\lambda}
Variance
~~~~~~~~~~~~~~~~~~~~~~~~~
For variance, we use :math:`\text{Var}(X) = E[X^2] - (E[X])^2`. Finding :math:`E[X^2]` requires another integration by parts,
which we leave as an independent exercise.
.. math::
E[X^2] = \int_0^{\infty} x^2 \cdot \lambda e^{-\lambda x} \, dx = \frac{2}{\lambda^2}
Therefore,
.. math::
\text{Var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
Summary
~~~~~~~~~~
For :math:`X \sim \text{Exp}(\lambda)`,
.. math::
E[X] = \frac{1}{\lambda} \quad \text{and} \quad \text{Var}(X) = \frac{1}{\lambda^2}.
The standard deviation is :math:`\sigma_X = \frac{1}{\lambda}`.
Interpretation
~~~~~~~~~~~~~~~
If events occur at rate :math:`\lambda` per unit time, we expect to wait :math:`\frac{1}{\lambda}`
time units on average until the next event. Higher rates mean shorter waiting times, and the variance
decreases as the rate increases, indicating more predictable waiting times.
Two Parameterizations: Rate vs. Mean
-------------------------------------
The exponential distribution can be parameterized in two equivalent ways, each emphasizing
different aspects of the underlying process.
Rate Parameterization
~~~~~~~~~~~~~~~~~~~~~~~~~
The rate parameterization uses :math:`\lambda > 0` as we've seen:
.. math::
f_X(x) = \lambda e^{-\lambda x}, \quad x \geq 0
Here, :math:`\lambda` represents **the average number of events per unit time**.
This parameterization is natural when we think about process rates: phone calls per minute,
component failures per year, or customer arrivals per hour.
Mean Parameterization
~~~~~~~~~~~~~~~~~~~~~~~~~~
The mean parameterization uses :math:`\mu = \frac{1}{\lambda}` as the parameter:
.. math::
f_X(x) = \frac{1}{\mu} e^{-x/\mu}, \quad x \geq 0
Here, :math:`\mu` represents the **average waiting time between events**.
This parameterization is natural when we think about typical waiting times:
average time between phone calls, mean component lifetime,
or expected time between customer arrivals.
Converting Between Parameterizations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The relationship :math:`\mu = \frac{1}{\lambda}` allows easy conversion:
- If events occur at rate :math:`\lambda = 3` per hour, the mean waiting time
is :math:`\mu = \frac{1}{3}` hour (20 minutes).
- If the mean waiting time is :math:`\mu = 2` minutes, the rate is
:math:`\lambda = \frac{1}{2}` events per minute (30 events per hour).
Choosing the Right Parameterization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use the parameterization that matches your problem's natural description.
- **Rate parameterization** when the problem gives or asks for rates ("failures per year", "arrivals per hour")
- **Mean parameterization** when the problem gives or asks for typical waiting times ("mean time to failure", "average service time")
The Memoryless Property
----------------------------------
For any exponential random variable :math:`X` and any positive values :math:`s` and :math:`t`:
.. math::
P(X > s + t \mid X > s) = P(X > t)
The memoryless property implies that if we've already waited time :math:`s` without an event occurring,
the probability of waiting an additional time :math:`t` is **the same as if we were starting fresh**.
The process **"forgets" how long we've already waited**—past waiting time provides no information
about future waiting time.
Proving the Memoryless Property
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Using the definition of conditional probability and the exponential CDF,
.. math::
P(X > s + t \mid X > s) = \frac{P(X > s + t \cap X > s)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)}
Using the CDF :math:`F_X(x) = 1 - e^{-\lambda x}`, we get :math:`P(X > x) = e^{-\lambda x}`,
.. math::
P(X > s + t \mid X > s) = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}}
= \frac{e^{-\lambda s} \cdot e^{-\lambda t}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)✔
.. admonition:: Example💡: Exponential Distribution
:class: note
Customers arrive at a service desk with exponentially distributed inter-arrival times averaging 4 minutes.
a) What's the probability that no customer arrives for the next 2 minutes?
Let :math:`Y` denote the inter-arrival wait time. :math:`Y \sim \text{Exp}(\mu=4)`. The parameter
is provided as an average time, so we use :math:`\mu`.
.. math:: P(Y > 2) = 1 - F_Y(y) = 1 - (1 - e^{-2/4}) = e^{-2/4} = 0.6065
b) If no customer has arrived in the last 6 minutes, what's the probability one arrives in the next minute?
Begin by setting up the probability statement. We are looking for :math:`P(Y < 6 + 1 | Y > 6)`.
Using memoryless property of exponential :math:`Y`, this is equal to :math:`P(Y < 1)`.
.. math:: P(Y < 1) = 1 - e^{-1/4} = 0.2212
c) Find the time :math:`t` such that 90% of inter-arrival times are less than :math:`t`.
We need to find :math:`t` such that :math:`P(Y \leq t) = 0.9`. Replacing the left-hand side with
the specific expression and solving for :math:`t`,
.. math:: 1 - e^{-t/4} = 0.9 \implies t = -4\cdot\text{ln}(0.1)=9.21
Properties Summary
-----------------------------------
For quick reference, here are the essential properties of the exponential distribution:
**Notation**: :math:`X \sim \text{Exp}(\lambda)` or :math:`X \sim \text{Exponential}(\lambda)`
**Parameter**: :math:`\lambda > 0` (rate parameter) or :math:`\mu = \frac{1}{\lambda} > 0` (mean parameter)
**Support**: :math:`[0, \infty)`
**PDF**: :math:`f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}`
**CDF**: :math:`F_X(x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}`
**Expected Value**: :math:`E[X] = \frac{1}{\lambda}`
**Variance**: :math:`\text{Var}(X) = \frac{1}{\lambda^2}`
**Standard Deviation**: :math:`\sigma_X = \frac{1}{\lambda}`
**Memoryless Property**: :math:`P(X > s + t \mid X > s) = P(X > t)` for all :math:`s, t > 0`
Distinguishing Between Poisson and Exponential Random Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This relationship often causes confusion, so let's go through the differences
explicitly:
.. flat-table::
:header-rows: 1
:width: 100%
:align: center
* -
- Poisson Distribution
- Exponential Distribution
* - What does it describe?
- # events per unit
- Time until the next event
* - Discrete or continuous?
- Discrete with support :math:`\{0, 1, 2, \cdots\}`
- Continuous with support :math:`[0, \infty)`
* - Parameter
- :cspan:`1` :math:`\lambda` = average number of events per time unit
* - Typical question
- What is the probability that **3 customers** arrive in the next hour?
- What is the probability that no customer arrives for the next **40 minutes**?
.. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/poisson-exp-diagram.png
:alt: Diagram of Poisson process with Poisson and exponential rvs labeled
:align: center
:figwidth: 30%
Bringing It All Together
--------------------------
.. admonition:: Key Takeaways 📝
:class: important
1. The **exponential distribution** models waiting times between events in processes where events occur
at a constant average rate.
2. **Connection to Poisson**: If events follow a Poisson process with rate :math:`\lambda`, inter-arrival
times follow :math:`\text{Exp}(\lambda)`.
3. The **PDF** :math:`f_X(x) = \lambda e^{-\lambda x}` creates an exponential decay pattern.
4. The **CDF** is :math:`F_X(x) = 1 - e^{-\lambda x}`.
5. **Two parameterizations** exist: rate :math:`\lambda` (events per time) and
mean :math:`\mu = \frac{1}{\lambda}` (average waiting time).
6. The **memoryless property** :math:`P(X > s + t \mid X > s) = P(X > t)` uniquely
characterizes exponential distributions among continuous distributions.
Exercises
~~~~~~~~~~~~~~
1. **Basic Properties**: For :math:`X \sim \text{Exp}(3)`,
a) Find :math:`P(X > 2)`.
b) Calculate :math:`P(1 \leq X \leq 2)`.
c) Find the mean and standard deviation.
d) Determine the 75th percentile.
2. **Memoryless Property**: A component has exponentially distributed lifetime with mean 8 years.
a) If the component has already operated for 5 years, what's the probability it operates for at least 3 more years?
b) Compare this to the probability that a new component operates for at least 3 years.
c) Explain why these probabilities are equal.
3. **Parameterization Practice**: Convert between parameterizations:
a) If :math:`X \sim \text{Exp}(\lambda = 0.25)`, what is the mean parameter :math:`\mu`?
b) If the mean waiting time is 15 minutes, what is the rate parameter per hour?
c) Express the PDF using both parameterizations for part (b).
4. **Connection to Poisson**: If phone calls arrive according to a Poisson process with rate 2 calls per hour,
a) What distribution models the time between consecutive calls?
b) What's the probability of waiting more than 45 minutes between calls?
c) If a call just ended, what's the expected time until the next call?
5. **Multiple Components**: A system has three independent components with exponential lifetimes:
rates 0.1, 0.2, and 0.3 failures per year.
a) What's the distribution of the time until the system fails (first component failure)?
b) What's the probability the system survives 2 years?
c) What's the expected system lifetime?
..
Working with Exponential Distributions in R
-----------------------------------------------
R provides comprehensive functions for working with exponential distributions, making calculations straightforward for both parameterizations.
**The Four Essential R Functions**
- **rexp()**: Generates random samples from an exponential distribution
- **dexp()**: Calculates the probability density function (PDF values)
- **pexp()**: Calculates the cumulative distribution function (CDF values)
- **qexp()**: Finds quantiles (percentiles)
**Basic Function Usage**
.. code-block:: r
# Generate random samples from Exp(rate = 2)
set.seed(123)
random_times <- rexp(n = 10, rate = 2)
print(round(random_times, 3))
# Calculate density at x = 1.5 for Exp(rate = 2)
density_at_1.5 <- dexp(x = 1.5, rate = 2)
print(paste("Density at x = 1.5:", round(density_at_1.5, 4)))
# Calculate P(X ≤ 2) for Exp(rate = 2)
prob_less_than_2 <- pexp(q = 2, rate = 2)
print(paste("P(X ≤ 2) =", round(prob_less_than_2, 4)))
# Find the median (50th percentile)
median_value <- qexp(p = 0.5, rate = 2)
print(paste("Median:", round(median_value, 3)))
**Probability Calculations**
.. code-block:: r
# For X ~ Exp(rate = 0.5), calculate various probabilities
# P(X > 3) using lower.tail = FALSE
prob_greater_3 <- pexp(3, rate = 0.5, lower.tail = FALSE)
print(paste("P(X > 3) =", round(prob_greater_3, 4)))
# P(1 ≤ X ≤ 4) = P(X ≤ 4) - P(X ≤ 1)
prob_between <- pexp(4, rate = 0.5) - pexp(1, rate = 0.5)
print(paste("P(1 ≤ X ≤ 4) =", round(prob_between, 4)))
# Demonstrate memoryless property
# P(X > 5 | X > 2) should equal P(X > 3)
conditional_prob <- pexp(5, rate = 0.5, lower.tail = FALSE) /
pexp(2, rate = 0.5, lower.tail = FALSE)
direct_prob <- pexp(3, rate = 0.5, lower.tail = FALSE)
print(paste("P(X > 5 | X > 2) =", round(conditional_prob, 6)))
print(paste("P(X > 3) =", round(direct_prob, 6)))
**Visualizing Exponential Distributions**
.. code-block:: r
library(ggplot2)
# Compare different rate parameters
x_vals <- seq(0, 5, by = 0.1)
# Create data for plotting
plot_data <- data.frame(
x = rep(x_vals, 3),
density = c(dexp(x_vals, rate = 0.5),
dexp(x_vals, rate = 1),
dexp(x_vals, rate = 2)),
rate = rep(c("λ = 0.5", "λ = 1", "λ = 2"), each = length(x_vals))
)
# Plot PDFs
ggplot(plot_data, aes(x = x, y = density, color = rate)) +
geom_line(size = 1.2) +
labs(title = "Exponential PDFs with Different Rates",
x = "x", y = "Density", color = "Rate Parameter") +
theme_minimal()
# Plot CDFs
cdf_data <- data.frame(
x = rep(x_vals, 3),
cdf = c(pexp(x_vals, rate = 0.5),
pexp(x_vals, rate = 1),
pexp(x_vals, rate = 2)),
rate = rep(c("λ = 0.5", "λ = 1", "λ = 2"), each = length(x_vals))
)
ggplot(cdf_data, aes(x = x, y = cdf, color = rate)) +
geom_line(size = 1.2) +
labs(title = "Exponential CDFs with Different Rates",
x = "x", y = "Cumulative Probability", color = "Rate Parameter") +
theme_minimal()
**Applied Example: Equipment Failure Analysis**
.. code-block:: r
# Machine breakdowns follow Exp(rate = 1/90) per day
# Mean time between failures = 90 days
failure_rate <- 1/90 # failures per day
# Calculate key probabilities
prob_60_days <- pexp(60, rate = failure_rate, lower.tail = FALSE)
print(paste("P(machine runs ≥ 60 days) =", round(prob_60_days, 4)))
# Memoryless property: already run 30 days, what's P(run 20 more)?
prob_additional_20 <- pexp(20, rate = failure_rate, lower.tail = FALSE)
print(paste("P(run additional 20 days | already 30 days) =", round(prob_additional_20, 4)))
# Find 90th percentile (10% of machines fail after this time)
percentile_90 <- qexp(0.9, rate = failure_rate)
print(paste("90th percentile:", round(percentile_90, 1), "days"))
# Simulate maintenance schedules
set.seed(456)
failure_times <- rexp(n = 100, rate = failure_rate)
# Analyze simulation results
mean_failure_time <- mean(failure_times)
theoretical_mean <- 1/failure_rate
print(paste("Simulated mean failure time:", round(mean_failure_time, 1), "days"))
print(paste("Theoretical mean:", theoretical_mean, "days"))
# Visualize failure time distribution
failure_data <- data.frame(days = failure_times)
ggplot(failure_data, aes(x = days)) +
geom_histogram(bins = 20, fill = "lightblue", alpha = 0.7, aes(y = ..density..)) +
geom_line(data = data.frame(x = seq(0, max(failure_times), length = 100)),
aes(x = x, y = dexp(x, rate = failure_rate)),
color = "red", size = 1.2) +
labs(title = "Machine Failure Times vs Theoretical Distribution",
x = "Days Until Failure", y = "Density") +
theme_minimal()