6.6. Exponential Distribution

We now encounter a distribution with new patterns of behavior: the exponential distribution. Where uniform distributions model equal likelihood, exponential distributions capture the essence of decay and waiting times.

Road Map 🧭

  • Understand the exponential distribution as a model for waiting times between events.

  • Distinguish between Poisson (counting events) and exponential (timing between events) distributions.

  • Master the PDF and CDF formulas and their mathematical relationship.

  • Explore the memoryless property of exponential random variables.

  • Learn about two parameterizations: rate parameter \(\lambda\) and mean parameter \(\mu\).

6.6.1. From Event Counting to Waiting Times

Consider a hospital emergency room where patients arrive according to a Poisson process with an average rate of 3 patients per hour. The Poisson distribution tells us the probabilities of seeing 0, 1, 2, or more patients in any given hour. But what if we want to know: “If a patient just arrived, how long until the next patient arrives?”

This waiting time follows an exponential distribution. The connection is profound: if events occur according to a Poisson process with rate \(\lambda\), then the time between consecutive events follows an exponential distribution with the same rate parameter \(\lambda\).
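To see this connection concretely, here is a minimal simulation sketch (not part of the original example beyond the assumed rate \(\lambda = 3\) per hour): inter-arrival gaps drawn from an exponential distribution are accumulated into arrival times, and the resulting hourly counts average out to \(\lambda\).

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 3.0                                             # assumed rate: 3 arrivals per hour

# Draw exponential inter-arrival gaps; numpy parameterizes by the mean 1/lambda.
gaps = rng.exponential(scale=1 / lam, size=100_000)
arrival_times = np.cumsum(gaps)                       # running clock of arrival times

# Count arrivals in each 1-hour window; these counts behave like Poisson(3).
hours = int(arrival_times[-1])
counts = np.histogram(arrival_times, bins=np.arange(hours + 1))[0]

print("mean gap (expect 1/3 hour):", gaps.mean())
print("mean count per hour (expect 3):", counts.mean())
```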

Why “Exponential”?

The distribution gets its name from the exponential decay pattern in its probability density function. It assigns the highest probability density to small waiting times, with the density decreasing exponentially as waiting times increase.

This reflects an intuitive property of many real-world processes—short waiting times are much more likely than long waiting times, but extremely long waiting times, while rare, remain possible.

6.6.2. Mathematical Definition and Properties

The Exponential PDF

A continuous random variable \(X\) follows an exponential distribution if its probability density function is:

\[\begin{split}f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}\end{split}\]

We write \(X \sim \text{Exp}(\lambda)\) or \(X \sim \text{Exponential}(\lambda)\).

Understanding the Components

  • Rate parameter \(\lambda\): Just as in the Poisson distribution, \(\lambda\) represents the average number of events per unit time and is therefore always positive. Higher values of \(\lambda\) mean more frequent events and therefore shorter expected waiting times.

  • Exponential decay \(e^{-\lambda x}\) creates the characteristic decreasing curve.

  • \(\text{supp}(X) = [0, \infty)\) because waiting times cannot be negative.

  • The PDF starts at its maximum value \(\lambda\) when \(x = 0\) and decreases exponentially, approaching but never reaching zero as \(x \to \infty\).

Fig. 6.22 An exponential PDF, showing the characteristic decay pattern
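The following short sketch evaluates this PDF numerically, assuming an illustrative rate \(\lambda = 1.5\); it also checks the formula against scipy.stats.expon, which parameterizes the distribution by the mean, i.e. scale \(= 1/\lambda\).

```python
import numpy as np
from scipy.stats import expon

lam = 1.5                                     # illustrative rate; any lambda > 0 works
x = np.array([0.0, 0.5, 1.0, 2.0])

pdf_formula = lam * np.exp(-lam * x)          # f_X(x) = lambda * e^{-lambda x}
pdf_scipy = expon.pdf(x, scale=1 / lam)       # scipy's expon uses scale = 1/lambda

print(pdf_formula)                            # starts at lambda when x = 0, then decays
print(np.allclose(pdf_formula, pdf_scipy))    # True
```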

The Exponential CDF

The cumulative distribution function requires integrating the PDF. For any \(x \geq 0\),

\[\begin{split}F_X(x) &= P(X \leq x) = \int_0^x \lambda e^{-\lambda s} \, ds = \lambda \left[ \frac{e^{-\lambda s}}{-\lambda} \right]_0^x\\ &= \lambda \left[ \frac{e^{-\lambda x} - e^0}{-\lambda} \right] = \lambda \left[ \frac{e^{-\lambda x} - 1}{-\lambda} \right] = 1 - e^{-\lambda x}\end{split}\]

Therefore, the exponential CDF is:

\[\begin{split}F_X(x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}\end{split}\]

Fig. 6.23 An exponential CDF

A Notable Property

The exponential CDF approaches 1 as \(x \to \infty\), but technically only equals 1 in the limit. This occurs because the PDF never actually reaches zero on the right tail—it only approaches zero asymptotically.
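As a quick numerical check of the derivation above, the sketch below (with an illustrative \(\lambda = 1.5\)) integrates the PDF from 0 to \(x\) and compares the result to the closed-form CDF.

```python
import numpy as np
from scipy.integrate import quad

lam, x = 1.5, 2.0                             # illustrative rate and evaluation point

cdf_closed = 1 - np.exp(-lam * x)             # F_X(x) = 1 - e^{-lambda x}
cdf_numeric, _ = quad(lambda s: lam * np.exp(-lam * s), 0, x)

print(cdf_closed, cdf_numeric)                # agree to numerical precision
```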

6.6.3. Expected Value and Variance

Expected Value

To find \(E[X]\), we compute:

\[E[X] = \int_0^{\infty} x \cdot \lambda e^{-\lambda x} \, dx\]

This requires integration by parts with \(u = x\) and \(dv = \lambda e^{-\lambda x} dx\):

  • \(du = dx\)

  • \(v = -e^{-\lambda x}\)

\[E[X] = \left[ x \cdot (-e^{-\lambda x}) \right]_0^{\infty} - \int_0^{\infty} (-e^{-\lambda x}) \, dx\]

The boundary term equals zero (since \(xe^{-\lambda x} \to 0\) as \(x \to \infty\) and equals 0 at \(x = 0\)), leaving:

\[E[X] = \int_0^{\infty} e^{-\lambda x} \, dx = \left[ \frac{e^{-\lambda x}}{-\lambda} \right]_0^{\infty} = \frac{1}{\lambda}\]
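If you want to confirm this integral without redoing the integration by parts, a short numerical check (with an illustrative \(\lambda = 1.5\)) looks like this:

```python
import numpy as np
from scipy.integrate import quad

lam = 1.5

# E[X] = integral over [0, infinity) of x * lambda * e^{-lambda x} dx
mean_numeric, _ = quad(lambda x: x * lam * np.exp(-lam * x), 0, np.inf)

print(mean_numeric, 1 / lam)                  # both approximately 0.6667
```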

Variance

For variance, we use \(\text{Var}(X) = E[X^2] - (E[X])^2\). Finding \(E[X^2]\) requires another integration by parts, which we leave as an independent exercise.

\[E[X^2] = \int_0^{\infty} x^2 \cdot \lambda e^{-\lambda x} \, dx = \frac{2}{\lambda^2}\]

Therefore,

\[\text{Var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}\]

Summary

For \(X \sim \text{Exp}(\lambda)\),

\[E[X] = \frac{1}{\lambda} \quad \text{and} \quad \text{Var}(X) = \frac{1}{\lambda^2}.\]

The standard deviation is \(\sigma_X = \frac{1}{\lambda}\).

Interpretation

If events occur at rate \(\lambda\) per unit time, we expect to wait \(\frac{1}{\lambda}\) time units on average until the next event. Higher rates mean shorter waiting times, and the variance decreases as the rate increases, indicating more predictable waiting times.
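The sketch below verifies these summary quantities with scipy.stats.expon for an assumed rate of \(\lambda = 3\) per hour, and also checks by simulation the value \(E[X^2] = 2/\lambda^2\) that was left as an exercise.

```python
import numpy as np
from scipy.stats import expon

lam = 3.0                                # e.g., 3 arrivals per hour
X = expon(scale=1 / lam)                 # scipy's exponential with mean 1/lambda

print(X.mean(), 1 / lam)                 # 0.333... hours, i.e. 20 minutes
print(X.var(), 1 / lam**2)               # 0.111...
print(X.std(), 1 / lam)                  # standard deviation equals the mean, 1/lambda

# Simulation check of E[X^2] = 2 / lambda^2
rng = np.random.default_rng(1)
sample = X.rvs(size=200_000, random_state=rng)
print((sample**2).mean(), 2 / lam**2)
```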

6.6.4. Two Parameterizations: Rate vs. Mean

The exponential distribution can be parameterized in two equivalent ways, each emphasizing different aspects of the underlying process.

Rate Parameterization

The rate parameterization uses \(\lambda > 0\) as we’ve seen:

\[f_X(x) = \lambda e^{-\lambda x}, \quad x \geq 0\]

Here, \(\lambda\) represents the average number of events per unit time. This parameterization is natural when we think about process rates: phone calls per minute, component failures per year, or customer arrivals per hour.

Mean Parameterization

The mean parameterization uses \(\mu = \frac{1}{\lambda}\) as the parameter:

\[f_X(x) = \frac{1}{\mu} e^{-x/\mu}, \quad x \geq 0\]

Here, \(\mu\) represents the average waiting time between events. This parameterization is natural when we think about typical waiting times: average time between phone calls, mean component lifetime, or expected time between customer arrivals.

Converting Between Parameterizations

The relationship \(\mu = \frac{1}{\lambda}\) allows easy conversion:

  • If events occur at rate \(\lambda = 3\) per hour, the mean waiting time is \(\mu = \frac{1}{3}\) hour (20 minutes).

  • If the mean waiting time is \(\mu = 2\) minutes, the rate is \(\lambda = \frac{1}{2}\) events per minute (30 events per hour).
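A small sketch of the conversion, using the assumed rate \(\lambda = 3\) per hour: both parameterizations, and scipy's scale-based version, give the same density.

```python
import numpy as np
from scipy.stats import expon

lam = 3.0                                # rate: 3 arrivals per hour
mu = 1 / lam                             # mean: 1/3 hour, i.e. 20 minutes
x = 0.5                                  # evaluate the density at half an hour

pdf_rate = lam * np.exp(-lam * x)        # rate form: lambda * e^{-lambda x}
pdf_mean = (1 / mu) * np.exp(-x / mu)    # mean form: (1/mu) * e^{-x/mu}
pdf_scipy = expon.pdf(x, scale=mu)       # scipy always takes the mean as `scale`

print(pdf_rate, pdf_mean, pdf_scipy)     # all three agree
```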

Choosing the Right Parameterization

Use the parameterization that matches your problem’s natural description.

  • Rate parameterization when the problem gives or asks for rates (“failures per year”, “arrivals per hour”)

  • Mean parameterization when the problem gives or asks for typical waiting times (“mean time to failure”, “average service time”)

6.6.5. The Memoryless Property

For any exponential random variable \(X\) and any positive values \(s\) and \(t\):

\[P(X > s + t \mid X > s) = P(X > t)\]

The memoryless property implies that if we’ve already waited time \(s\) without an event occurring, the probability of waiting an additional time \(t\) is the same as if we were starting fresh. The process “forgets” how long we’ve already waited—past waiting time provides no information about future waiting time.

Proving the Memoryless Property

Using the definition of conditional probability and the exponential CDF,

\[P(X > s + t \mid X > s) = \frac{P(X > s + t \cap X > s)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)}\]

Using the CDF \(F_X(x) = 1 - e^{-\lambda x}\), we get \(P(X > x) = e^{-\lambda x}\),

\[P(X > s + t \mid X > s) = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = \frac{e^{-\lambda s} \cdot e^{-\lambda t}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)✔\]
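The proof can also be checked numerically. The sketch below (illustrative values \(\lambda = 1.5\), \(s = 2\), \(t = 1\)) computes both sides of the memoryless identity exactly via survival probabilities and then again by simulation.

```python
import numpy as np
from scipy.stats import expon

lam, s, t = 1.5, 2.0, 1.0                # illustrative values
X = expon(scale=1 / lam)

# Exact: P(X > s + t | X > s) versus P(X > t)
print(X.sf(s + t) / X.sf(s), X.sf(t))    # both equal e^{-lambda t}

# Simulation: condition on having already waited s, then wait t more
rng = np.random.default_rng(2)
draws = X.rvs(size=500_000, random_state=rng)
survivors = draws[draws > s]
print((survivors > s + t).mean(), (draws > t).mean())
```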

Example💡: Exponential Distribution

Customers arrive at a service desk with exponentially distributed inter-arrival times averaging 4 minutes.

  1. What’s the probability that no customer arrives for the next 2 minutes?

    Let \(Y\) denote the inter-arrival wait time. \(Y \sim \text{Exp}(\mu=4)\). The parameter is provided as an average time, so we use \(\mu\).

    \[P(Y > 2) = 1 - F_Y(2) = 1 - (1 - e^{-2/4}) = e^{-2/4} = 0.6065\]
  2. If no customer has arrived in the last 6 minutes, what’s the probability one arrives in the next minute?

    Begin by setting up the probability statement. We are looking for \(P(Y < 6 + 1 \mid Y > 6)\). By the memoryless property of the exponential distribution, this equals \(P(Y < 1)\).

    \[P(Y < 1) = 1 - e^{-1/4} = 0.2212\]
  3. Find the time \(t\) such that 90% of inter-arrival times are less than \(t\).

    We need to find \(t\) such that \(P(Y \leq t) = 0.9\). Replacing the left-hand side with the specific expression and solving for \(t\),

    \[1 - e^{-t/4} = 0.9 \implies e^{-t/4} = 0.1 \implies t = -4\ln(0.1) = 9.21 \text{ minutes}\]
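All three parts of this example can be reproduced with scipy.stats.expon, using scale = 4 minutes for the mean parameter; the sketch below is only a numerical check of the answers above.

```python
import numpy as np
from scipy.stats import expon

Y = expon(scale=4)                       # mean inter-arrival time mu = 4 minutes

# 1. P(Y > 2): no arrival in the next 2 minutes
print(Y.sf(2), np.exp(-2 / 4))           # 0.6065...

# 2. P(Y < 7 | Y > 6) = P(Y < 1) by the memoryless property
print(Y.cdf(1))                          # 0.2212...

# 3. 90th percentile: t with P(Y <= t) = 0.9
print(Y.ppf(0.9), -4 * np.log(0.1))      # 9.21... minutes
```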

6.6.6. Properties Summary

For quick reference, here are the essential properties of the exponential distribution:

Notation: \(X \sim \text{Exp}(\lambda)\) or \(X \sim \text{Exponential}(\lambda)\)

Parameter: \(\lambda > 0\) (rate parameter) or \(\mu = \frac{1}{\lambda} > 0\) (mean parameter)

Support: \([0, \infty)\)

PDF: \(f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}\)

CDF: \(F_X(x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}\)

Expected Value: \(E[X] = \frac{1}{\lambda}\)

Variance: \(\text{Var}(X) = \frac{1}{\lambda^2}\)

Standard Deviation: \(\sigma_X = \frac{1}{\lambda}\)

Memoryless Property: \(P(X > s + t \mid X > s) = P(X > t)\) for all \(s, t > 0\)

Distinguishing Between Poisson and Exponential Random Variables

This relationship often causes confusion, so let’s go through the differences explicitly:

  • What does it describe? Poisson: the number of events per unit of time. Exponential: the time until the next event.

  • Discrete or continuous? Poisson: discrete, with support \(\{0, 1, 2, \cdots\}\). Exponential: continuous, with support \([0, \infty)\).

  • Parameter: both use \(\lambda\), the average number of events per unit of time.

  • Typical question: Poisson: “What is the probability that 3 customers arrive in the next hour?” Exponential: “What is the probability that no customer arrives for the next 40 minutes?”

Figure: a Poisson process with the Poisson (event count) and exponential (inter-arrival time) random variables labeled
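To make the contrast concrete, the sketch below answers both typical questions side by side, assuming a rate of 3 customers per hour (a value chosen here for illustration, not stated in the table).

```python
from scipy.stats import poisson, expon

lam = 3.0                                   # assumed: 3 customers per hour

# Poisson question: P(exactly 3 customers arrive in the next hour)
print(poisson.pmf(3, mu=lam))               # counts events in a fixed window

# Exponential question: P(no customer for the next 40 minutes) = P(X > 2/3 hour)
print(expon.sf(40 / 60, scale=1 / lam))     # waiting time until the next event
```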

6.6.7. Bringing It All Together

Key Takeaways 📝

  1. The exponential distribution models waiting times between events in processes where events occur at a constant average rate.

  2. Connection to Poisson: If events follow a Poisson process with rate \(\lambda\), inter-arrival times follow \(\text{Exp}(\lambda)\).

  3. The PDF \(f_X(x) = \lambda e^{-\lambda x}\) creates an exponential decay pattern.

  4. The CDF is \(F_X(x) = 1 - e^{-\lambda x}\).

  5. Two parameterizations exist: rate \(\lambda\) (events per time) and mean \(\mu = \frac{1}{\lambda}\) (average waiting time).

  6. The memoryless property \(P(X > s + t \mid X > s) = P(X > t)\) uniquely characterizes exponential distributions among continuous distributions.

Exercises

  1. Basic Properties: For \(X \sim \text{Exp}(3)\),

    1. Find \(P(X > 2)\).

    2. Calculate \(P(1 \leq X \leq 2)\).

    3. Find the mean and standard deviation.

    4. Determine the 75th percentile.

  2. Memoryless Property: A component has exponentially distributed lifetime with mean 8 years.

    1. If the component has already operated for 5 years, what’s the probability it operates for at least 3 more years?

    2. Compare this to the probability that a new component operates for at least 3 years.

    3. Explain why these probabilities are equal.

  3. Parameterization Practice: Convert between parameterizations:

    1. If \(X \sim \text{Exp}(\lambda = 0.25)\), what is the mean parameter \(\mu\)?

    2. If the mean waiting time is 15 minutes, what is the rate parameter per hour?

    3. Express the PDF using both parameterizations for part (b).

  4. Connection to Poisson: If phone calls arrive according to a Poisson process with rate 2 calls per hour,

    1. What distribution models the time between consecutive calls?

    2. What’s the probability of waiting more than 45 minutes between calls?

    3. If a call just ended, what’s the expected time until the next call?

  5. Multiple Components: A system has three independent components with exponential lifetimes: rates 0.1, 0.2, and 0.3 failures per year.

    1. What’s the distribution of the time until the system fails (first component failure)?

    2. What’s the probability the system survives 2 years?

    3. What’s the expected system lifetime?