Worksheet 8: Uniform and Exponential Distributions

Learning Objectives 🎯

  • Master the Uniform distribution for modeling equally likely outcomes over an interval

  • Understand the Exponential distribution for modeling waiting times

  • Apply CDFs to calculate probabilities and percentiles

  • Explore the memoryless property of the Exponential distribution

  • Connect continuous distributions to their discrete counterparts

Introduction

We have previously discussed the general form of probability density functions and cumulative distribution functions. Just as we began with general probability mass functions before exploring special named distributions for discrete random variables, we now turn to widely used named distributions for continuous random variables, which capture common patterns and simplify many analyses. Two fundamental examples are the Uniform distribution, which represents outcomes uniformly distributed over an interval, and the Exponential distribution, which models waiting times between events occurring at a constant average rate.

This worksheet will focus on these two continuous distributions, examining their probability density functions, cumulative distribution functions, and properties such as expected value and variance. We will see how each distribution is applied in practice and how it relates to the general framework introduced earlier.

Part 1: The Uniform Distribution

The Uniform distribution is one of the most straightforward continuous distributions. It arises when the probability density is constant across a specified interval \([a, b]\), meaning that any sub-interval of fixed length within \([a, b]\) has the same probability as any other sub-interval of equal length. This simplicity makes the Uniform distribution a natural starting point for modeling phenomena where each outcome within a range is equally plausible. It is also frequently employed in simulation to generate pseudo-random numbers and serves as a non-informative prior in Bayesian analysis, reflecting no initial preference among values within the specified interval.

A continuous random variable \(X\) that is Uniform on the interval \([a, b]\) has a constant probability density function (PDF) within that range and zero outside it. Formally,

\[\begin{split}f_X(x) = \begin{cases} \frac{1}{b - a}, & a \leq x \leq b \\ 0, & \text{otherwise} \end{cases}\end{split}\]

We denote such a random variable by \(X \sim \text{Uniform}(a, b)\), where \(a\) and \(b\) are the parameters of the distribution and indicate the endpoints of the support. Its corresponding cumulative distribution function (CDF) increases linearly over the support and is given by:

\[\begin{split}F_X(x) = \begin{cases} 0, & x < a \\ \frac{x - a}{b - a}, & a \leq x < b \\ 1, & x \geq b \end{cases}\end{split}\]
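
As a quick sanity check, the piecewise formulas above can be compared with R's built-in `dunif` and `punif`. The sketch below is only illustrative: the endpoints \(a = 2\) and \(b = 6\) and the helper names `unif_pdf` and `unif_cdf` are assumptions made for this example, not part of the worksheet.

```r
# Illustrative endpoints (an assumption for this sketch)
a <- 2; b <- 6

# Piecewise PDF and CDF written directly from the formulas above
unif_pdf <- function(x) ifelse(x >= a & x <= b, 1 / (b - a), 0)
unif_cdf <- function(x) ifelse(x < a, 0, ifelse(x < b, (x - a) / (b - a), 1))

# Points below, inside, on the boundary of, and above the support
x <- c(1, 2, 3.5, 6, 7)
pdf_match <- isTRUE(all.equal(unif_pdf(x), dunif(x, min = a, max = b)))
cdf_match <- isTRUE(all.equal(unif_cdf(x), punif(x, min = a, max = b)))

cat("PDF matches dunif:", pdf_match, "\n")
cat("CDF matches punif:", cdf_match, "\n")
```

Checking points outside the support as well as at the endpoints exercises every branch of the piecewise definitions.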

As before, when dealing with named distributions, we have well-established formulas for their expected values and variances in terms of their parameters. The expected value is simply the midpoint of the support; by symmetry, this point is both the mean and the median of the distribution:

\[E[X] = \frac{a + b}{2}\]

The variance formula is:

\[\text{Var}(X) = \frac{(b - a)^2}{12}\]

and therefore \(\text{SD}(X) = \frac{b - a}{\sqrt{12}}\). These formulas follow from integrating \(x f_X(x)\) and \((x - E[X])^2 f_X(x)\) over the support. They simplify many calculations, allowing us to quickly assess key characteristics of the distribution without starting from first principles every time.
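
The mean and variance formulas can also be checked numerically. The following sketch uses R's `integrate` with illustrative endpoints \(a = 2\) and \(b = 6\) (an assumption for this example) and compares the numerical integrals with the closed-form expressions.

```r
# Illustrative endpoints (an assumption for this sketch)
a <- 2; b <- 6
f <- function(x) dunif(x, min = a, max = b)

# E[X] = integral of x * f(x); Var(X) = integral of (x - E[X])^2 * f(x)
mean_num <- integrate(function(x) x * f(x), lower = a, upper = b)$value
var_num  <- integrate(function(x) (x - mean_num)^2 * f(x), lower = a, upper = b)$value

cat("E[X]:   numeric =", mean_num, " formula =", (a + b) / 2, "\n")
cat("Var(X): numeric =", var_num,  " formula =", (b - a)^2 / 12, "\n")
```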

Question 1: In this question, you will explore additional properties of the uniform distribution. You will see how the length of sub-intervals determines the probability assigned within the support, reflecting the constant nature of the distribution. Moreover, you will learn that percentiles for a uniform distribution are found by solving simple linear equations, leveraging the distribution’s linear cumulative distribution function. Let \(X\) be a continuous Uniform random variable over an unspecified support \([a, b]\).

  1. Use the CDF to determine a formula for computing \(P(X \leq c)\) where \(c\) is some constant in \([a, b]\).

  2. Use the complement rule and the CDF to determine a formula for computing \(P(X > c)\) where \(c\) is some constant in \([a, b]\).

  3. Use properties of CDFs to determine a formula for computing \(P(c \leq X \leq d)\) where \(c, d \in [a, b]\) with \(d > c\).

  4. Use your understanding of conditional probabilities and CDFs to determine a formula for computing \(P(X > d|X > c)\) where \(c, d \in [a, b]\) with \(d > c\).

  5. In words, explain how these formulas illustrate the fact that the Uniform distribution assigns probability in direct proportion to the length of sub-intervals within \([a, b]\).

  6. Given a continuous Uniform random variable \(X\) over the interval \([a, b]\), find the \(p\)-th percentile, \(x_p\) in \([a, b]\) for which \(P(X \leq x_p) = p\). Your answer should be a function of \(a\), \(b\) and \(p\).

  7. Now extend your understanding of the uniform distribution by determining the upper \(p\)-th percentile. In other words, find the value \(x_p\) in \([a, b]\) for which \(P(X > x_p) = p\). Your answer should be a function of \(a\), \(b\) and \((1 - p)\).

Question 2: A company is testing two independent manufacturing processes, \(A\) and \(B\), each with its cost of producing an item modeled by a Uniform distribution. Since lower costs are preferred, management wants to compare these processes under various scenarios:

  • The cost of process \(A\) follows a Uniform distribution \(X \sim \text{Uniform}(10, 30)\)

  • The cost of process \(B\) follows a Uniform distribution \(Y \sim \text{Uniform}(12, 28)\)

  1. Which process is more likely to produce an item costing at most $18?

  2. Determine the 75th percentile for each process.

  3. If you require that at least 75% of your items fall below a certain cost threshold, which process has the lower threshold?

  4. Determine the expected value and variability of each process.

  5. For which probability level \(p\) (with \(0 < p < 1\)) do the two processes yield the same \(p\)-th percentile cost?

  6. What are the advantages and disadvantages of each process? Is one process exclusively better?

R Code for Uniform Distribution:

# Define parameters for processes A and B
a_A <- 10; b_A <- 30
a_B <- 12; b_B <- 28

# Part a: P(X <= 18) for each process
prob_A_18 <- punif(18, min = a_A, max = b_A)
prob_B_18 <- punif(18, min = a_B, max = b_B)

cat("P(Cost_A <= 18) =", prob_A_18, "\n")
cat("P(Cost_B <= 18) =", prob_B_18, "\n")

# Part b: 75th percentile
percentile_75_A <- qunif(0.75, min = a_A, max = b_A)
percentile_75_B <- qunif(0.75, min = a_B, max = b_B)

cat("\n75th percentile for A:", percentile_75_A, "\n")
cat("75th percentile for B:", percentile_75_B, "\n")

# Part d: Expected value and SD
mean_A <- (a_A + b_A) / 2
mean_B <- (a_B + b_B) / 2
sd_A <- (b_A - a_A) / sqrt(12)
sd_B <- (b_B - a_B) / sqrt(12)

cat("\nProcess A: Mean =", mean_A, ", SD =", sd_A, "\n")
cat("Process B: Mean =", mean_B, ", SD =", sd_B, "\n")

# Visualization
x <- seq(5, 35, 0.1)
pdf_A <- dunif(x, min = a_A, max = b_A)
pdf_B <- dunif(x, min = a_B, max = b_B)

plot(x, pdf_A, type = "l", col = "blue", lwd = 2, ylim = c(0, 0.08),
     main = "PDF Comparison: Process A vs B",
     xlab = "Cost ($)", ylab = "Density")
lines(x, pdf_B, col = "red", lwd = 2)
legend("topright", c("Process A", "Process B"),
       col = c("blue", "red"), lwd = 2)

Part 2: The Exponential Distribution

The Exponential distribution is a cornerstone for modeling waiting times or inter-arrival times under a constant average rate. It is closely related to the Poisson distribution: when events occur according to a Poisson process, so that the number of events in a fixed interval is a Poisson random variable, the time between consecutive events follows an Exponential distribution. Furthermore, if we know that exactly \(n\) events occur in a fixed interval \([a, b]\), then each event time is Uniformly distributed within \([a, b]\), and the number of events that land in any sub-interval \([a, c]\) follows a Binomial distribution. Finally, the sum of multiple independent Exponential waiting times with a common rate follows a Gamma distribution, which is another key connection within the family of continuous distributions.

These relationships highlight how the Exponential distribution naturally bridges continuous waiting-time models (Poisson inter-arrivals) with discrete counts (Poisson, Binomial) and uniform spacing. In this worksheet, you will examine the Exponential distribution’s probability density function and cumulative distribution function, along with its defining memoryless property, which distinguishes it from other continuous named distributions.
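
The link between Exponential inter-arrival times and Poisson counts can be illustrated with a small simulation. This is a rough sketch rather than a proof: the rate `lambda = 2`, the window length `t_end = 5`, the number of simulations, and the helper `count_events` are all assumptions chosen for the example.

```r
set.seed(1)            # for reproducibility
lambda <- 2            # illustrative rate: events per unit time (assumption)
t_end  <- 5            # length of the observation window (assumption)
n_sims <- 10000

# Count events in [0, t_end] when inter-arrival times are Exponential(lambda)
count_events <- function() {
  gaps <- rexp(100, rate = lambda)   # 100 gaps is far more than this window needs
  sum(cumsum(gaps) <= t_end)         # arrival times are cumulative sums of gaps
}
counts <- replicate(n_sims, count_events())

cat("Simulated mean count:", mean(counts),
    " Poisson mean:", lambda * t_end, "\n")
cat("Simulated P(count = 10):", mean(counts == 10),
    " dpois(10, 10):", dpois(10, lambda * t_end), "\n")
```

If the connection holds, the simulated counts should behave like draws from a Poisson distribution with mean `lambda * t_end`, up to simulation noise.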

An Exponential random variable \(X\) with rate parameter \(\lambda > 0\) models the waiting time (or inter-arrival time) until an event occurs, assuming events happen at a constant average rate and independently of each other. We denote such a variable as \(X \sim \text{Exp}(\lambda)\). Its probability density function (PDF) is:

\[\begin{split}f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x \geq 0 \\ 0, & x < 0 \end{cases}\end{split}\]

and its corresponding cumulative distribution function (CDF) is:

\[\begin{split}F_X(x) = \begin{cases} 1 - e^{-\lambda x}, & x \geq 0 \\ 0, & x < 0 \end{cases}\end{split}\]

As with other named distributions, the Exponential distribution has well-established formulas for its expected value and variance in terms of its parameter \(\lambda\):

\[ \begin{align}\begin{aligned}E[X] = \frac{1}{\lambda}\\\text{Var}(X) = \frac{1}{\lambda^2}\end{aligned}\end{align} \]

Consequently, the standard deviation is simply \(\frac{1}{\lambda}\). These results simplify many calculations and allow us to quickly assess the core characteristics of an Exponential random variable without deriving everything from first principles each time.
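
As with the Uniform case, these formulas can be checked by numerical integration. The sketch below assumes an illustrative rate \(\lambda = 0.5\) (not taken from the worksheet), for which the formulas give a mean of 2 and a variance of 4.

```r
lambda <- 0.5   # illustrative rate (assumption)
f <- function(x) dexp(x, rate = lambda)

# Integrate over the full support [0, Inf)
mean_num <- integrate(function(x) x * f(x), lower = 0, upper = Inf)$value
var_num  <- integrate(function(x) (x - mean_num)^2 * f(x), lower = 0, upper = Inf)$value

cat("E[X]:   numeric =", mean_num, " formula =", 1 / lambda, "\n")
cat("Var(X): numeric =", var_num,  " formula =", 1 / lambda^2, "\n")
```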

Sometimes we reparameterize the Exponential distribution in terms of its mean \(\mu = \frac{1}{\lambda}\) and write \(X \sim \text{Exp}(\mu)\). Under this parameterization, the probability density function (PDF) becomes:

\[\begin{split}f_X(x) = \begin{cases} \frac{1}{\mu} e^{-\frac{x}{\mu}}, & x \geq 0 \\ 0, & x < 0 \end{cases}\end{split}\]

In practical settings, \(\mu\) can be more intuitive: if events occur on average every 10 hours, it might feel more direct to say \(\mu = 10\) than to use \(\lambda = 0.1\), which means that on average 0.1 events happen every hour. Both forms describe the same Exponential distribution, but choosing the parameter that matches the natural measure of your scenario can simplify modeling and interpretation.
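
In R, the exponential functions (`dexp`, `pexp`, `qexp`, `rexp`) take a `rate` argument, so a mean must be converted with `rate = 1 / mu`. The short sketch below uses \(\mu = 10\) hours from the example above; the evaluation point of 5 hours is an assumption chosen only for illustration.

```r
mu     <- 10        # mean waiting time in hours (from the example above)
lambda <- 1 / mu    # equivalent rate: 0.1 events per hour

# P(X <= 5) computed two ways: via the rate in pexp, and via the mean form of the CDF
p_rate <- pexp(5, rate = lambda)
p_mean <- 1 - exp(-5 / mu)

cat("Using rate:", p_rate, " Using mean:", p_mean, "\n")
```

Passing `mu` directly as `rate` would silently describe a different distribution with mean \(1/\mu\), so it is worth stating the parameterization explicitly in code.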

Question 3: In this question, you will explore additional properties of the Exponential distribution.

  1. Use the CDF and complement rule to derive a formula for \(P(X > c)\) for any \(c \geq 0\). In other words, express the probability that \(X\) exceeds a particular value \(c\) in terms of \(\lambda\) and \(c\). This probability is often referred to as a tail probability.

  2. A hallmark of the Exponential distribution is its memorylessness: once a certain amount of time has passed without an event, the additional waiting time until that event occurs has the same distribution as the original waiting time. This property is useful for modeling processes like time-to-failure of components where the past does not influence the future. Using properties of conditional probabilities and the tail probability formula, show that \(X\) satisfies the memoryless property:

    \[P(X > t|X > s) = P(X > t - s), \quad \text{for any } 0 \leq s < t\]
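
Before attempting the derivation, it can help to see the memoryless property empirically. The simulation below is a sketch, not a proof: the rate and the values of \(s\) and \(t\) (stored as `s_val` and `t_val`) are assumptions chosen for illustration.

```r
set.seed(42)                    # for reproducibility
lambda <- 0.2                   # illustrative rate (assumption)
s_val  <- 3; t_val <- 8         # illustrative times with 0 <= s < t (assumption)

x <- rexp(200000, rate = lambda)

# Among draws that exceeded s, the fraction that also exceeded t
cond_est <- mean(x[x > s_val] > t_val)
# Among all draws, the fraction that exceeded t - s
uncond_est <- mean(x > t_val - s_val)

cat("Estimate of P(X > t | X > s):", cond_est, "\n")
cat("Estimate of P(X > t - s):    ", uncond_est, "\n")
```

The two estimates should agree up to simulation noise, which is what the identity above predicts.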

Question 4: An elevator in the Purdue Math building experiences breakdowns with a constant average rate of 1 breakdown every 90 days. Let \(X\) denote the time (in days) until the next breakdown. We assume \(X\) follows an Exponential distribution.

  1. Determine the probability that the elevator goes at least 60 days without breaking down.

  2. Suppose the elevator has already run for 30 days without breaking down. Determine the probability of running an additional 20 days without a breakdown.

  3. Determine the top 10th percentile time, that is, the time that only 10% of waiting times until a breakdown exceed.

  4. Maintenance will observe the elevator over a 180-day span. Let \(Y\) denote the number of breakdowns in that period. Determine the probability that there are at least 3 breakdowns over this period.

R Code for Exponential Distribution:

# Parameters
mean_time <- 90  # days
lambda <- 1/mean_time

# Part a: P(X > 60)
prob_60 <- pexp(60, rate = lambda, lower.tail = FALSE)
# or equivalently
prob_60_alt <- exp(-lambda * 60)
cat("P(X > 60) =", prob_60, "\n")

# Part b: P(X > 50 | X > 30) using memoryless property
# This equals P(X > 20)
prob_additional_20 <- pexp(20, rate = lambda, lower.tail = FALSE)
cat("P(additional 20 days | 30 days passed) =", prob_additional_20, "\n")

# Part c: Top 10th percentile (90th percentile of distribution)
top_10_percentile <- qexp(0.9, rate = lambda)
cat("Top 10th percentile:", top_10_percentile, "days\n")

# Part d: Number of breakdowns in 180 days follows Poisson
lambda_poisson <- 180 / mean_time  # Expected breakdowns in 180 days
prob_at_least_3 <- ppois(2, lambda = lambda_poisson, lower.tail = FALSE)
cat("P(Y >= 3) =", prob_at_least_3, "\n")

# Visualization of PDF and CDF
par(mfrow = c(1, 2))

# PDF
x <- seq(0, 300, 1)
pdf_vals <- dexp(x, rate = lambda)  # named pdf_vals to avoid masking R's pdf() device function
plot(x, pdf_vals, type = "l", main = "Exponential PDF (λ = 1/90)",
     xlab = "Days", ylab = "Density", lwd = 2)
abline(v = mean_time, col = "red", lty = 2)
text(mean_time + 10, 0.008, "Mean = 90", col = "red")

# CDF
cdf <- pexp(x, rate = lambda)
plot(x, cdf, type = "l", main = "Exponential CDF",
     xlab = "Days", ylab = "Cumulative Probability", lwd = 2)
abline(h = 0.5, col = "blue", lty = 2)
abline(v = qexp(0.5, rate = lambda), col = "blue", lty = 2)
text(70, 0.55, "Median", col = "blue")

Key Takeaways

Summary 📝

  • Uniform distribution assigns equal probability density over \([a, b]\)

  • Uniform CDF is linear: \(F_X(x) = \frac{x - a}{b - a}\) for \(x \in [a, b]\)

  • Exponential distribution models waiting times with constant rate \(\lambda\)

  • Exponential has the memoryless property: \(P(X > s + t|X > s) = P(X > t)\)

  • Connection: If events follow a Poisson process with rate \(\lambda\), inter-arrival times are Exponential(\(\lambda\))

  • Both distributions have simple formulas for mean, variance, and percentiles