.. _6-5-uniform-distribution: .. raw:: html
Uniform Distribution ================================== After exploring the mathematical elegance and complexity of the normal distribution, we now turn to perhaps the simplest of all continuous distributions: the uniform distribution. .. admonition:: Road Map 🧭 :class: important • Understand the **uniform distribution** as representing constant probability density over a fixed interval. • Develop **geometric intuition** for probability as rectangular areas. • Learn the **mathematical formulation** of uniform PDF and CDF, and explore the key properties. The Essence of Uniform Randomness ------------------------------------ A uniform distribution assigns equal probability density to every point within its support interval :math:`[a, b]` for some :math:`a < b`. The Uniform PDF ~~~~~~~~~~~~~~~~~~~ A continuous random variable :math:`X` follows a uniform distribution on the interval :math:`[a, b]` if its probability density function is: .. math:: f_X(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \leq x \leq b \\ 0 & \text{elsewhere} \end{cases} We write :math:`X \sim \text{Uniform}(a,b)` or sometimes :math:`X \sim U(a,b)`. .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/uniform-pdf.png :alt: The PDF plot of a uniform distribution :align: center :figwidth: 60% A Uniform PDF **Determining the Constant Height** The constant :math:`\frac{1}{b-a}` isn't arbitrary—it's the unique value that makes the PDF valid. Since the distribution forms a rectangle with base :math:`(b-a)` and height :math:`c`, the total area is :math:`\text{Area} = \text{base} \times \text{height} = (b-a) \times c`. For this to equal 1, we need: .. math:: (b-a) \times c = 1 \implies c = \frac{1}{b-a} This geometric argument shows why the height must be the reciprocal of the interval length. The Uniform CDF ~~~~~~~~~~~~~~~~~ The cumulative distribution function for a uniform distribution increases linearly from 0 to 1 over the supprt interval: .. math:: F_X(x) = \begin{cases} 0 & \text{for } x < a \\ \frac{x-a}{b-a} & \text{for } a \leq x < b \\ 1 & \text{for } x \geq b \end{cases} .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/chapter6/uniform-cdf.png :alt: The CDF plot of a uniform distribution :align: center :figwidth: 60% A Uniform CDF **Deriving the CDF** * Before the rectangle begins, zero area is accummulated on the PDF. Thus the CDF is 0 for :math:`x < a`. * After the rectangle is complete, all existing area in the PDF has been accummulated. The CDF is 1 for :math:`x \geq b`. * For an :math:`x` between :math:`a` and :math:`b`, the CDF represents the area under the PDF from :math:`-\infty` to :math:`x`. Since the PDF is zero before point :math:`a` and constant at height :math:`\frac{1}{b-a}` from :math:`a` to :math:`x`, this area forms a rectangle: .. math:: F_X(x) = \text{width} \times \text{height} = (x-a) \times \frac{1}{b-a} = \frac{x-a}{b-a}. The Length Principle --------------------------- Since the uniform PDF has constant height :math:`\frac{1}{b-a}` across its support, the probability of any interval equals the area of the corresponding rectangle: .. math:: P(c \leq X \leq d) &= \text{width} \times \text{height} = (d-c) \times \frac{1}{b-a} \\ &= \frac{d-c}{b-a} =\frac{\text{length of interval of interest}}{\text{total length of support}} Notice that this probability depends only on the interval length :math:`(d-c)` and the total support length :math:`(b-a)`. The specific values of :math:`c` and :math:`d` don't matter—only how far apart they are. .. admonition:: Example💡: Unifrom Probabilities Consider :math:`X \sim \text{Uniform}(0, 10)`. 1. Find :math:`P(1 \leq X \leq 3)`. .. math:: P(1 \leq X \leq 3) = \frac{3-1}{10-0} = \frac{2}{10} = 0.2 2. Find :math:`P(4.7 \leq X \leq 6.7)`. .. math:: P(4.7 \leq X \leq 6.7) = \frac{6.7-4.7}{10-0} = \frac{2}{10} = 0.2 Note that the two intervals in #1 and #2 have the same length. they also share the same probability. 3. Find :math:`P(8.5 \leq X \leq 10.5)`. .. math:: P(8.5 \leq X \leq 10.5) &= P(8.5 \leq X \leq 10) + P(10 < X \leq 10.5)\\ &= \frac{1.5}{10} + 0 = 0.15 🛑 The interval in this probability statement extends beyond the support. Our calculation must reflect the fact that the probability of :math:`X` taking any value greater than 10 is zero. This is an important example that highlights the role of the support in probability calculations. Expected Value and Variance of Uniform Distribution ----------------------------------------------------------------- The uniform distribution's symmetry and rectangular shape make its expected value geometrically obvious, while its variance can be computed through either geometric reasoning or algebraic integration. Expected Value: The Midpoint ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ By symmetry, the expected value of a uniform distribution must lie at the center of its support interval: .. math:: E[X] = \mu_X = \frac{a+b}{2} **Algebraic Verification of Expected Value** We can confirm this geometric insight through integration: .. math:: E[X] &= \int_{-\infty}^{\infty} x \cdot f_X(x) \, dx = \int_a^b x \cdot \frac{1}{b-a} \, dx\\ &=\frac{1}{b-a} \int_a^b x \, dx = \frac{1}{b-a} \cdot \frac{x^2}{2}\Big|_a^b = \frac{1}{b-a} \cdot \frac{b^2-a^2}{2}\\ &= \frac{1}{b-a} \cdot \frac{(b-a)(b+a)}{2} = \frac{b+a}{2}✔ Variance ~~~~~~~~~~~~ For variance, we'll use the computational formula :math:`\text{Var}(X) = E[X^2] - (E[X])^2`. We already know :math:`E[X]`, so we need to find :math:`E[X^2]`: .. math:: E[X^2] &= \int_a^b x^2 \cdot \frac{1}{b-a} \, dx = \frac{1}{b-a} \int_a^b x^2 \, dx\\ &= \frac{1}{b-a} \cdot \frac{x^3}{3}\Big|_a^b = \frac{1}{b-a} \cdot \frac{b^3-a^3}{3} Now we use the algebraic identity :math:`b^3 - a^3 = (b-a)(b^2 + ab + a^2)`: .. math:: E[X^2] = \frac{1}{b-a} \cdot \frac{(b-a)(b^2 + ab + a^2)}{3} = \frac{b^2 + ab + a^2}{3} Finally, we compute the variance: .. math:: \text{Var}(X) &= E[X^2] - (E[X])^2 = \frac{b^2 + ab + a^2}{3} - \left(\frac{a+b}{2}\right)^2\\ &= \frac{b^2 + ab + a^2}{3} - \frac{a^2 + 2ab + b^2}{4}\\ &= \frac{4b^2 + 4ab + 4a^2 - 3a^2 - 6ab - 3b^2}{12}\\ &= \frac{b^2 - 2ab + a^2}{12} = \frac{(b-a)^2}{12} **The Result** The variance and standard deviation of a uniform distribution is: .. math:: \text{Var}(X) = \frac{(b-a)^2}{12} \quad \text{ and } \quad \sigma_X = \sqrt{\text{Var}(X)} = \frac{b-a}{\sqrt{12}} The variance depends only on the length of the support interval :math:`(b-a)`—wider intervals create more variability. .. Percentiles of Uniform Random Variables ------------------------------------------------ The linear nature of the uniform CDF makes percentile calculations remarkably straightforward. Unlike normal distributions that require tables or complex approximations, uniform percentiles can be found through simple algebra. **The Percentile Formula** To find the :math:`p` th percentile :math:`x_p` of :math:`X \sim \text{Uniform}(a,b)`, we solve: .. math:: P(X \leq x_p) = p Using the CDF: .. math:: F_X(x_p) = \frac{x_p - a}{b - a} = p Solving for :math:`x_p`: .. math:: x_p - a = p(b - a) .. math:: x_p = a + p(b - a) **Interpreting the Percentile Formula** This elegant formula shows that percentiles are **linear interpolations** between the endpoints: - When :math:`p = 0`: :math:`x_0 = a + 0(b-a) = a` (the minimum) - When :math:`p = 0.5`: :math:`x_{0.5} = a + 0.5(b-a) = \frac{a+b}{2}` (the median = mean) - When :math:`p = 1`: :math:`x_1 = a + 1(b-a) = b` (the maximum) For any value of :math:`p` between 0 and 1, we simply move that fraction of the way from :math:`a` to :math:`b`. .. admonition:: Example💡: Expectation, SD, and Percentile of a Uniform Random Variable :class: note Suppose a factory produces metal bolts with diameters that fall between 10 mm and 30 mm evenly. Find the expected value, standard deviation, and the 25th percentile of the diameters. Let :math:`X` denote the diameters of metal bolts. :math:`X \sim \text{Uniform}(10, 30)`. * :math:`E[X] = \frac{10+30}{2} = 20` * :math:`\sigma_x = \frac{30-10}{\sqrt{12}} = 5.7735` * For the 25th percentile :math:`x_{0.25}`, we must solve :math:`F_X(x_{0.25})=0.25`. .. math:: \frac{x_{0.25}-a}{b-a} =0.25 \implies x_{0.25} = 10 + 0.25(30-10) = 10 + 5 = 15 25% of the bolts produced in this factory has a diameter less than or equal to 15 mm. .. Real-World Applications and Modeling ------------------------------------ Despite its mathematical simplicity, the uniform distribution provides valuable models for numerous real-world phenomena where outcomes are equally likely across a range. **Manufacturing and Quality Control** Manufacturing processes often aim for uniform distributions in their output. Consider a machine that cuts metal rods to a target length of 100 cm. If the machine is properly calibrated but subject to random variation, the actual lengths might follow :math:`X \sim \text{Uniform}(99.5, 100.5)` cm. For such a process: - Expected length: :math:`E[X] = \frac{99.5 + 100.5}{2} = 100` cm (on target) - Standard deviation: :math:`\sigma_X = \frac{100.5 - 99.5}{\sqrt{12}} = \frac{1}{\sqrt{12}} \approx 0.289` cm - Probability within specifications [99.8, 100.2]: :math:`P(99.8 \leq X \leq 100.2) = \frac{100.2 - 99.8}{100.5 - 99.5} = \frac{0.4}{1} = 0.4` **Random Number Generation** Computer algorithms for generating random numbers typically start with uniform distributions. A "random number generator" on [0,1] produces values that should be uniformly distributed across this interval. All other probability distributions can then be constructed from uniform random variables through appropriate transformations. **Bayesian Statistics: Non-informative Priors** In Bayesian analysis, uniform distributions often serve as "non-informative" or "flat" priors when we have no reason to prefer some values over others within a reasonable range. For example, if we're estimating a parameter known to lie between 0 and 1 but have no prior information favoring any particular value, a :math:`\text{Uniform}(0,1)` prior reflects this state of ignorance. **Arrival Time Modeling** When events are scheduled to occur at random times within fixed intervals, uniform distributions provide natural models. If a bus is equally likely to arrive at any time during a 10-minute window, the arrival time follows a uniform distribution over that interval. **Measurement Rounding Errors** When continuous measurements are rounded to the nearest unit, the rounding errors follow approximately uniform distributions. If weights are recorded to the nearest pound, the actual weight could be anywhere within ±0.5 pounds of the recorded value, suggesting a uniform distribution for the error. Connecting Uniform Distributions to Other Concepts -------------------------------------------------- The uniform distribution's simplicity makes it an excellent bridge to understanding more complex probability concepts. **Relationship to Discrete Uniform Distributions** The continuous uniform distribution is the limiting case of discrete uniform distributions as the number of equally likely outcomes increases while their spacing decreases. A discrete uniform distribution on {1, 2, 3, ..., n} approaches a continuous uniform distribution on [1, n] as we add more and more intermediate values. **Connection to Probability Integral Transformation** One of the most important theoretical results in probability states that if :math:`X` has cumulative distribution function :math:`F_X`, then :math:`F_X(X)` follows a :math:`\text{Uniform}(0,1)` distribution. This connection allows us to: 1. Generate random variables from any distribution using uniform random numbers 2. Transform any continuous random variable into a uniform random variable 3. Test whether data follows a specified distribution by checking if the transformed values are uniform **Building Blocks for Other Distributions** Many other continuous distributions can be constructed from uniform distributions: - **Triangular distributions**: The sum of two independent uniform random variables follows a triangular distribution - **Beta distributions**: Ratios and products of uniform random variables can produce beta distributions under certain conditions - **Exponential distributions**: The transformation :math:`Y = -\ln(U)` converts a :math:`\text{Uniform}(0,1)` variable :math:`U` into an exponential distribution **Simulation and Monte Carlo Methods** Uniform random variables form the foundation of Monte Carlo simulation methods. By transforming uniform random numbers appropriately, we can simulate observations from virtually any probability distribution, enabling statistical analysis of complex systems that are difficult to study analytically. Properties Summary ----------------------------------- For quick reference, here are the essential properties of the uniform distribution: **Notation**: :math:`X \sim \text{Uniform}(a,b)` or :math:`X \sim U(a,b)` **Parameters**: :math:`a` (lower bound) and :math:`b` (upper bound), where :math:`a < b` **Support**: :math:`[a, b]` **PDF**: :math:`f_X(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \leq x \leq b \\ 0 & \text{elsewhere} \end{cases}` **CDF**: :math:`F_X(x) = \begin{cases} 0 & \text{for } x < a \\ \frac{x-a}{b-a} & \text{for } a \leq x < b \\ 1 & \text{for } x \geq b \end{cases}` **Expected Value**: :math:`E[X] = \frac{a+b}{2}` **Variance**: :math:`\text{Var}(X) = \frac{(b-a)^2}{12}` **Standard Deviation**: :math:`\sigma_X = \frac{b-a}{\sqrt{12}}` Bringing It All Together ------------------------ .. admonition:: Key Takeaways 📝 :class: important 1. The **uniform distribution** represents constant probability density across a fixed interval, making it the simplest continuous distribution. 2. **Only interval lengths matter** for uniform probability calculations—location within the support is irrelevant. 3. The **PDF height** must be :math:`\frac{1}{b-a}` to ensure the rectangular area equals 1. 4. The **CDF increases linearly** from 0 to 1 across the support. 5. **Expected value** equals the midpoint :math:`\frac{a+b}{2}`, and **variance** equals :math:`\frac{(b-a)^2}{12}`. Exercises ~~~~~~~~~~~~ 1. **Basic Properties**: For :math:`X \sim \text{Uniform}(2, 8)`: a) Find :math:`P(3 \leq X \leq 5)` b) Find :math:`P(X > 7)` c) Find the median and verify it equals the mean d) Calculate the variance and standard deviation 2. **Manufacturing Application**: A cutting machine produces rods with lengths uniformly distributed between 49.8 and 50.2 cm. a) What's the probability a rod meets specifications [49.9, 50.1] cm? b) Find the 90th percentile of rod lengths c) If specifications are tightened to [49.95, 50.05] cm, what percentage will meet the new requirements? 3. **Conditional Probability**: For :math:`X \sim \text{Uniform}(0, 10)`: a) Find :math:`P(X > 7 | X > 5)` b) Find :math:`P(2 < X < 6 | X < 8)` c) Explain how the length principle applies to conditional probabilities 4. **Finding parameters**: If a uniform distribution has a mean of 15 and a variance of 12, find the parameters :math:`a` and :math:`b`. .. Working with Uniform Distributions in R ----------------------------------------- R provides built-in functions for working with uniform distributions, making calculations much easier than manual integration. The uniform distribution's simplicity translates into particularly straightforward R implementations. **The Four Essential R Functions** R follows its standard naming convention for the uniform distribution: - **runif()**: Generates random samples from a uniform distribution - **dunif()**: Calculates the probability density function (PDF values) - **punif()**: Calculates the cumulative distribution function (CDF values) - **qunif()**: Finds quantiles (percentiles) **Basic Function Usage** .. code-block:: r # Generate random samples from Uniform(5, 15) set.seed(123) random_values <- runif(n = 10, min = 5, max = 15) print(round(random_values, 2)) # Calculate density (constant within support) density_at_8 <- dunif(x = 8, min = 5, max = 15) print(paste("Density at x = 8:", density_at_8)) # Should be 1/10 = 0.1 # Calculate cumulative probabilities prob_less_than_12 <- punif(q = 12, min = 5, max = 15) print(paste("P(X ≤ 12) =", prob_less_than_12)) # Should be 7/10 = 0.7 # Find percentiles median_value <- qunif(p = 0.5, min = 5, max = 15) print(paste("50th percentile (median):", median_value)) # Should be 10 **Calculating Probabilities** .. code-block:: r # For X ~ Uniform(0, 20), calculate various probabilities # P(5 ≤ X ≤ 12) = P(X ≤ 12) - P(X ≤ 5) prob_between <- punif(12, min = 0, max = 20) - punif(5, min = 0, max = 20) print(paste("P(5 ≤ X ≤ 12) =", prob_between)) # P(X > 15) using lower.tail = FALSE prob_greater_15 <- punif(15, min = 0, max = 20, lower.tail = FALSE) print(paste("P(X > 15) =", prob_greater_15)) # Find the 90th percentile percentile_90 <- qunif(0.9, min = 0, max = 20) print(paste("90th percentile:", percentile_90)) **Applied Example: Manufacturing Quality Control** .. code-block:: r # Machine produces items with lengths Uniform(99.5, 100.5) cm # Specifications require lengths between 99.8 and 100.2 cm # Calculate probability of meeting specifications prob_in_spec <- punif(100.2, 99.5, 100.5) - punif(99.8, 99.5, 100.5) print(paste("Probability meeting specs:", round(prob_in_spec, 3))) # Simulate daily production set.seed(789) daily_lengths <- runif(n = 200, min = 99.5, max = 100.5) # Count items meeting specifications in_spec_count <- sum(daily_lengths >= 99.8 & daily_lengths <= 100.2) print(paste("Items in spec:", in_spec_count, "out of 200")) print(paste("Percentage in spec:", round(in_spec_count/200 * 100, 1), "%")) # Visualize production production_data <- data.frame(length = daily_lengths) ggplot(production_data, aes(x = length)) + geom_histogram(bins = 25, fill = "lightgray", color = "black", alpha = 0.7) + geom_vline(xintercept = c(99.8, 100.2), color = "red", linetype = "dashed", size = 1) + geom_vline(xintercept = c(99.5, 100.5), color = "blue", linetype = "solid", size = 1) + labs(title = "Daily Production Lengths", x = "Length (cm)", y = "Count") + theme_minimal() + annotate("text", x = 100, y = 15, label = "Specification\nLimits", color = "red") + annotate("text", x = 99.6, y = 10, label = "Production\nRange", color = "blue")