.. _monte-carlo-fundamentals:

Monte Carlo Fundamentals
========================

Monte Carlo methods transform intractable analytical problems into computational experiments, using randomness as a fundamental tool for numerical calculation. Named after the famous casino in Monaco, these methods leverage the law of large numbers to solve problems ranging from multidimensional integration to optimization, revolutionizing fields from physics to finance.

At their core, Monte Carlo methods represent quantities of interest as expectations with respect to probability distributions, then approximate these expectations through repeated random sampling. This probabilistic approach often succeeds where deterministic methods fail, particularly in high-dimensional spaces where the curse of dimensionality renders traditional numerical methods impractical.

.. admonition:: Road Map 🧭
   :class: important

   • **Understand** the mathematical foundations of Monte Carlo methods and their convergence properties
   • **Master** the fundamental Monte Carlo estimator and its statistical properties
   • **Apply** Monte Carlo integration to compute probabilities and expectations
   • **Analyze** convergence rates and error bounds using the Central Limit Theorem

What Are Monte Carlo Methods?
-----------------------------

A Monte Carlo method is any computational technique that uses random sampling to solve mathematical problems that might be deterministic in nature.

Formal Definition
~~~~~~~~~~~~~~~~~

**Definition**: A Monte Carlo method is any computational technique that represents a quantity of interest as an expectation with respect to a specified probability law and then approximates that quantity using repeated pseudo-random draws from that law, with guarantees supplied by probabilistic limit theorems.

Mathematically, if the quantity of interest can be expressed as

.. math::

   I = E[h(X)] = \int h(x) f_X(x)\, dx

where :math:`f_X` is a probability density, then the Monte Carlo estimator based on :math:`n` samples :math:`x_1, x_2, \ldots, x_n` drawn i.i.d. from the distribution of :math:`X` is

.. math::

   \hat{I}_n = \frac{1}{n} \sum_{i=1}^{n} h(x_i)

**Convergence**: By the Strong Law of Large Numbers,

.. math::

   \hat{I}_n \xrightarrow{a.s.} I \quad \text{as } n \to \infty

**Error Analysis**: By the Central Limit Theorem, if :math:`\operatorname{Var}[h(X)] < \infty`,

.. math::

   \sqrt{n}(\hat{I}_n - I) \xrightarrow{d} N(0, \sigma^2)

where :math:`\sigma^2 = \operatorname{Var}[h(X)]`.

Python Implementation
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   import numpy as np
   import matplotlib.pyplot as plt
   from scipy import stats


   class MonteCarloEstimator:
       """
       Generic Monte Carlo estimator with convergence diagnostics.
       """

       def __init__(self, h_func, sampler, true_value=None):
           """
           Parameters
           ----------
           h_func : callable
               Function h in E[h(X)]
           sampler : callable
               Function that returns samples from the distribution of X
           true_value : float, optional
               True value of E[h(X)] if known
           """
           self.h = h_func
           self.sampler = sampler
           self.true_value = true_value

           # Storage for convergence analysis
           self.estimates = []
           self.sample_sizes = []
           self.std_errors = []

       def estimate(self, n_samples, batch_size=100):
           """
           Compute the Monte Carlo estimate with batch updates.

           Parameters
           ----------
           n_samples : int
               Total number of samples
           batch_size : int
               Size of batches for updating estimates

           Returns
           -------
           estimate : float
               Final Monte Carlo estimate
           std_error : float
               Standard error of the estimate
           """
           n_batches = n_samples // batch_size
           current_sum = 0.0
           current_sum_sq = 0.0

           for batch in range(n_batches):
               # Generate a batch of samples
               samples = np.array([self.sampler() for _ in range(batch_size)])
               h_values = np.array([self.h(x) for x in samples])

               # Update running sums
               current_sum += np.sum(h_values)
               current_sum_sq += np.sum(h_values**2)

               # Current estimates
               n_current = (batch + 1) * batch_size
               current_mean = current_sum / n_current
               current_var = (current_sum_sq / n_current) - current_mean**2
               current_se = np.sqrt(current_var / n_current)

               # Store for convergence analysis
               self.estimates.append(current_mean)
               self.sample_sizes.append(n_current)
               self.std_errors.append(current_se)

           return current_mean, current_se

       def plot_convergence(self):
           """Visualize convergence of the Monte Carlo estimates."""
           fig, axes = plt.subplots(1, 2, figsize=(12, 5))

           # Left: estimate vs sample size
           axes[0].plot(self.sample_sizes, self.estimates, 'b-', alpha=0.7)
           if self.true_value is not None:
               axes[0].axhline(self.true_value, color='r', linestyle='--',
                               label='True value')

           # Add confidence bands
           estimates = np.array(self.estimates)
           std_errors = np.array(self.std_errors)
           axes[0].fill_between(self.sample_sizes,
                                estimates - 1.96 * std_errors,
                                estimates + 1.96 * std_errors,
                                alpha=0.2, label='95% CI')

           axes[0].set_xlabel('Number of samples')
           axes[0].set_ylabel('Estimate')
           axes[0].set_title('Monte Carlo Convergence')
           axes[0].set_xscale('log')
           axes[0].legend()
           axes[0].grid(True, alpha=0.3)

           # Right: standard error vs sample size
           axes[1].loglog(self.sample_sizes, std_errors, 'b-', alpha=0.7)

           # Theoretical 1/√n rate
           n_array = np.array(self.sample_sizes)
           theoretical_se = std_errors[0] * np.sqrt(n_array[0] / n_array)
           axes[1].loglog(n_array, theoretical_se, 'r--',
                          label='1/√n rate', alpha=0.7)

           axes[1].set_xlabel('Number of samples')
           axes[1].set_ylabel('Standard error')
           axes[1].set_title('Standard Error Decay')
           axes[1].legend()
           axes[1].grid(True, alpha=0.3)

           plt.tight_layout()
           plt.show()
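Before moving on, it helps to see the estimator class in use. The snippet below is a minimal usage sketch rather than part of the reference implementation; the target :math:`E[X^2] = 1` for :math:`X \sim N(0, 1)` is chosen purely because the exact answer is known.

.. code-block:: python

   # Minimal usage sketch for MonteCarloEstimator.
   # Illustrative target: E[X^2] = 1 for X ~ N(0, 1).
   np.random.seed(0)

   estimator = MonteCarloEstimator(
       h_func=lambda x: x**2,               # h(x) = x^2
       sampler=lambda: np.random.normal(),  # X ~ N(0, 1)
       true_value=1.0,                      # known answer, for comparison
   )

   est, se = estimator.estimate(n_samples=20000, batch_size=200)
   print(f"E[X^2] estimate: {est:.4f} ± {1.96 * se:.4f}")

   # Visual check that the error shrinks at the 1/√n rate
   estimator.plot_convergence()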
Historical Context
~~~~~~~~~~~~~~~~~~

Monte Carlo methods emerged from the Manhattan Project in the 1940s, where Stanislaw Ulam, John von Neumann, and Nicholas Metropolis developed these techniques to study neutron diffusion. The evolution includes:

- **1777**: Buffon's needle problem - the earliest documented use of a random experiment to estimate π
- **1940s**: Modern Monte Carlo born at Los Alamos
- **1953**: Metropolis algorithm - foundation of MCMC
- **1970s**: Development of variance reduction techniques
- **1990s**: Sequential Monte Carlo (particle filters)
- **2000s**: Quantum Monte Carlo advances
- **2020s**: Neural network-enhanced Monte Carlo methods

.. admonition:: Example 💡: Buffon's Needle Problem
   :class: note

   **Given**: Parallel lines a distance :math:`d` apart and a needle of length :math:`l < d`

   **Find**: The probability that the needle crosses a line when dropped at random

   **Solution**: Mathematical analysis yields

   .. math::

      P(\text{crosses}) = \frac{2l}{\pi d}

   Monte Carlo simulation:

   .. code-block:: python

      def buffon_needle_simulation(l=0.5, d=1.0, n_drops=10000):
          """Estimate π using Buffon's needle."""
          crosses = 0
          for _ in range(n_drops):
              # Random distance from the needle center to the nearest line
              y = np.random.uniform(0, d/2)
              # Random angle of the needle
              theta = np.random.uniform(0, np.pi)
              # Check whether the needle crosses a line
              if y <= (l/2) * np.sin(theta):
                  crosses += 1

          # Estimate π from the crossing probability
          p_cross = crosses / n_drops
          pi_estimate = 2 * l / (d * p_cross) if p_cross > 0 else np.inf
          return pi_estimate, p_cross

      # Run the simulation
      np.random.seed(42)
      pi_est, p_cross = buffon_needle_simulation(n_drops=100000)
      print(f"π estimate: {pi_est:.6f}")
      print(f"True π:     {np.pi:.6f}")
      print(f"Error:      {abs(pi_est - np.pi):.6f}")

   **Result**: Monte Carlo recovers π ≈ 3.14159 from a purely geometric random experiment.

Computing Probabilities as Expectations
---------------------------------------

Any probability can be expressed as an expectation using indicator functions, making Monte Carlo methods universally applicable for probability estimation.

Indicator Function Approach
~~~~~~~~~~~~~~~~~~~~~~~~~~~

For any event :math:`A`,

.. math::

   P(A) = E[1_A(X)] = \int 1_A(x) f_X(x)\, dx

where :math:`1_A` is the indicator function of :math:`A`.

.. code-block:: python

   def monte_carlo_probability(event_func, sampler, n_samples=10000):
       """
       Estimate a probability using Monte Carlo.

       Parameters
       ----------
       event_func : callable
           Function that returns True if the event occurs
       sampler : callable
           Function that generates samples from the distribution
       n_samples : int
           Number of Monte Carlo samples

       Returns
       -------
       p_estimate : float
           Estimated probability
       std_error : float
           Standard error of the estimate
       """
       indicators = np.array([event_func(sampler()) for _ in range(n_samples)])
       p_estimate = np.mean(indicators)

       # Standard error of a Bernoulli proportion
       std_error = np.sqrt(p_estimate * (1 - p_estimate) / n_samples)

       return p_estimate, std_error

   # Example: P(X + Y > 1) where X, Y ~ Uniform(0, 1)
   def sampler_xy():
       return np.random.uniform(0, 1, 2)

   def event_sum_greater_one(xy):
       return xy[0] + xy[1] > 1

   p_est, se = monte_carlo_probability(event_sum_greater_one, sampler_xy, 50000)
   print(f"P(X + Y > 1) ≈ {p_est:.4f} ± {1.96 * se:.4f}")
   print("True value: 0.5000")

Monte Carlo Integration
-----------------------

Monte Carlo integration excels where traditional quadrature fails: high dimensions and complex integration regions.

Basic Monte Carlo Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To compute

.. math::

   I = \int_a^b g(x)\, dx

we can write

.. math::

   I = (b - a) \int_a^b g(x) \frac{1}{b - a}\, dx = (b - a) \cdot E[g(U)]

where :math:`U \sim \text{Uniform}(a, b)`.

.. code-block:: python

   def monte_carlo_integrate(g, a, b, n_samples=10000):
       """
       Monte Carlo integration over [a, b].

       Parameters
       ----------
       g : callable
           Function to integrate (vectorized over NumPy arrays)
       a, b : float
           Integration bounds
       n_samples : int
           Number of samples

       Returns
       -------
       integral : float
           Estimated integral
       std_error : float
           Standard error
       """
       # Sample uniformly from [a, b]
       x = np.random.uniform(a, b, n_samples)

       # Evaluate the integrand
       g_values = g(x)

       # Monte Carlo estimate
       integral = (b - a) * np.mean(g_values)

       # Standard error
       variance = np.var(g_values, ddof=1)
       std_error = (b - a) * np.sqrt(variance / n_samples)

       return integral, std_error
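As a quick sanity check, the sketch below runs the integrator against an integral with a known value; the choice of :math:`\int_0^{\pi} \sin(x)\, dx = 2` is purely illustrative.

.. code-block:: python

   # Illustrative check: ∫_0^π sin(x) dx = 2 exactly.
   np.random.seed(1)

   integral, se = monte_carlo_integrate(np.sin, 0.0, np.pi, n_samples=100000)
   print(f"Estimate:   {integral:.4f} ± {1.96 * se:.4f}")
   print("True value: 2.0000")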
Multidimensional Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Monte Carlo shines in high dimensions, where the curse of dimensionality cripples deterministic methods.

**Key Insight**: The Monte Carlo error is :math:`O(1/\sqrt{n})` independent of dimension, while deterministic quadrature error is :math:`O(n^{-k/d})`, where :math:`k` depends on the smoothness of the integrand and :math:`d` is the dimension.

.. admonition:: Example 💡: Volume of a High-Dimensional Sphere
   :class: note

   **Given**: The unit sphere in :math:`d` dimensions

   **Find**: Its volume using Monte Carlo

   **Solution**: The volume is the integral of an indicator function over the enclosing hypercube:

   .. code-block:: python

      def sphere_volume_monte_carlo(d, n_samples=100000):
          """Estimate the volume of the unit sphere in d dimensions."""
          # Generate points in the [-1, 1]^d hypercube
          points = np.random.uniform(-1, 1, (n_samples, d))

          # Check which points lie inside the unit sphere
          distances_squared = np.sum(points**2, axis=1)
          inside_sphere = distances_squared <= 1

          # Volume = volume of hypercube × fraction inside
          hypercube_volume = 2**d
          fraction_inside = np.mean(inside_sphere)
          volume_estimate = hypercube_volume * fraction_inside

          # Standard error
          std_error = hypercube_volume * np.sqrt(
              fraction_inside * (1 - fraction_inside) / n_samples
          )

          return volume_estimate, std_error

      # Compare with the analytical formula
      from scipy.special import gamma

      def true_sphere_volume(d):
          return np.pi**(d/2) / gamma(d/2 + 1)

      # Test for various dimensions
      dimensions = [2, 3, 5, 10, 20]
      for d in dimensions:
          est, se = sphere_volume_monte_carlo(d, n_samples=100000)
          true_val = true_sphere_volume(d)
          rel_error = abs(est - true_val) / true_val
          print(f"d={d:2d}: Est={est:.4f} ± {se:.4f}, "
                f"True={true_val:.4f}, RelErr={rel_error:.2%}")

   **Result**: The absolute error decays at the :math:`O(1/\sqrt{n})` rate in every dimension, unlike grid-based methods. For large :math:`d`, however, the fraction of hypercube points landing inside the sphere becomes tiny, so the *relative* error of this hit-or-miss estimator deteriorates, which is one motivation for the importance sampling technique below.

Importance Sampling Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the integrand has high variance under the sampling distribution, importance sampling can dramatically improve efficiency:

.. math::

   I = \int g(x)\, dx = \int \frac{g(x)}{q(x)} q(x)\, dx = E_q\left[\frac{g(X)}{q(X)}\right]

where :math:`q` is the importance distribution, chosen to be positive wherever :math:`g` is nonzero.

.. code-block:: python

   def importance_sampling_integrate(g, importance_sampler, importance_pdf,
                                     n_samples=10000):
       """
       Monte Carlo integration using importance sampling.

       Parameters
       ----------
       g : callable
           Function to integrate (vectorized over NumPy arrays)
       importance_sampler : callable
           Generates a single sample from the importance distribution
       importance_pdf : callable
           PDF of the importance distribution (vectorized)
       """
       samples = np.array([importance_sampler() for _ in range(n_samples)])

       # Importance weights g(x) / q(x)
       weights = g(samples) / importance_pdf(samples)

       # Discard potential infinities/NaNs (e.g. from zero-density samples)
       weights = weights[np.isfinite(weights)]

       integral = np.mean(weights)
       std_error = np.std(weights, ddof=1) / np.sqrt(len(weights))

       return integral, std_error
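To make the idea concrete, here is a small illustrative sketch. Both the target, the Gaussian tail integral :math:`\int_2^{\infty} \varphi(x)\, dx = 1 - \Phi(2) \approx 0.02275`, and the choice of importance distribution, an exponential shifted to start at :math:`x = 2`, are assumptions made for this example rather than prescriptions of the method.

.. code-block:: python

   from scipy import stats

   # Importance distribution q: Exponential(rate=1) shifted to start at 2,
   # so every draw already lies in the region that matters for the tail.
   np.random.seed(2)

   def gaussian_density(x):
       return stats.norm.pdf(x)                       # integrand φ(x)

   def q_sampler():
       return 2.0 + np.random.exponential(scale=1.0)  # one draw from q

   def q_pdf(x):
       return np.exp(-(x - 2.0)) * (x >= 2.0)         # density of q

   est, se = importance_sampling_integrate(
       gaussian_density, q_sampler, q_pdf, n_samples=50000
   )
   print(f"Importance sampling: {est:.5f} ± {1.96 * se:.5f}")
   print(f"Exact value:         {1 - stats.norm.cdf(2):.5f}")

Because the proposal concentrates its mass exactly where the integrand lives, far fewer samples are wasted than with naive sampling over a wide interval.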
Variance and Convergence Analysis
---------------------------------

Understanding the statistical properties of Monte Carlo estimators is crucial for practical applications.

Variance of Monte Carlo Estimators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For the basic estimator,

.. math::

   \operatorname{Var}[\hat{I}_n] = \frac{1}{n} \operatorname{Var}[h(X)] = \frac{\sigma^2}{n}

so the root mean square error (RMSE) decreases as :math:`O(1/\sqrt{n})`:

.. math::

   \text{RMSE} = \sqrt{E[(\hat{I}_n - I)^2]} = \frac{\sigma}{\sqrt{n}}

Confidence Intervals
~~~~~~~~~~~~~~~~~~~~

Using the CLT, approximate confidence intervals are

.. math::

   \hat{I}_n \pm z_{\alpha/2} \cdot \frac{\hat{\sigma}}{\sqrt{n}}

where :math:`z_{\alpha/2}` is the standard normal quantile and :math:`\hat{\sigma}` is the sample standard deviation.

.. code-block:: python

   def monte_carlo_with_confidence(h_func, sampler, n_samples=10000,
                                   confidence=0.95):
       """
       Monte Carlo estimation with confidence intervals.

       Returns
       -------
       dict
           Estimate, standard error, and confidence interval
       """
       from scipy import stats

       # Generate samples and evaluate h
       samples = np.array([h_func(sampler()) for _ in range(n_samples)])

       # Point estimate
       estimate = np.mean(samples)

       # Standard error
       std_error = np.std(samples, ddof=1) / np.sqrt(n_samples)

       # Confidence interval
       z_score = stats.norm.ppf((1 + confidence) / 2)
       margin_error = z_score * std_error
       ci_lower = estimate - margin_error
       ci_upper = estimate + margin_error

       return {
           'estimate': estimate,
           'std_error': std_error,
           'confidence_interval': (ci_lower, ci_upper),
           'margin_of_error': margin_error,
           'effective_sample_size': n_samples
       }

Effective Sample Size
~~~~~~~~~~~~~~~~~~~~~

When samples are correlated (e.g., draws from MCMC), the effective sample size is

.. math::

   n_{\text{eff}} = \frac{n}{1 + 2\sum_{k=1}^{\infty} \rho_k}

where :math:`\rho_k` is the autocorrelation at lag :math:`k`.

.. code-block:: python

   def effective_sample_size(samples, max_lag=None):
       """
       Compute the effective sample size for correlated samples.

       Parameters
       ----------
       samples : array
           Sample values (possibly correlated)
       max_lag : int, optional
           Maximum lag for the autocorrelation calculation
       """
       samples = np.asarray(samples)
       n = len(samples)
       if max_lag is None:
           max_lag = min(n // 4, 1000)

       # Compute autocorrelations
       mean = np.mean(samples)
       c0 = np.mean((samples - mean)**2)

       sum_autocorr = 0.0
       for k in range(1, max_lag):
           ck = np.mean((samples[:-k] - mean) * (samples[k:] - mean))
           rho_k = ck / c0

           # Stop when the autocorrelation becomes negligible
           if abs(rho_k) < 0.01:
               break
           sum_autocorr += rho_k

       # Effective sample size
       n_eff = n / (1 + 2 * sum_autocorr)
       return n_eff
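As an illustration, the sketch below applies the estimator to an AR(1) series with coefficient :math:`\phi = 0.9`; this model is an assumption chosen because its autocorrelations :math:`\rho_k = \phi^k` are known in closed form, giving a theoretical effective sample size of about :math:`n(1 - \phi)/(1 + \phi) \approx n/19`.

.. code-block:: python

   # Illustrative check on an AR(1) series x_t = 0.9 x_{t-1} + ε_t,
   # for which theory gives n_eff ≈ n (1 - 0.9) / (1 + 0.9) ≈ n / 19.
   np.random.seed(3)

   n = 50000
   phi = 0.9
   x = np.zeros(n)
   noise = np.random.normal(size=n)
   for t in range(1, n):
       x[t] = phi * x[t - 1] + noise[t]

   n_eff = effective_sample_size(x)
   print(f"Nominal sample size:   {n}")
   print(f"Effective sample size: {n_eff:.0f} "
         f"(theory: ≈ {n * (1 - phi) / (1 + phi):.0f})")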
Practical Considerations
------------------------

Implementing Monte Carlo methods in production requires attention to reproducibility, parallelization, and numerical stability.

Random Seed Management
~~~~~~~~~~~~~~~~~~~~~~

.. admonition:: Common Pitfall ⚠️
   :class: warning

   Using a single global random seed can lead to non-reproducible results in parallel computing environments.

   **Solution**: Give each worker an independent random number generator:

   .. code-block:: python

      class MonteCarloRunner:
          def __init__(self, base_seed=None):
              # Keep the base seed below 2**31 so that base_seed + job_id
              # remains a valid RandomState seed
              self.base_seed = base_seed or np.random.randint(2**31 - 1)

          def run_parallel(self, func, n_jobs, n_samples_per_job):
              """Run Monte Carlo in parallel with reproducible seeds."""
              from joblib import Parallel, delayed

              def worker(job_id):
                  # Create an independent RNG for this job
                  rng = np.random.RandomState(self.base_seed + job_id)
                  return func(n_samples_per_job, rng)

              results = Parallel(n_jobs=n_jobs)(
                  delayed(worker)(i) for i in range(n_jobs)
              )
              return self.combine_results(results)

          def combine_results(self, results):
              # Each worker uses the same number of samples, so the overall
              # estimate is the simple average of the per-job estimates
              return np.mean(results)

Adaptive Sample Size Determination
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   def adaptive_monte_carlo(h_func, sampler, target_error=0.01,
                            initial_n=1000, max_n=10**7):
       """
       Adaptively grow the sample size until a target standard error is met.
       """
       all_samples = []

       while len(all_samples) < max_n:
           # Generate a new batch
           new_samples = [h_func(sampler()) for _ in range(initial_n)]
           all_samples.extend(new_samples)

           # Current estimate and error
           estimate = np.mean(all_samples)
           std_error = np.std(all_samples, ddof=1) / np.sqrt(len(all_samples))

           # Stop once the target accuracy is achieved
           if std_error < target_error:
               break

       # Returns the last estimate even if max_n was reached first
       return estimate, std_error, len(all_samples)
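A minimal usage sketch, assuming an illustrative target :math:`E[e^X] = e - 1 \approx 1.71828` for :math:`X \sim \text{Uniform}(0, 1)` and a target standard error of 0.005:

.. code-block:: python

   # The sampler draws X ~ Uniform(0, 1); the exact answer e - 1 is used
   # only to judge the result.
   np.random.seed(4)

   est, se, n_used = adaptive_monte_carlo(
       h_func=np.exp,
       sampler=lambda: np.random.uniform(0, 1),
       target_error=0.005,
   )
   print(f"Estimate: {est:.4f} ± {1.96 * se:.4f} (exact: {np.e - 1:.4f})")
   print(f"Samples used: {n_used}")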
Bringing It All Together
------------------------

Monte Carlo methods provide a universal computational framework for solving problems involving uncertainty, integration, and optimization. Their power lies in transforming deterministic problems into stochastic simulations where the law of large numbers guarantees convergence.

.. admonition:: Key Takeaways 📝
   :class: important

   1. **Core concept**: Represent the quantity of interest as an expectation :math:`E[h(X)]` and approximate it via sampling
   2. **Computational insight**: The error decreases as :math:`O(1/\sqrt{n})`, independent of dimension
   3. **Practical application**: Essential for high-dimensional problems where deterministic methods fail
   4. **Connection**: Foundation for all simulation-based methods, including MCMC and particle filters

Exercises
~~~~~~~~~

1. **Conceptual Understanding**: Prove that the Monte Carlo estimator is unbiased and derive its variance. Show that the variance of the sample variance estimator is :math:`O(1/n)`.

2. **Implementation**: Implement Monte Carlo integration for the Gaussian integral

   .. math::

      I = \int_{-\infty}^{\infty} e^{-x^2/2}\, dx = \sqrt{2\pi}

   Use importance sampling with a Cauchy distribution and compare with naive sampling. Plot the convergence of both methods.

3. **Analysis**: The Genz test functions are standard benchmarks for multidimensional integration. Implement Monte Carlo integration for the oscillatory Genz function

   .. math::

      f(x) = \cos\left(2\pi u_1 + \sum_{i=1}^d a_i x_i\right)

   over :math:`[0, 1]^d`. Study how the variance depends on the dimension :math:`d` and the frequency parameters :math:`a_i`.

4. **Extension**: Implement a quasi-Monte Carlo method using Sobol sequences instead of pseudo-random numbers. Compare convergence rates with standard Monte Carlo for:

   a) Integration over the unit cube in various dimensions
   b) Option pricing under the Black-Scholes model

   Explain when quasi-Monte Carlo outperforms standard Monte Carlo.