.. _rejection-sampling:

Rejection Sampling
==================

When the inverse CDF method fails—either because the CDF lacks a closed form or its inverse is computationally intractable—rejection sampling offers a powerful alternative. This method, also known as the accept-reject algorithm, can generate samples from virtually any distribution by strategically accepting or rejecting candidate samples from a simpler proposal distribution.

The elegance of rejection sampling lies in its geometric interpretation: we uniformly sample points under a scaled envelope function and keep only those that fall under the target density. This approach trades computational efficiency for universality, making it indispensable in modern computational statistics, particularly in Bayesian inference and statistical physics.

.. admonition:: Road Map 🧭
   :class: important

   • **Master** the fundamental theorem of simulation and its geometric interpretation
   • **Implement** the accept-reject algorithm for univariate and multivariate distributions
   • **Optimize** the choice of proposal distribution and scaling constant
   • **Analyze** acceptance rates and computational efficiency trade-offs

Fundamental Theorem of Simulation
----------------------------------

The theoretical foundation of rejection sampling rests on a beautiful relationship between probability densities and uniform distributions over regions.

Mathematical Foundation
~~~~~~~~~~~~~~~~~~~~~~~

For any probability density function f_X on an arbitrary space, we can express it as:

.. math::

   f_X(x) = \int_0^{f_X(x)} du

This seemingly trivial identity has profound implications: f_X appears as the marginal density of the joint distribution:

.. math::

   (X, U) \sim \text{Uniform}\{(x, u) : 0 < u < f_X(x)\}

**Key Insight**: If we can sample uniformly from the region under the density curve, the x-coordinates of these points follow the target distribution.

Computational Perspective
~~~~~~~~~~~~~~~~~~~~~~~~~

The challenge is sampling uniformly from the region {(x, u) : 0 < u < f_X(x)}. The accept-reject method accomplishes this by:

1. Sampling from a larger, simpler region that encloses our target
2. Rejecting points that fall outside the target region

**Algorithm**: Basic Rejection Sampling

.. code-block:: text

   Input: Target density f(x), proposal density g(x), bound M where f(x) ≤ M·g(x)
   Output: Sample from f(x)
   
   Repeat:
      1. Generate Y ~ g(x)  [candidate sample]
      2. Generate U ~ Uniform(0, 1)
      3. If U ≤ f(Y)/(M·g(Y)):
         Accept: return X = Y
      4. Else:
         Reject: go to step 1

**Acceptance Probability**: P(accept) = 1/M

Python Implementation
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   import numpy as np
   import matplotlib.pyplot as plt
   from scipy import stats
   
   def rejection_sampling(target_pdf, proposal_sampler, proposal_pdf, M, size=1):
       """
       Generic rejection sampling implementation.
       
       Parameters
       ----------
       target_pdf : callable
           Target probability density function f(x)
       proposal_sampler : callable
           Function that generates samples from proposal g(x)
       proposal_pdf : callable
           Proposal probability density function g(x)
       M : float
           Bound constant such that f(x) ≤ M·g(x) for all x
       size : int
           Number of samples to generate
       
       Returns
       -------
       samples : array
           Accepted samples from target distribution
       acceptance_rate : float
           Empirical acceptance rate
       """
       samples = []
       n_proposed = 0
       
       while len(samples) < size:
           # Step 1: Generate candidate
           y = proposal_sampler()
           n_proposed += 1
           
           # Step 2: Generate uniform for acceptance test
           u = np.random.uniform(0, 1)
           
           # Step 3: Accept/reject test
           acceptance_prob = target_pdf(y) / (M * proposal_pdf(y))
           
           if u <= acceptance_prob:
               samples.append(y)
       
       acceptance_rate = size / n_proposed
       return np.array(samples), acceptance_rate

Classic Example: Sampling from Beta Distribution
-------------------------------------------------

Let's implement rejection sampling for Beta(α, β) when α > 1 and β > 1, where the density has a unique mode.

Beta Distribution via Rejection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Beta(α, β) distribution has PDF:

.. math::

   f_X(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}, \quad x \in (0, 1)

where B(α, β) is the beta function. We'll use Uniform(0, 1) as our proposal distribution.

.. admonition:: Example 💡: Beta(2.5, 2.5) Sampling
   :class: note

   **Given**: Target is Beta(2.5, 2.5), symmetric around 0.5
   
   **Find**: Generate samples using uniform proposal
   
   **Solution**:
   
   First, find the maximum of the target density:
   
   .. math::
      
      \text{Mode} = \frac{\alpha - 1}{\alpha + \beta - 2} = 0.5
      
      M = f_X(0.5) = \frac{0.5^{1.5} \cdot 0.5^{1.5}}{B(2.5, 2.5)} \approx 1.875
   
   Python implementation:
   
   .. code-block:: python
   
      from scipy.special import beta as beta_function
      
      def beta_rejection_sampling(alpha, beta_param, size=1000):
          """Sample from Beta distribution using rejection sampling."""
          
          # Target PDF
          def target_pdf(x):
              if 0 < x < 1:
                  return (x**(alpha-1) * (1-x)**(beta_param-1) / 
                          beta_function(alpha, beta_param))
              return 0
          
          # Proposal: Uniform(0,1)
          proposal_sampler = lambda: np.random.uniform(0, 1)
          proposal_pdf = lambda x: 1.0 if 0 < x < 1 else 0
          
          # Find bound M (use mode for symmetric case)
          mode = (alpha - 1) / (alpha + beta_param - 2)
          M = target_pdf(mode) / proposal_pdf(mode)
          
          # Generate samples
          samples, rate = rejection_sampling(
              target_pdf, proposal_sampler, proposal_pdf, M, size
          )
          
          return samples, rate, M
      
      # Generate and visualize
      np.random.seed(42)
      samples, acceptance_rate, M = beta_rejection_sampling(2.5, 2.5, 5000)
      
      print(f"Bound M: {M:.3f}")
      print(f"Acceptance rate: {acceptance_rate:.3f}")
      print(f"Theoretical rate: {1/M:.3f}")
   
   **Result**: Acceptance rate ≈ 53.3%, meaning we reject about 47% of proposals.

Visualizing the Accept-Reject Process
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   def visualize_rejection_sampling(alpha, beta_param, n_points=1000):
       """Visualize the rejection sampling process."""
       
       fig, axes = plt.subplots(1, 2, figsize=(12, 5))
       
       # Left plot: Show envelope and target
       x = np.linspace(0.001, 0.999, 1000)
       mode = (alpha - 1) / (alpha + beta_param - 2)
       beta_pdf = stats.beta(alpha, beta_param).pdf
       M = beta_pdf(mode)
       
       axes[0].plot(x, beta_pdf(x), 'b-', label='Target: Beta(α,β)', lw=2)
       axes[0].fill_between(x, 0, beta_pdf(x), alpha=0.3)
       axes[0].axhline(y=M, color='r', linestyle='--', label=f'Envelope: M={M:.2f}')
       axes[0].fill_between(x, 0, M, alpha=0.1, color='red')
       axes[0].set_xlabel('x')
       axes[0].set_ylabel('Density')
       axes[0].set_title('Target Density and Envelope')
       axes[0].legend()
       axes[0].grid(True, alpha=0.3)
       
       # Right plot: Show accepted/rejected points
       accepted_x, accepted_y = [], []
       rejected_x, rejected_y = [], []
       
       for _ in range(n_points):
           x_prop = np.random.uniform(0, 1)
           y_prop = np.random.uniform(0, M)
           
           if y_prop <= beta_pdf(x_prop):
               accepted_x.append(x_prop)
               accepted_y.append(y_prop)
           else:
               rejected_x.append(x_prop)
               rejected_y.append(y_prop)
       
       axes[1].scatter(rejected_x, rejected_y, c='red', alpha=0.3, s=1, label='Rejected')
       axes[1].scatter(accepted_x, accepted_y, c='blue', alpha=0.5, s=1, label='Accepted')
       axes[1].plot(np.linspace(0, 1, 1000), 
                    beta_pdf(np.linspace(0, 1, 1000)), 
                    'black', lw=2)
       axes[1].set_xlabel('x')
       axes[1].set_ylabel('u')
       axes[1].set_title('Accept-Reject Visualization')
       axes[1].legend()
       axes[1].grid(True, alpha=0.3)
       
       plt.tight_layout()
       plt.show()
   
   # Visualize Beta(4, 2) - asymmetric case
   visualize_rejection_sampling(4, 2)

Optimizing Proposal Distributions
----------------------------------

The efficiency of rejection sampling critically depends on the choice of proposal distribution and the tightness of the bound M.

Choosing Good Proposals
~~~~~~~~~~~~~~~~~~~~~~~

The ideal proposal distribution g(x) should:
1. Be easy to sample from
2. Have a similar shape to the target f(x)
3. Have heavier tails than f(x) to ensure coverage

**Efficiency Metric**: The acceptance rate 1/M directly measures efficiency. Smaller M means higher acceptance.

Adaptive Rejection Sampling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For log-concave densities, we can construct piecewise linear envelopes that adapt during sampling:

.. code-block:: python

   class AdaptiveRejectionSampler:
       """
       Adaptive Rejection Sampling for log-concave densities.
       """
       
       def __init__(self, log_pdf, initial_points):
           """
           Initialize with log-pdf and starting abscissae.
           
           Parameters
           ----------
           log_pdf : callable
               Natural log of target density
           initial_points : array
               Initial support points for envelope
           """
           self.log_pdf = log_pdf
           self.points = np.array(initial_points)
           self.log_values = log_pdf(self.points)
           self.build_envelope()
       
       def build_envelope(self):
           """Construct piecewise linear upper envelope."""
           # Compute slopes between adjacent points
           self.slopes = np.diff(self.log_values) / np.diff(self.points)
           
           # Build piecewise exponential envelope
           # (simplified implementation)
           pass
       
       def sample(self, size=1):
           """Generate samples with adaptive envelope."""
           samples = []
           
           while len(samples) < size:
               # Sample from current envelope
               y = self.sample_from_envelope()
               
               # Accept/reject test
               u = np.random.uniform()
               if u <= np.exp(self.log_pdf(y) - self.envelope_log_pdf(y)):
                   samples.append(y)
                   
                   # Adapt: add point to envelope
                   self.add_point(y)
                   self.build_envelope()
           
           return np.array(samples)

Truncated Distributions
~~~~~~~~~~~~~~~~~~~~~~~

For sampling from truncated distributions, rejection sampling is often the method of choice:

.. code-block:: python

   def truncated_normal(mu, sigma, a, b, size=1):
       """
       Sample from truncated normal distribution on [a, b].
       
       Uses rejection sampling with uniform proposal.
       """
       from scipy.stats import norm
       
       # Find maximum of truncated normal PDF
       mode = np.clip(mu, a, b)  # Mode is at mu if in [a, b]
       max_pdf = norm.pdf(mode, mu, sigma)
       
       # Normalizing constant for truncated distribution
       Z = norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma)
       
       samples = []
       n_attempts = 0
       
       while len(samples) < size:
           # Propose from Uniform(a, b)
           x = np.random.uniform(a, b)
           n_attempts += 1
           
           # Accept with probability proportional to PDF
           u = np.random.uniform(0, max_pdf / Z)
           
           if u <= norm.pdf(x, mu, sigma) / Z:
               samples.append(x)
       
       acceptance_rate = size / n_attempts
       return np.array(samples), acceptance_rate

Multivariate Rejection Sampling
--------------------------------

Rejection sampling extends naturally to multiple dimensions, though efficiency decreases with dimensionality.

Sampling from Arbitrary 2D Densities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   def rejection_sampling_2d(target_pdf, x_bounds, y_bounds, M, size=1000):
       """
       Rejection sampling for 2D distributions.
       
       Parameters
       ----------
       target_pdf : callable
           Target PDF f(x, y)
       x_bounds, y_bounds : tuple
           Bounds (min, max) for each dimension
       M : float
           Bound on target PDF
       """
       samples = []
       
       while len(samples) < size:
           # Propose uniformly in rectangle
           x = np.random.uniform(*x_bounds)
           y = np.random.uniform(*y_bounds)
           
           # Height for acceptance
           u = np.random.uniform(0, M)
           
           # Accept if under surface
           if u <= target_pdf(x, y):
               samples.append([x, y])
       
       return np.array(samples)
   
   # Example: Sample from 2D mixture of Gaussians
   def mixture_2d_pdf(x, y):
       """Bimodal 2D distribution."""
       from scipy.stats import multivariate_normal
       
       # Two components
       mv1 = multivariate_normal([-1, -1], [[0.5, 0.2], [0.2, 0.5]])
       mv2 = multivariate_normal([1, 1], [[0.5, -0.2], [-0.2, 0.5]])
       
       return 0.5 * mv1.pdf([x, y]) + 0.5 * mv2.pdf([x, y])
   
   # Generate samples
   samples_2d = rejection_sampling_2d(
       mixture_2d_pdf, 
       x_bounds=(-4, 4), 
       y_bounds=(-4, 4),
       M=0.2,  # Upper bound on density
       size=2000
   )

Practical Considerations
------------------------

Implementing rejection sampling in production requires careful attention to numerical stability and efficiency.

Numerical Stability Issues
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. admonition:: Common Pitfall ⚠️
   :class: warning

   Computing acceptance probabilities f(x)/(M·g(x)) can cause numerical overflow or underflow, especially with extreme parameter values.
   
   **Solution**: Work in log space when possible:
   
   .. code-block:: python
   
      log_accept_prob = log_f(x) - (log(M) + log_g(x))
      if np.log(u) <= log_accept_prob:
          accept

Curse of Dimensionality
~~~~~~~~~~~~~~~~~~~~~~~~

Rejection sampling efficiency decreases exponentially with dimension. For a d-dimensional hypercube with acceptance region volume ε:

.. math::

   \text{Acceptance Rate} \approx \epsilon^d

**Example**: Sampling from a d-dimensional sphere using a cube envelope:

.. code-block:: python

   def sphere_volume_ratio(d):
       """Volume ratio of d-sphere to d-cube."""
       from scipy.special import gamma
       # Volume of unit d-sphere / Volume of [-1,1]^d cube
       sphere_vol = np.pi**(d/2) / gamma(d/2 + 1)
       cube_vol = 2**d
       return sphere_vol / cube_vol
   
   # Show exponential decay
   dimensions = range(1, 21)
   ratios = [sphere_volume_ratio(d) for d in dimensions]
   
   plt.semilogy(dimensions, ratios, 'o-')
   plt.xlabel('Dimension')
   plt.ylabel('Acceptance Rate')
   plt.title('Curse of Dimensionality in Rejection Sampling')
   plt.grid(True)

Monitoring and Diagnostics
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   class RejectionSamplerWithDiagnostics:
       """Rejection sampler with performance monitoring."""
       
       def __init__(self, target_pdf, proposal_sampler, proposal_pdf, M):
           self.target_pdf = target_pdf
           self.proposal_sampler = proposal_sampler
           self.proposal_pdf = proposal_pdf
           self.M = M
           
           # Diagnostics
           self.n_proposed = 0
           self.n_accepted = 0
           self.acceptance_history = []
       
       def sample(self, size=1, batch_size=100):
           """Sample with batch monitoring."""
           samples = []
           
           while len(samples) < size:
               batch_accepted = 0
               
               for _ in range(batch_size):
                   y = self.proposal_sampler()
                   self.n_proposed += 1
                   
                   u = np.random.uniform()
                   accept_prob = self.target_pdf(y) / (self.M * self.proposal_pdf(y))
                   
                   if u <= accept_prob:
                       samples.append(y)
                       self.n_accepted += 1
                       batch_accepted += 1
                       
                       if len(samples) >= size:
                           break
               
               # Record batch acceptance rate
               self.acceptance_history.append(batch_accepted / batch_size)
           
           return np.array(samples[:size])
       
       def get_diagnostics(self):
           """Return sampling diagnostics."""
           return {
               'overall_acceptance_rate': self.n_accepted / self.n_proposed,
               'theoretical_rate': 1 / self.M,
               'efficiency': (self.n_accepted / self.n_proposed) * self.M,
               'acceptance_history': self.acceptance_history
           }

Bringing It All Together
-------------------------

Rejection sampling transforms the problem of sampling from complex distributions into a geometric task of uniformly sampling regions. While less efficient than the inverse CDF method when available, its universality makes it essential for modern computational statistics.

.. admonition:: Key Takeaways 📝
   :class: important

   1. **Core concept**: Sample uniformly under the density curve by accepting/rejecting points under an envelope
   2. **Computational insight**: Trade efficiency (acceptance rate = 1/M) for universality
   3. **Practical application**: Essential for truncated distributions and when inverse CDF is unavailable
   4. **Connection**: Foundation for more sophisticated methods like importance sampling and MCMC

Exercises
~~~~~~~~~

1. **Conceptual Understanding**: Prove that the accept-reject algorithm produces samples from the target distribution. Start with the conditional probability P(Y ∈ dy | Y accepted).

2. **Implementation**: Implement rejection sampling for the truncated exponential distribution on [a, b]:
   
   .. code-block:: python
   
      def truncated_exponential(lam, a, b, size=1000):
          """
          Sample from exponential distribution truncated to [a, b].
          Compare two proposal distributions:
          1. Uniform(a, b)
          2. Exponential(lam) with rejection of values outside [a, b]
          """
          # Your implementation here
          pass
   
   Compare the efficiency of both approaches for different values of a and b.

3. **Analysis**: The von Mises distribution (circular normal) has PDF:
   
   .. math::
      
      f(\theta) = \frac{e^{\kappa \cos(\theta - \mu)}}{2\pi I_0(\kappa)}, \quad \theta \in [0, 2\pi]
   
   where I_0 is the modified Bessel function. Implement rejection sampling using:
   a) Uniform proposal on [0, 2π]
   b) Wrapped Cauchy proposal
   
   Plot acceptance rates as a function of concentration parameter κ.

4. **Extension**: Implement the "squeeze" technique that adds a lower bound to avoid evaluating the target PDF when possible:
   
   .. math::
      
      s(x) \leq f(x) \leq M \cdot g(x)
   
   Apply this to Beta(5, 2) sampling and measure the reduction in target PDF evaluations.