Chapter 5: Bayesian Inference

The Bayesian approach to inference treats probability as a measure of uncertainty—about parameters, hypotheses, and predictions. Rather than asking “What would happen if we repeated this experiment?” (the frequentist question), Bayesians ask “What should we believe given this evidence?” The answer comes through Bayes’ theorem: prior beliefs, updated by data through the likelihood, yield posterior beliefs. This simple formula—known for over 250 years—generates a complete framework for learning from data.
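
To make the update concrete, here is a minimal numeric sketch of Bayes’ theorem for a binary hypothesis; the diagnostic-test numbers are purely illustrative:

```python
def posterior(prior, like_h, like_not_h):
    """P(H | data) via Bayes' theorem for a binary hypothesis H."""
    evidence = prior * like_h + (1 - prior) * like_not_h  # normalizing constant
    return prior * like_h / evidence

# Hypothetical diagnostic test: 1% base rate, 95% sensitivity, 5% false-positive rate.
p = posterior(prior=0.01, like_h=0.95, like_not_h=0.05)
print(round(p, 3))  # posterior probability of disease given one positive test (about 0.16)
```

Even a highly accurate test yields a modest posterior when the prior is small: the prior matters, and the posterior is where prior and likelihood meet.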

This philosophical shift has profound practical consequences. Parameters receive probability distributions, not just point estimates. We can make direct probability statements (“There’s a 95% probability θ lies in this interval”) rather than indirect ones (“This procedure captures θ 95% of the time in repeated sampling”). Prior information—from previous studies, expert knowledge, or physical constraints—enters the analysis formally rather than being ignored or hidden. And we can update beliefs sequentially as new data arrive, with today’s posterior becoming tomorrow’s prior.

The challenge is computational. While Bayes’ theorem is simple to state, the posterior distribution it defines is often intractable: the normalizing constant requires integrating the likelihood times the prior over all parameter values. For decades, Bayesian methods were largely limited to conjugate priors, where algebra yields analytical posteriors. The breakthrough came with Markov chain Monte Carlo (MCMC): rather than computing the posterior analytically, we construct a Markov chain whose stationary distribution is the posterior and treat the chain’s states as samples from it. This chapter develops the complete Bayesian toolkit: prior specification, conjugate models, and credible intervals; the Markov chain theory that underlies MCMC; the Metropolis-Hastings and Gibbs algorithms that implement it; convergence diagnostics that verify the sampler has worked; model comparison methods that guide model selection; and hierarchical models that let groups borrow strength from one another.

Learning Objectives: Upon completing this chapter, you will be able to:

Bayesian Foundations

  • Articulate the Bayesian interpretation of probability and contrast it with frequentist reasoning

  • Apply Bayes’ theorem to update prior beliefs given observed data

  • Explain the roles of prior, likelihood, and posterior in Bayesian inference

  • Distinguish subjective, objective, and empirical Bayes approaches

Prior Specification

  • Specify conjugate priors for exponential family likelihoods

  • Design weakly informative priors that regularize without dominating

  • Encode substantive knowledge in informative priors

  • Assess prior sensitivity through systematic variation

Posterior Computation

  • Derive analytical posteriors for conjugate models (Beta-Binomial, Normal-Normal, Gamma-Poisson)

  • Implement grid approximation for low-dimensional posteriors

  • Compute posterior summaries: means, modes, variances, and quantiles
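
As one instance of these objectives, the Beta-Binomial conjugate update can be sketched as follows (toy data, standard library only), with a grid approximation included as a numerical cross-check:

```python
a, b = 2.0, 2.0        # illustrative Beta(2, 2) prior on the success probability
k, n = 7, 10           # hypothetical data: 7 successes in 10 trials

# Conjugacy: Beta prior + binomial likelihood -> Beta(a + k, b + n - k) posterior
post_a, post_b = a + k, b + n - k
analytic_mean = post_a / (post_a + post_b)

# Grid approximation: evaluate the unnormalized posterior on a fine grid of theta values
G = 10_000
grid = [(i + 0.5) / G for i in range(G)]
unnorm = [t**(post_a - 1) * (1 - t)**(post_b - 1) for t in grid]
Z = sum(unnorm)  # discrete stand-in for the normalizing integral
grid_mean = sum(t * w for t, w in zip(grid, unnorm)) / Z

print(round(analytic_mean, 4), round(grid_mean, 4))  # the two means should agree closely
```

The grid method generalizes to any low-dimensional posterior; conjugacy is the special case where the integral is available in closed form.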

Credible Intervals

  • Construct equal-tailed and highest posterior density (HPD) credible intervals

  • Interpret credible intervals as direct probability statements about parameters

  • Contrast Bayesian credible intervals with frequentist confidence intervals
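
An equal-tailed credible interval can be read directly off a posterior, here approximated on a grid for a hypothetical Beta(9, 5) posterior (standard library only):

```python
# Equal-tailed 95% credible interval from a grid posterior (Beta(9, 5), illustrative).
post_a, post_b = 9.0, 5.0
G = 100_000
grid = [(i + 0.5) / G for i in range(G)]
dens = [t**(post_a - 1) * (1 - t)**(post_b - 1) for t in grid]
Z = sum(dens)

def quantile(p):
    """Smallest grid point whose cumulative posterior mass reaches p."""
    cum = 0.0
    for t, d in zip(grid, dens):
        cum += d / Z
        if cum >= p:
            return t
    return grid[-1]

lo, hi = quantile(0.025), quantile(0.975)
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
# Interpretation: P(lo < theta < hi | data) = 0.95, a direct probability statement.
```

Note the contrast with a confidence interval: here the probability attaches to the parameter given this data, not to the procedure over repeated samples.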

Markov Chain Theory

  • Define Markov chains through states, transition kernels, and the Markov property

  • Derive stationary distributions and verify detailed balance

  • State ergodic theorems establishing MCMC convergence

  • Explain mixing, burn-in, and autocorrelation in Markov chains
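
These definitions can be checked numerically on a toy two-state chain (transition matrix and candidate stationary distribution are illustrative choices):

```python
P = [[0.9, 0.1],     # P[i][j] = probability of moving from state i to state j
     [0.2, 0.8]]
pi = [2 / 3, 1 / 3]  # candidate stationary distribution

# Stationarity: pi P = pi
pi_next = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]
assert all(abs(a - b) < 1e-12 for a, b in zip(pi, pi_next))

# Detailed balance: pi_i P_ij = pi_j P_ji for all i, j (implies stationarity)
assert abs(pi[0] * P[0][1] - pi[1] * P[1][0]) < 1e-12
print("pi is stationary and the chain is reversible")
```

Detailed balance is the stronger property: it is exactly what MCMC constructions such as Metropolis-Hastings engineer so that the posterior is guaranteed to be stationary.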

MCMC Algorithms

  • Implement the Metropolis-Hastings algorithm with symmetric and asymmetric proposals

  • Design proposal distributions that balance acceptance rate and mixing

  • Implement Gibbs sampling by cycling through full conditional distributions

  • Compare Metropolis-Hastings, Gibbs, and hybrid strategies
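
A minimal random-walk Metropolis sampler looks like the sketch below; it targets a standard normal with an illustrative step size, and the symmetric proposal means the acceptance ratio reduces to a ratio of target densities:

```python
import math
import random

def log_target(x):
    return -0.5 * x * x  # log density of N(0, 1), up to an additive constant

def metropolis(n_samples, step=1.0, seed=0):
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n_samples):
        prop = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        # Accept with probability min(1, target(prop) / target(x)), done in log space
        if math.log(rng.random()) < log_target(prop) - log_target(x):
            x = prop
        chain.append(x)  # on rejection, the current state is repeated
    return chain

chain = metropolis(50_000, step=2.0)
kept = chain[5_000:]  # discard burn-in
mean = sum(kept) / len(kept)
var = sum((v - mean) ** 2 for v in kept) / len(kept)
print(round(mean, 2), round(var, 2))  # should be near 0 and 1
```

The step size governs the trade-off named above: tiny steps are almost always accepted but mix slowly, while huge steps are mostly rejected and the chain stalls.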

Convergence Diagnostics

  • Assess convergence visually using trace plots and running mean plots

  • Compute autocorrelation and effective sample size

  • Apply formal diagnostics (Gelman-Rubin R-hat, Geweke test)

  • Identify and remedy convergence failures
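
The quantitative diagnostics can be sketched as follows; AR(1) series stand in for MCMC output, and the lag-1 effective sample size formula is a deliberately crude approximation:

```python
import random

def ar1_chain(n, rho, seed):
    """Simulate an AR(1) series as a stand-in for autocorrelated MCMC output."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def ess_lag1(chain):
    """Crude effective sample size from the lag-1 autocorrelation alone."""
    n = len(chain)
    mu = sum(chain) / n
    var = sum((x - mu) ** 2 for x in chain) / n
    rho1 = sum((chain[i] - mu) * (chain[i + 1] - mu) for i in range(n - 1)) / (n * var)
    return n * (1 - rho1) / (1 + rho1)

def gelman_rubin(chains):
    """Gelman-Rubin R-hat across m chains of equal length n."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)  # between-chain variance
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m              # within-chain variance
    var_hat = (n - 1) / n * W + B / n
    return (var_hat / W) ** 0.5

chains = [ar1_chain(5_000, rho=0.5, seed=s) for s in range(4)]
print(round(ess_lag1(chains[0])), round(gelman_rubin(chains), 3))
```

With rho = 0.5 each "chain" of 5,000 draws carries roughly 1,700 effective samples, and R-hat sits near 1.0 because all four chains explore the same distribution.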

Model Comparison

  • Compute Bayes factors for nested and non-nested model comparison

  • Apply information criteria (WAIC, LOO-CV) for predictive model selection

  • Perform posterior predictive checks to assess model adequacy

  • Interpret model comparison results with appropriate uncertainty
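
As one concrete instance of a Bayes factor, the sketch below uses the closed-form Beta-Binomial marginal likelihood to compare two hypothetical priors for the same binomial data (all numbers illustrative):

```python
from math import comb, exp, lgamma, log

# Marginal likelihood of k successes in n trials under a Beta(a, b) prior:
# m(k) = C(n, k) * B(a + k, b + n - k) / B(a, b), computed in log space for stability.

def log_beta_fn(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal(k, n, a, b):
    return log(comb(n, k)) + log_beta_fn(a + k, b + n - k) - log_beta_fn(a, b)

k, n = 7, 10
# M1: uniform Beta(1, 1) prior; M2: prior concentrated near 0.5, Beta(50, 50)
bf_12 = exp(log_marginal(k, n, 1, 1) - log_marginal(k, n, 50, 50))
print(round(bf_12, 2))  # less than 1: these data mildly favor the concentrated prior
```

Because the marginal likelihood averages the fit over the whole prior, a diffuse prior pays a penalty for mass on implausible parameter values; Bayes factors thus depend on the priors themselves, which is why sensitivity checks matter.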

Hierarchical Models

  • Specify hierarchical models for grouped or clustered data

  • Explain shrinkage and borrowing strength across groups

  • Implement MCMC for hierarchical models with multiple parameter levels

  • Assess when hierarchical structure improves inference
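
The shrinkage idea can be made concrete in the simplest Normal-Normal setting with known variances (all numbers illustrative; in a full hierarchical model the population mean and variance would themselves receive priors and be estimated):

```python
def partial_pool(y, sigma2, mu=0.0, tau2=1.0):
    """Posterior mean of one group's effect in a Normal-Normal hierarchy.

    y: observed group mean; sigma2: its sampling variance;
    mu, tau2: population mean and between-group variance (assumed known here).
    """
    w = (1 / sigma2) / (1 / sigma2 + 1 / tau2)  # precision weight on the group's own data
    return w * y + (1 - w) * mu                 # shrinks toward mu as sigma2 grows

# A precisely measured group keeps most of its own estimate...
print(partial_pool(2.0, 0.25))  # about 1.6
# ...while a noisily measured group is pulled strongly toward the population mean.
print(partial_pool(2.0, 4.0))   # about 0.4
```

Both groups report the same raw mean, yet the noisier one is shrunk far more: that differential borrowing of strength is exactly what the hierarchical structure buys.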