Part II: Frequentist Inference

What would happen if we repeated this procedure many times? The frequentist paradigm interprets probability as long-run frequency: if we could repeat an experiment infinitely many times under identical conditions, the probability of an event equals the limiting proportion of times it occurs. This deceptively simple idea—that probability lives in the repetition, not in our beliefs—generates a powerful and coherent framework for statistical inference.

The frequentist approach answers inferential questions through hypothetical repetition. A 95% confidence interval isn’t 95% likely to contain the true parameter; rather, the procedure that generated it captures the truth in 95% of repeated applications. A p-value of 0.03 doesn’t mean there’s a 3% chance the null hypothesis is true; it means that if the null were true, we’d see evidence this extreme only 3% of the time. These subtle but crucial distinctions pervade everything that follows.
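To make the repeated-sampling interpretation concrete, here is a minimal simulation sketch, assuming a normal population with illustrative values for the mean, standard deviation, sample size, and number of repetitions (all choices of this sketch, not of the chapters): build a 95% t-interval for the mean on each simulated sample and record how often the interval covers the true mean. The 95% describes the coverage fraction of the procedure, not any single interval.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 30, 10_000   # illustrative population and design
t_crit = stats.t.ppf(0.975, df=n - 1)       # two-sided 95% critical value

covered = 0
for _ in range(reps):
    sample = rng.normal(mu, sigma, size=n)
    xbar = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(n)
    covered += (xbar - t_crit * se <= mu <= xbar + t_crit * se)

print(f"empirical coverage: {covered / reps:.3f}")   # close to 0.95
```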

What makes modern frequentist inference computational is the recognition that we can often simulate the repeated experiments that define our inferential quantities. Rather than deriving sampling distributions analytically—possible only for simple cases—we generate them directly through Monte Carlo simulation, bootstrap resampling, and permutation tests. The computer becomes a laboratory for frequentist thought experiments.


The Arc of Part II

Chapter 2: Monte Carlo Simulation establishes the computational engine that powers modern frequentist inference. We begin with a philosophical puzzle: how can deterministic algorithms produce “random” numbers? The answer—pseudo-random number generators that create sequences statistically indistinguishable, for all practical purposes, from true randomness—leads us through uniform variate generation, the inverse CDF method for arbitrary distributions, and rejection sampling for cases where inversion fails. The chapter culminates with variance reduction techniques that can improve Monte Carlo efficiency by orders of magnitude.
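As a preview of those two sampling methods, here is a short sketch with illustrative parameters and seed (not examples from the chapter): the inverse CDF method turns uniforms into exponential variates, and rejection sampling with a uniform proposal draws from a Beta(2, 2) density.

```python
import numpy as np

rng = np.random.default_rng(42)

# --- Inverse CDF method: Exponential(lam) from Uniform(0, 1) ---
lam = 2.0                                  # illustrative rate parameter
u = rng.uniform(size=100_000)
x_exp = -np.log(1.0 - u) / lam             # F^{-1}(u) = -ln(1 - u) / lam
print("exponential mean:", x_exp.mean(), "vs theoretical", 1.0 / lam)

# --- Rejection sampling: Beta(2, 2), density f(x) = 6 x (1 - x) on [0, 1] ---
M = 1.5                                    # bound on f under a Uniform(0, 1) proposal
proposals = rng.uniform(size=100_000)
accept = rng.uniform(size=100_000) <= 6 * proposals * (1 - proposals) / M
x_beta = proposals[accept]
print("Beta(2, 2) mean:", x_beta.mean(), "vs theoretical", 0.5)
```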

Chapter 3: Parametric Inference develops the classical frequentist toolkit for estimation and model building. Exponential families provide the mathematical scaffolding—a unified framework encompassing normal, binomial, Poisson, and most distributions used in practice. Maximum likelihood estimation emerges as the natural way to learn parameters from data, with beautiful asymptotic properties (consistency, efficiency, normality) that justify the confidence intervals and hypothesis tests we construct. Linear models and their generalization to non-normal responses (GLMs) extend these ideas to regression. Throughout, we emphasize both elegant theory and computational reality: Newton-Raphson iteration, Fisher scoring, and numerical optimization.
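The flavor of that computational reality can be previewed with a small Fisher-scoring sketch for logistic regression, a canonical GLM; with the logistic (canonical) link, Fisher scoring coincides with Newton-Raphson. The simulated data, true coefficients, and iteration cap below are illustrative assumptions of this sketch, not examples from the chapter.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data under assumed "true" coefficients (illustrative only)
n, beta_true = 500, np.array([-0.5, 1.2])
X = np.column_stack([np.ones(n), rng.normal(size=n)])    # intercept + one covariate
p_true = 1.0 / (1.0 + np.exp(-X @ beta_true))
y = rng.binomial(1, p_true)

# Fisher scoring (= Newton-Raphson for the canonical logistic link)
beta = np.zeros(2)
for _ in range(25):
    eta = X @ beta
    p = 1.0 / (1.0 + np.exp(-eta))
    W = p * (1 - p)                                      # variance weights
    grad = X.T @ (y - p)                                 # score vector
    info = X.T @ (X * W[:, None])                        # Fisher information
    step = np.linalg.solve(info, grad)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:                     # stop once steps are negligible
        break

print("estimated:", beta, " true:", beta_true)
```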

Chapter 4: Resampling Methods completes the frequentist toolkit by showing how to estimate sampling variability without parametric assumptions. The nonparametric bootstrap treats the observed sample as a stand-in for the population, simulating sampling distributions through resampling. The parametric bootstrap leverages model assumptions when justified. The jackknife provides deterministic variance and bias estimates. These methods yield confidence intervals, hypothesis tests, and bias corrections for quantities too complex for analytical treatment.
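A minimal nonparametric bootstrap sketch, assuming an illustrative skewed sample and using a percentile interval for the median (one of several interval constructions the chapter develops):

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.lognormal(mean=0.0, sigma=1.0, size=200)   # illustrative skewed sample

B = 5_000
boot_medians = np.empty(B)
for b in range(B):
    # Resample the observed data with replacement, same size as the original sample
    resample = rng.choice(data, size=data.size, replace=True)
    boot_medians[b] = np.median(resample)

lo, hi = np.percentile(boot_medians, [2.5, 97.5])     # percentile interval
print(f"sample median: {np.median(data):.3f}, 95% percentile CI: ({lo:.3f}, {hi:.3f})")
```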

Computational Themes

Several computational motifs recur throughout Part II:

Simulation as inference. When analytical solutions are intractable, we simulate. Monte Carlo integration replaces calculus with averaging. Bootstrap resampling replaces asymptotic theory with empirical distributions. Permutation tests replace distributional assumptions with combinatorial enumeration.
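Two toy sketches of this motif, with illustrative data and parameters of my choosing: a Monte Carlo estimate of a one-dimensional integral as a sample average, and a permutation test for a difference in group means.

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte Carlo integration: estimate the integral of exp(-u^2) over [0, 1] by averaging
u = rng.uniform(size=100_000)
print("MC estimate:", np.exp(-u**2).mean())     # close to 0.7468

# Permutation test for a difference in group means (illustrative simulated groups)
x = rng.normal(0.0, 1.0, size=40)
y = rng.normal(0.4, 1.0, size=40)
observed = x.mean() - y.mean()
pooled = np.concatenate([x, y])

perm_stats = np.empty(10_000)
for i in range(perm_stats.size):
    perm = rng.permutation(pooled)              # shuffle group labels under the null
    perm_stats[i] = perm[:40].mean() - perm[40:].mean()

p_value = np.mean(np.abs(perm_stats) >= np.abs(observed))
print("two-sided permutation p-value:", p_value)
```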

The bias-variance tradeoff. Estimators balance accuracy (low bias) against stability (low variance). We see this in shrinkage estimators, cross-validation, and bootstrap bias correction—foreshadowing regularization methods in later chapters.
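A toy illustration of the tradeoff, using two variance estimators that differ only in their divisor (an assumption of this sketch, not an example from the chapters): dividing by n gives a biased but less variable estimate, dividing by n - 1 an unbiased but more variable one.

```python
import numpy as np

rng = np.random.default_rng(11)
sigma2, n, reps = 4.0, 10, 50_000          # illustrative true variance, sample size, repetitions

biased = np.empty(reps)                    # divisor n   (biased, lower variance)
unbiased = np.empty(reps)                  # divisor n-1 (unbiased, higher variance)
for r in range(reps):
    sample = rng.normal(0.0, np.sqrt(sigma2), size=n)
    biased[r] = sample.var(ddof=0)
    unbiased[r] = sample.var(ddof=1)

print("true variance:", sigma2)
print("divisor n:   mean %.3f, variance %.3f" % (biased.mean(), biased.var()))
print("divisor n-1: mean %.3f, variance %.3f" % (unbiased.mean(), unbiased.var()))
```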

Numerical stability. Theoretical formulas often fail computationally. We use log-likelihoods instead of likelihoods, Welford’s algorithm instead of textbook variance formulas, and careful conditioning to avoid catastrophic cancellation.
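Two small demonstrations with illustrative data: Welford's one-pass recurrence versus the textbook sum-of-squares formula on data with a large offset, and log-densities versus raw densities when a likelihood would underflow.

```python
import numpy as np
from scipy import stats

def welford_variance(xs):
    """One-pass, numerically stable mean and sample variance (Welford's algorithm)."""
    mean, m2, count = 0.0, 0.0, 0
    for x in xs:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)          # note: uses the *updated* mean
    return mean, m2 / (count - 1)

rng = np.random.default_rng(5)

# A large offset exposes catastrophic cancellation in the textbook formula
data = 1e9 + rng.normal(0.0, 1.0, size=100_000)        # true variance is 1.0
textbook = (np.sum(data**2) - data.size * data.mean()**2) / (data.size - 1)
print("textbook sum-of-squares formula:", textbook)    # can be wildly wrong, even negative
print("Welford:", welford_variance(data)[1])           # close to 1.0

# Log-likelihoods avoid underflow: a product of thousands of densities is 0.0 in float64
x = rng.normal(size=2_000)
print("likelihood:    ", np.prod(stats.norm.pdf(x)))    # underflows to 0.0
print("log-likelihood:", np.sum(stats.norm.logpdf(x)))  # finite and usable
```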

Vectorization and efficiency. Python’s NumPy enables fast simulation through vectorized operations. This isn’t premature optimization—it’s the difference between simulations that take seconds and those that take hours.
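A quick, machine-dependent illustration (the array size is arbitrary and exact timings will vary): summing squares with a pure-Python loop versus a single vectorized NumPy expression.

```python
import time
import numpy as np

rng = np.random.default_rng(8)
x = rng.uniform(size=2_000_000)

# Pure-Python loop: one element at a time, interpreted
t0 = time.perf_counter()
total_loop = 0.0
for xi in x:
    total_loop += xi * xi
loop_time = time.perf_counter() - t0

# Vectorized NumPy: the whole array at once, in compiled code
t0 = time.perf_counter()
total_vec = float(np.dot(x, x))
vec_time = time.perf_counter() - t0

print(f"loop: {loop_time:.3f} s   vectorized: {vec_time:.5f} s")
print("same answer:", np.isclose(total_loop, total_vec))
```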

Connections

Part I: Foundations provides the probability distributions, computational tools, and philosophical grounding that Part II builds upon. The frequentist interpretation introduced there becomes the operating framework here.

Part III: Bayesian Inference offers an alternative paradigm. Monte Carlo simulation underlies Markov chain Monte Carlo. Likelihood functions reappear as components of Bayes’ theorem. The bootstrap and posterior distributions both quantify uncertainty, though from different philosophical foundations. Understanding frequentist methods deeply makes the Bayesian alternative more meaningful—and vice versa.

Part IV: LLMs in Data Science extends the validation mindset. Cross-validation principles from Chapter 4 apply directly when evaluating LLM performance. The habit of quantifying uncertainty transfers to assessing model reliability.

Prerequisites

Part II assumes mastery of Part I material: random variables, probability distributions, expectation, the law of large numbers, and the central limit theorem. We also assume comfort with Python, NumPy, and basic calculus.

By Part II’s end, you’ll command a complete frequentist toolkit—from foundational simulation through sophisticated resampling—ready to tackle inference problems that resist analytical treatment.