Computational Methods in Data Science

This course introduces essential computational methods in modern data science, focusing on simulation, resampling, Bayesian data analysis, and the use of large language models (LLMs) in data science workflows. Students will learn foundational simulation techniques, such as generating random variables by inverting the cumulative distribution function (inverse transform sampling) and by rejection sampling. The course also provides an overview of Frequentist and Bayesian inference, highlighting their theoretical foundations and practical applications.
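As a rough illustration of the two simulation techniques named above, the following is a minimal sketch, not course material. The target distributions (Exponential with rate 2, and Beta(2, 2)), the function names, and the seed are assumptions chosen only for the example.

```python
import numpy as np


def sample_exponential_inverse_cdf(rate, size, rng):
    """Inverse transform sampling for Exponential(rate).

    The CDF is F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -ln(1 - u) / rate.
    """
    u = rng.random(size)             # U ~ Uniform[0, 1)
    return -np.log1p(-u) / rate      # log1p(-u) computes ln(1 - u) accurately


def sample_beta22_rejection(size, rng):
    """Rejection sampling for Beta(2, 2) using a Uniform(0, 1) proposal.

    Target density f(x) = 6 x (1 - x) on [0, 1]; its maximum is 1.5 at x = 0.5,
    so the envelope constant M = 1.5 satisfies f(x) <= M * 1 everywhere.
    """
    envelope = 1.5
    accepted = []
    n_accepted = 0
    while n_accepted < size:
        x = rng.random(size)         # proposals from Uniform(0, 1)
        u = rng.random(size)         # uniform heights for the accept test
        keep = x[u * envelope <= 6.0 * x * (1.0 - x)]
        accepted.append(keep)
        n_accepted += keep.size
    return np.concatenate(accepted)[:size]


rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only
exp_samples = sample_exponential_inverse_cdf(rate=2.0, size=100_000, rng=rng)
beta_samples = sample_beta22_rejection(size=100_000, rng=rng)

# Sanity check against known means: 1 / rate = 0.5 and a / (a + b) = 0.5.
exp_mean = exp_samples.mean()
beta_mean = beta_samples.mean()
```

With a uniform proposal and envelope constant 1.5, about two thirds of the proposals are accepted on average, which is why the loop may need more than one pass.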

Key resampling methods, including bootstrapping and cross-validation, will be explored as tools for assessing variability, constructing confidence intervals, and validating predictive models. Throughout, the course emphasizes the practical and responsible use of computational methods in data science pipelines. It culminates in a capstone project in which students synthesize their learning by designing and implementing comprehensive solutions to real-world data science problems.
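To make the bootstrap idea concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a sample median. The synthetic data, the helper name bootstrap_percentile_ci, the number of resamples, and the seed are all assumptions made for illustration; other interval constructions (for example, bias-corrected ones) exist and are not shown.

```python
import numpy as np


def bootstrap_percentile_ci(data, stat, n_boot, alpha, rng):
    """Percentile bootstrap interval for stat(data).

    stat must accept an `axis` keyword (as np.mean and np.median do),
    so that all resamples can be summarized in one vectorized call.
    """
    data = np.asarray(data)
    n = data.size
    # Each row holds indices for one resample of size n, drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    replicates = stat(data[idx], axis=1)
    lower, upper = np.quantile(replicates, [alpha / 2, 1 - alpha / 2])
    return lower, upper


rng = np.random.default_rng(seed=1)               # arbitrary seed
data = rng.normal(loc=10.0, scale=2.0, size=200)  # synthetic data, not real

ci_low, ci_high = bootstrap_percentile_ci(
    data, np.median, n_boot=2000, alpha=0.05, rng=rng
)
```

The interval width reflects the sampling variability of the median for this sample size; rerunning with a larger `size` should give a narrower interval.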

Instructor: Dr. Timothy Reese | Email: reese18@purdue.edu | Office: MATH 210 | Phone: 765-494-4129

Credits: 3.00 | Office Hours: TBA | Lecture Times: TBA

Prerequisites: Multivariate calculus (MA 26100), Mathematical probability (MA/STAT 41600), Python programming (CS 38003), Foundational statistical inference (STAT 35500). Students should be comfortable with NumPy, Pandas, and SciPy; multiple integrals; Bayes’ theorem; the Central Limit Theorem; and hypothesis tests and confidence intervals.

Learning Outcomes: By the end of this course, you will be able to: (1) Apply simulation techniques (Monte Carlo, transformations, rejection sampling), (2) Compare Frequentist and Bayesian inference, (3) Implement resampling methods (bootstrap, jackknife, cross-validation), (4) Construct and analyze Bayesian models via MCMC, (5) Integrate LLMs responsibly in data science workflows, (6) Synthesize methods in a capstone project addressing real-world challenges.

Assessment: Homework 40% (6-7 assignments; lowest dropped), Midterms 2×15%, Capstone 30% (proposal, progress, final). Academic integrity follows the Purdue Honor Pledge. AI tools are allowed for debugging and study; they are prohibited for generating turnkey solutions.