Homework Assignments

Overview 

Homework assignments in STAT 418 are designed to reinforce theoretical understanding through rigorous mathematical derivation and build computational fluency through Python implementation. Each assignment is worth 100 points and integrates material from the corresponding textbook chapters.

Assignments typically involve a combination of:

Hand derivations: Proving distributional identities, deriving estimator properties, computing moments and MGFs
Code implementations: Sampling algorithms, Monte Carlo estimators, statistical tests
Graphing and visualization: Distribution comparisons, convergence diagnostics, Q-Q plots, histograms
Written explanations: Interpreting results, comparing methods, explaining why certain phenomena occur

Assignment Policies

Weighting: Homework constitutes 40% of the final grade
Points: Each assignment is worth 100 points
Submissions: 6–7 assignments throughout the semester; lowest score dropped
Cadence: Approximately 2-week cycles
Late Policy: Up to 3 days late with 20% penalty; no credit after 3 days
Collaboration: Discussion encouraged; all submitted work must be your own
AI Tools: Permitted for debugging, study, converting handwritten work to LaTeX/Word, and resource discovery; prohibited for generating solutions directly; all AI assistance must be disclosed

Recommended Workflow

I strongly encourage the following workflow for completing assignments:

Step-by-Step Approach

1. Do All Hand Derivations on Paper or iPad First

Work through all mathematical derivations using pencil and paper or a tablet before touching a keyboard. This forces you to think through each step carefully and builds mathematical intuition that typing directly into LaTeX does not provide. There’s no substitute for working by hand first.

2. Use an LLM to Convert Your Handwritten Work to LaTeX or Word

Once your derivations are complete and you’re confident they’re correct, use an LLM like ChatGPT or Claude to convert your handwritten work into a nicely formatted document. Simply photograph your work and ask the LLM to typeset it in LaTeX or format it for Word. Review the output carefully—LLMs occasionally misread handwriting or introduce transcription errors.

3. Add Images and Additional Explanations

Insert any figures generated by your Python code into the appropriate locations in your document. Add written explanations, interpretations of results, and responses to conceptual questions. Make sure figures have descriptive captions.

4. Submit Your Python Code as a Separate File

Your computational work should be in a standalone .py or .ipynb file that runs independently and reproduces all results and figures in your PDF.

Submission Requirements

Each homework submission consists of two separate files:

Written Work (PDF): A single PDF containing your derivations, explanations, embedded figures, and responses to all problems. This can be created using:
- LaTeX (recommended for complex math)
- Microsoft Word or Google Docs with equation editor
- Handwritten work converted via LLM to LaTeX/Word
- Any combination of the above
Upload to Gradescope and match pages to problems.
Python Code (separate file): A single .py or .ipynb file containing all your computational work. Your code should:
- Run without errors from top to bottom
- Be organized by problem number with clear comments
- Use explicit random seeds for reproducibility
- Generate all figures that appear in your PDF

Reproducibility Requirement

All stochastic code must use explicit seeds (e.g., np.random.default_rng(42)) so that results can be exactly reproduced. When we run your code, it should generate the same figures and numerical results shown in your PDF.
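As a minimal sketch of what this requirement means in practice, the snippet below seeds a NumPy Generator and shows that two runs with the same seed produce identical results. The helper name simulate_mean and the constant SEED are illustrative assumptions, not part of any course-provided code.

```python
import numpy as np

SEED = 42  # any fixed integer works; 42 mirrors the example in the text


def simulate_mean(seed: int, n: int = 1_000) -> float:
    """Draw n standard normal samples from a freshly seeded Generator
    and return their sample mean (hypothetical helper for illustration)."""
    rng = np.random.default_rng(seed)
    return float(rng.standard_normal(n).mean())


first = simulate_mean(SEED)
second = simulate_mean(SEED)

# Same seed, same stream of draws, same numerical result
print(first == second)
```

Creating one Generator near the top of the file and passing it to the functions that need randomness tends to be easier to reproduce than reseeding in many places.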

Assignments by Chapter 

Part I: Foundations

Foundations & Simulation

Homework 1: Distributional Relationships and Computational Foundations

Covers Chapters 1.1–1.3. Students prove distributional identities using MGFs and transformation methods, then verify results computationally. Part III explores the CLT empirically, quantifies Monte Carlo error, and demonstrates proper random number generation practices.

Key topics: MGF multiplication property, Jacobian transformations, CLT convergence rates, Monte Carlo standard error, reproducibility and parallel streams.
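To illustrate two of these topics, the sketch below estimates E[X^2] for X ~ N(0, 1) (true value 1) on several independent streams and reports the Monte Carlo standard error of each estimate. The target quantity, sample sizes, and the function name mc_estimate are assumptions chosen for illustration; the assignment itself may ask for something different.

```python
import numpy as np


def mc_estimate(rng: np.random.Generator, n: int) -> tuple[float, float]:
    """Monte Carlo estimate of E[X^2], X ~ N(0, 1), with its standard error.

    The standard error is the sample standard deviation of the summands
    divided by sqrt(n).
    """
    g = rng.standard_normal(n) ** 2
    estimate = float(g.mean())
    std_error = float(g.std(ddof=1) / np.sqrt(n))
    return estimate, std_error


# Spawn statistically independent child streams from one root seed,
# rather than reusing one seed or picking ad hoc seeds per worker.
children = np.random.SeedSequence(42).spawn(4)
rngs = [np.random.default_rng(child) for child in children]

results = [mc_estimate(rng, 10_000) for rng in rngs]
for i, (estimate, std_error) in enumerate(results):
    print(f"stream {i}: estimate = {estimate:.4f}, SE = {std_error:.4f}")
```

Because Var(X^2) = 2 here, the standard error should be close to sqrt(2 / n), which shrinks at the usual 1/sqrt(n) rate as n grows.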

Tips for Success 

Mathematical Derivations

Always start on paper: I cannot emphasize this enough—do your derivations by hand first. The physical act of writing helps you think through each step carefully and catch errors early.
Start early: Proofs often require multiple attempts and fresh perspectives; don’t leave them for the night before
Define notation: Clearly state what each symbol represents before using it
Justify each step: Cite theorems, state assumptions, explain non-obvious equalities
Check special cases: Verify your result gives sensible answers for known parameter values (e.g., does your MGF equal 1 when t=0?)
Work forwards and backwards: Sometimes it helps to start from the desired result and work backwards
Use LLMs for typesetting only: Have the LLM convert your completed handwritten work to LaTeX—don’t ask it to solve the problem for you
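As one worked instance of the special-case check, consider X ~ Exponential with rate λ (an example chosen here for illustration, not taken from a specific problem). Evaluating the MGF and its first two derivatives at t = 0 should recover 1, E[X], and E[X^2]:

```latex
% Sanity check for X ~ Exponential(rate \lambda); the MGF exists for t < \lambda
\begin{align*}
M_X(t) &= \mathbb{E}\!\left[e^{tX}\right]
        = \int_0^\infty e^{tx}\,\lambda e^{-\lambda x}\,dx
        = \frac{\lambda}{\lambda - t}, \qquad t < \lambda, \\
M_X(0) &= \frac{\lambda}{\lambda} = 1, \\
M_X'(0) &= \left.\frac{\lambda}{(\lambda - t)^2}\right|_{t=0}
         = \frac{1}{\lambda} = \mathbb{E}[X], \\
M_X''(0) &= \left.\frac{2\lambda}{(\lambda - t)^3}\right|_{t=0}
          = \frac{2}{\lambda^2} = \mathbb{E}[X^2], \\
\operatorname{Var}(X) &= \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.
\end{align*}
```

If a derived MGF fails any of these checks, the error is usually in the integration limits or in the condition on t.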

Computational Verification

Read tips carefully: Each problem includes specific guidance on SciPy parameterizations and visualization approaches
Use the provided utilities: The compare_distributions() function handles common visualization patterns
Start with small samples: Debug with n=100 before scaling to n=100,000
Print intermediate results: Verify each step before combining into larger computations
Compare to theory: Always check that sample means and variances agree with their theoretical values to within Monte Carlo error
Save your figures: Use plt.savefig('problem1.png', dpi=150) to save figures for your PDF
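The sketch below puts several of these tips together for a Gamma distribution: it debugs with a small sample first, then scales up and compares sample moments to theory. The shape and rate values and the helper name check_moments are illustrative assumptions; the only SciPy behavior relied on is that scipy.stats.gamma takes a shape a and a scale, so a rate must be converted with scale = 1 / rate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Gamma(shape, rate): mean = shape / rate, variance = shape / rate**2
shape, rate = 3.0, 2.0
dist = stats.gamma(a=shape, scale=1.0 / rate)  # SciPy wants scale, not rate

theory_mean = shape / rate
theory_var = shape / rate**2


def check_moments(n: int) -> tuple[float, float]:
    """Draw n samples and return (sample mean, sample variance)."""
    x = dist.rvs(size=n, random_state=rng)
    return float(x.mean()), float(x.var(ddof=1))


# Start small while debugging, then scale up once the code runs cleanly
small_mean, small_var = check_moments(100)
large_mean, large_var = check_moments(100_000)

print(f"theory    : mean = {theory_mean:.4f}, var = {theory_var:.4f}")
print(f"n = 100   : mean = {small_mean:.4f}, var = {small_var:.4f}")
print(f"n = 100000: mean = {large_mean:.4f}, var = {large_var:.4f}")
```

The small-sample values can be noticeably off; what matters is that the large-sample values settle near the theoretical ones.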

Common Pitfalls

Watch Out For

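SciPy parameterizations: SciPy distributions generally take a scale rather than a rate (for example, scale = 1/rate for the exponential), and shape-parameter conventions can differ from the textbook's
Off-by-one errors: Geometric and Negative Binomial have multiple conventions (counting trials up to a success vs. counting failures before it)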
Numerical precision: Use np.log1p(x) instead of np.log(1+x) for small x
Random state: Forgetting to set seeds makes debugging nearly impossible
Vectorization: Python loops over samples are often 50–100× slower than vectorized NumPy operations
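Three of these pitfalls can be seen in a few lines of NumPy. This is a sketch with arbitrarily chosen values (x = 1e-17, p = 0.25, and the sample sizes are assumptions for illustration), and it deliberately makes no timing claims.

```python
import numpy as np

rng = np.random.default_rng(42)

# Numerical precision: 1 + x rounds to exactly 1.0 in double precision
# when x is far below machine epsilon, so the naive form loses x entirely.
x = 1e-17
naive = np.log(1 + x)
stable = np.log1p(x)
print(naive, stable)

# Off-by-one: NumPy's geometric counts trials up to and including the first
# success (support 1, 2, ...), with mean 1/p. Subtracting 1 gives the number
# of failures before the first success, with mean (1 - p)/p.
p = 0.25
trials = rng.geometric(p, size=200_000)
failures = trials - 1
print(trials.mean(), 1 / p)
print(failures.mean(), (1 - p) / p)

# Vectorization: the loop and the array expression compute the same sum
# of squares; the array version avoids per-element Python overhead.
u = rng.uniform(size=10_000)
loop_total = 0.0
for value in u:
    loop_total += value**2
vec_total = np.sum(u**2)
print(np.isclose(loop_total, vec_total))
```

When a sample mean is off by exactly 1 from the theoretical value, a mismatch between the trials and failures conventions is a likely cause.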