Worksheet 11: The Central Limit Theorem

Learning Objectives

  • Understand the Central Limit Theorem and its implications

  • Master the simulation process for generating sampling distributions

  • Explore how sampling distributions evolve with increasing sample size

  • Apply the CLT to compute probabilities for sample means and sums

  • Investigate the rate of convergence for different population distributions

  • Implement CLT simulations and visualizations in R

Introduction

In the previous worksheet, we studied sampling distributions by focusing on cases where the population itself was Normally distributed. We observed that when drawing random samples from a Normal population, the sample mean and sample sum remained Normally distributed regardless of sample size. However, real-world data often do not follow a Normal distribution, raising an important question: What happens to the sampling distribution of the sample mean when the population itself is not Normal?

The Central Limit Theorem (CLT) answers this question. It states that if we take independent random samples of size \(n\) from any population with finite mean \(\mu\) and finite standard deviation \(\sigma\), then as \(n\) increases, the standardized sample mean

\[Z_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\]

and the standardized sample sum

\[Z_n = \frac{S_n - n \cdot \mu}{\sigma \cdot \sqrt{n}}\]

converge “in distribution” to a standard Normal random variable:

\[Z_n \rightarrow N(0,1), \text{ as } n \rightarrow \infty\]

This implies that for sufficiently large \(n\), the sampling distributions can be approximated using the Normal distribution:

\[\bar{X} \overset{\sim}{\approx} N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\]
\[S_n \overset{\sim}{\approx} N\left(n \cdot \mu, \sigma \cdot \sqrt{n}\right)\]

Note

The rate of convergence depends on the shape of the original population. If the population is already Normal, the sample mean follows a Normal distribution for any sample size. However, for skewed or heavy-tailed populations, a larger \(n\) is required for the Normal approximation to hold.

The CLT is fundamental because it allows us to make probabilistic statements about sample means, making statistical inference possible even when the original data are not Normally distributed.

Tutorial: Generating the Sampling Distribution of \(\bar{X}\)

Before exploring the Central Limit Theorem with different distributions, you need to understand how to generate observations from the sampling distribution of \(\bar{X}\). This tutorial will guide you through the step-by-step process.

Key Concepts Review

Before beginning, ensure you understand these concepts:

  • Population: The entire group from which we sample, characterized by distribution parameters

  • Sample: A subset of size \(n\) drawn from the population

  • Sampling Distribution: The probability distribution of a statistic (like \(\bar{X}\)) across all possible samples

  • Central Limit Theorem: States that \(\bar{X}\) approaches a Normal distribution as \(n\) increases

The Simulation Process Overview

To generate the sampling distribution of \(\bar{X}\), we need to:

  1. Take many samples of size \(n\) from the population

  2. Calculate \(\bar{x}\) for each sample

  3. Examine the distribution of these sample means

Note

We denote SRS = 1500 as the number of repeated samples we’ll take. Each sample contains \(n\) observations.

Step-by-Step Example: Standard Normal Distribution

Let’s assume \(X \sim \text{Normal}(\mu=0, \sigma=1)\) and we want to explore the sampling distribution of \(\bar{X}\) for sample size \(n=10\).

Step 1: Define the Parameters

First, we define how many samples we’ll take and the size of each sample.

# Set parameters
SRS <- 1500  # Number of repeated samples (do not change)
n <- 10      # Sample size for each sample

cat("We will take", SRS, "samples, each of size", n, "\n")
cat("Total observations needed:", SRS * n, "\n")

Step 2: Generate All Required Observations

For each of the 1500 observations of \(\bar{X}\), we must first obtain \(n\) observations from \(X\) and take their mean. Thus, we need \(\text{SRS} \times n\) total observations from \(X\).

# Generate all observations at once
data.vec <- rnorm(SRS * n, mean = 0, sd = 1)

# Check the length
cat("Length of data.vec:", length(data.vec), "\n")

What we have now:

data.vec = [X₁, X₂, X₃, ..., X₁₅₀₀₀]

This is a single vector containing SRS × n = 15,000 random observations

Step 3: Organize Data into Samples

We now split data.vec into SRS = 1500 rows, each of size \(n = 10\). Each row will represent one sample.

# Reorganize into matrix where each row is one sample
data.mat <- matrix(data.vec, nrow = SRS)

# Check dimensions
cat("Dimensions of data.mat:", dim(data.mat), "\n")
cat("Number of rows (samples):", nrow(data.mat), "\n")
cat("Number of columns (n):", ncol(data.mat), "\n")

What we have now:

data.mat:

                 n observations
           ┌──────────────────────────────┐
     Row 1 │ X₁,₁   X₁,₂   ...   X₁,₁₀    │ ← Sample 1
     Row 2 │ X₂,₁   X₂,₂   ...   X₂,₁₀    │ ← Sample 2
SRS   ...  │          ...                 │
     Row   │                              │
     1500  │ X₁₅₀₀,₁ X₁₅₀₀,₂ ... X₁₅₀₀,₁₀ │ ← Sample 1500
           └──────────────────────────────┘

Understanding the matrix() Function

The matrix() function takes a vector and reorganizes it into a matrix:

  • Primary input: A vector of data

  • nrow: Number of rows (samples)

  • ncol: Number of columns (can be specified instead of nrow)

  • Behavior: Fills the matrix column-by-column by default

When the vector length is a multiple of nrow, specifying nrow automatically determines ncol.
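A minimal example of this behavior (illustrative only, not part of the worksheet's simulation):

```r
# matrix() fills column-by-column by default
v <- 1:6
m <- matrix(v, nrow = 2)
m[1, ]  # first row: 1 3 5
m[2, ]  # second row: 2 4 6

# byrow = TRUE fills row-by-row instead
m2 <- matrix(v, nrow = 2, byrow = TRUE)
m2[1, ]  # first row: 1 2 3
```

Because the simulated observations are independent and identically distributed, it does not matter statistically whether the matrix is filled by row or by column; each row is still a valid random sample.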

Step 4: Calculate Sample Means

Now we calculate the mean of each row (each sample) to get our 1500 sample means.

# Calculate mean of each row
avg <- apply(data.mat, MARGIN = 1, FUN = mean)

# Check the result
cat("Length of avg:", length(avg), "\n")
cat("First 15 sample means:\n")
print(head(avg, 15))

What we have now:

avg = [x̄₁, x̄₂, x̄₃, ..., x̄₁₅₀₀]

A vector of 1500 sample means, one from each sample

Understanding the apply() Function

The apply() function applies a function to rows or columns of a matrix:

  • Primary input: A matrix

  • MARGIN = 1: Apply function to each row

  • MARGIN = 2: Apply function to each column

  • FUN: The function to apply (e.g., mean, sum, var)

Result: A vector where each element is the function applied to one row (or column)
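A small illustration of the MARGIN argument (values chosen arbitrarily):

```r
m <- matrix(1:6, nrow = 2)  # rows: (1, 3, 5) and (2, 4, 6)

apply(m, MARGIN = 1, FUN = sum)  # row sums: 9 12
apply(m, MARGIN = 2, FUN = sum)  # column sums: 3 7 11
```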

Step 5: Calculate Experimental Statistics

Now we can calculate the mean and standard deviation of our 1500 sample means and compare them to theoretical values.

# Calculate experimental mean and SD of the sample means
mean_of_sample_means <- mean(avg)
sd_of_sample_means <- sd(avg)

cat("Experimental mean of sample means:", round(mean_of_sample_means, 4), "\n")
cat("Experimental SD of sample means:", round(sd_of_sample_means, 4), "\n")

# Compare with theoretical values
theoretical_mean <- 0  # E[X̄] = μ
theoretical_sd <- 1/sqrt(n)  # SD(X̄) = σ/√n

cat("\nTheoretical mean:", theoretical_mean, "\n")
cat("Theoretical SD:", round(theoretical_sd, 4), "\n")

cat("\nDifference in means:", abs(mean_of_sample_means - theoretical_mean), "\n")
cat("Difference in SDs:", abs(sd_of_sample_means - theoretical_sd), "\n")

Note

Key Theoretical Relationships:

  • Expected value: \(E[\bar{X}] = E[X] = \mu\)

  • Variance: \(\text{Var}(\bar{X}) = \text{Var}(X)/n = \sigma^2/n\)

  • Standard deviation: \(\text{SD}(\bar{X}) = \sigma/\sqrt{n}\)

Step 6: Visualize the Sampling Distribution

Create a histogram with overlaid density curves to visualize how well the sampling distribution matches a Normal distribution.

library(ggplot2)

# Create histogram with density overlay
title <- paste("Sampling Distribution of X̄ (n =", n, ")")

ggplot(data.frame(avg = avg), aes(x = avg)) +
  geom_histogram(aes(y = after_stat(density)),
                 bins = 30, fill = "grey", col = "black") +
  geom_density(col = "red", linewidth = 1) +  # Empirical density
  stat_function(fun = dnorm,
                args = list(mean = mean_of_sample_means,
                            sd = sd_of_sample_means),
                col = "blue", linewidth = 1) +  # Normal overlay
  ggtitle(title) +
  xlab("Sample Mean") +
  ylab("Density") +
  theme_minimal()

Step 7: Create a Q-Q Plot

A Q-Q (quantile-quantile) plot helps assess whether the data follows a Normal distribution.

# Create Q-Q plot to assess normality
ggplot(data.frame(avg = avg), aes(sample = avg)) +
  stat_qq() +
  geom_abline(slope = sd_of_sample_means,
              intercept = mean_of_sample_means,
              col = "red", linewidth = 1) +
  ggtitle(paste("Normal Q-Q Plot (n =", n, ")")) +
  xlab("Theoretical Quantiles") +
  ylab("Sample Quantiles") +
  theme_minimal()

Interpreting the Results

Histogram with Overlays:

  • Gray bars: The empirical distribution of your 1500 sample means

  • Red curve: Kernel density estimate (smooth version of histogram)

  • Blue curve: Normal distribution with experimental mean and SD

If the CLT applies well, all three should align closely.

Q-Q Plot:

  • Points: Your sample means plotted against theoretical Normal quantiles

  • Red line: Reference line for perfect normality

  • Interpretation:

    • Points following the line indicate an approximately Normal distribution

    • An S-shaped pattern indicates a skewed distribution

    • Points deviating at the ends indicate heavy or light tails

Warning

Common Mistakes:

  • Forgetting to generate SRS * n observations (not just n) for the initial data

  • Using wrong MARGIN in apply() (use 1 for row means, 2 for column means)

  • Comparing experimental statistics to wrong theoretical values

  • Not using nrow = SRS in the matrix() call

  • Using ncol instead of nrow when organizing samples

Additional Examples: Working with Different Statistics

Example: Sample Sum Instead of Sample Mean

To calculate the sample sum instead of the sample mean, change Step 4:

# Calculate sum of each row (instead of mean)
sample_sums <- apply(data.mat, MARGIN = 1, FUN = sum)

# Theoretical parameters for the sample sum (here μ = 0, σ = 1)
theoretical_mean_sum <- n * 0      # n × μ
theoretical_sd_sum <- 1 * sqrt(n)  # σ × √n

Example: Sample Variance

To calculate the sample variance for each sample:

# Calculate variance of each row
sample_vars <- apply(data.mat, MARGIN = 1, FUN = var)

# Note: Theoretical parameters for variance are more complex
# E[S²] = σ², but the distribution converges more slowly

Example: Different Distribution Functions

For different population distributions, only Step 2 changes:

# Uniform distribution
data.vec <- runif(SRS * n, min = 0, max = 1)

# Gamma distribution
data.vec <- rgamma(SRS * n, shape = alpha, scale = beta)

# Exponential distribution
data.vec <- rexp(SRS * n, rate = lambda)

# Poisson distribution
data.vec <- rpois(SRS * n, lambda = lambda)

All other steps remain the same!

Example: Theoretical Parameters for Different Distributions

For Uniform(a, b):

theoretical_mean <- (a + b) / 2
theoretical_sd <- sqrt((b - a)^2 / 12) / sqrt(n)

For Gamma(α, β):

theoretical_mean <- alpha * beta
theoretical_sd <- sqrt(alpha) * beta / sqrt(n)

For Exponential(λ):

theoretical_mean <- 1 / lambda
theoretical_sd <- (1 / lambda) / sqrt(n)

For Poisson(λ):

theoretical_mean <- lambda
theoretical_sd <- sqrt(lambda) / sqrt(n)

Example: Creating a Reusable Function

You can wrap the entire process in a function:

simulate_sampling_dist <- function(n, distribution = "norm",
                                   params = list(), n_sim = 1500) {
  # Generate data based on distribution
  if (distribution == "norm") {
    data.vec <- rnorm(n * n_sim, mean = params$mean, sd = params$sd)
  } else if (distribution == "gamma") {
    data.vec <- rgamma(n * n_sim, shape = params$alpha,
                       scale = params$beta)
  } else {
    stop("Unsupported distribution - add more options as needed")
  }

  # Organize into samples
  data.mat <- matrix(data.vec, nrow = n_sim)

  # Calculate sample means
  sample_means <- apply(data.mat, 1, mean)

  # Return results
  return(sample_means)
}

# Use the function
means <- simulate_sampling_dist(n = 10, distribution = "norm",
                                params = list(mean = 0, sd = 1))

Example: Comparing Multiple Sample Sizes

library(gridExtra)

# Create plots for different n values
n_values <- c(5, 10, 20, 50)
plot_list <- list()

for (i in seq_along(n_values)) {
  n <- n_values[i]

  # Generate sampling distribution
  data.vec <- rnorm(n * 1500, mean = 0, sd = 1)
  data.mat <- matrix(data.vec, nrow = 1500)
  sample_means <- apply(data.mat, 1, mean)

  # Create plot
  p <- ggplot(data.frame(means = sample_means), aes(x = means)) +
    geom_histogram(aes(y = after_stat(density)), bins = 30,
                  fill = "grey", col = "black") +
    stat_function(fun = dnorm,
                 args = list(mean = 0, sd = 1/sqrt(n)),
                 col = "blue", linewidth = 1) +
    ggtitle(paste("n =", n)) +
    theme_minimal()

  plot_list[[i]] <- p
}

# Display all plots
grid.arrange(grobs = plot_list, ncol = 2)

Part 1: Exploring CLT with Skewed Distributions

In this exercise, you will investigate how the Central Limit Theorem applies when sampling from a skewed population. The goal is to observe how the distributions of the sample mean and sample sum evolve as the sample size increases.

The Gamma Distribution

The Gamma distribution is defined by two parameters:

  • \(\alpha\) (shape parameter)

  • \(\beta\) (scale parameter)

If \(X \sim \text{Gamma}(\alpha, \beta)\) then:

  • Mean: \(E[X] = \alpha \cdot \beta\)

  • Variance: \(\text{Var}(X) = \alpha \cdot \beta^2\)

  • Standard Deviation: \(\sigma = \sqrt{\alpha} \cdot \beta\)

Question 1: For this experiment, use a right-skewed Gamma distribution with \(X \sim \text{Gamma}(\alpha = 0.5, \beta = 5)\).

Sampling Distribution of the Sample Mean

  1. Generate 1500 samples from \(X \sim \text{Gamma}(\alpha = 0.5, \beta = 5)\) for different sample sizes \(n\), starting with \(n = 5\).

  2. Compute the sample mean for each sample.

  3. Create histograms of the sample means for each \(n\).

  4. Overlay a Normal density curve using the CLT approximation:

\[\bar{X} \overset{\sim}{\approx} N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\]

where \(\mu = \alpha \cdot \beta\) and \(\sigma = \sqrt{\alpha} \cdot \beta\).

  5. Repeat these steps increasing \(n\) by units of 5 until you believe the approximation is reasonable.

Guidance for Part 1a

What to modify from the tutorial:

  • In Step 2, replace rnorm() with rgamma(SRS * n, shape = 0.5, scale = 5)

  • Calculate theoretical parameters.

  • Test multiple values of n (start with 5, then try 10, 15, 20, etc.)

  • For each n, create the histogram and Q-Q plot to assess normality

  • Look for when the distribution becomes symmetric and bell-shaped
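Putting these modifications together, a possible sketch for a single value of \(n\) (shown for \(n = 5\); extend to other values as you work through the exercise):

```r
SRS <- 1500
n <- 5
alpha <- 0.5
beta <- 5

# Step 2 with rgamma() instead of rnorm()
data.vec <- rgamma(SRS * n, shape = alpha, scale = beta)
data.mat <- matrix(data.vec, nrow = SRS)
avg <- apply(data.mat, 1, mean)

# Theoretical parameters for the sample mean
theoretical_mean <- alpha * beta                # mu = 2.5
theoretical_sd <- sqrt(alpha) * beta / sqrt(n)  # sigma / sqrt(n)
```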

Warning

Remember that the standard deviation of \(\bar{X}\) is \(\sigma/\sqrt{n}\), not \(\sigma\). The distribution becomes more concentrated as \(n\) increases.

  1. Answer the following questions:

    1. How does increasing \(n\) affect the shape of the sampling distribution?

    2. For which \(n\) does the sampling distribution for the sample mean appear approximately Normal?

  2. Sampling Distribution of the Sample Sum

    Repeat the above experiment but for the sampling distribution of the sample sum \(S_n = X_1 + X_2 + \cdots + X_n\).

Guidance for Part 1c

What to modify from the tutorial:

  • In Step 4, replace apply(data.mat, 1, mean) with apply(data.mat, 1, sum)

  • Calculate theoretical parameters for the sum.

  • Notice that both the mean AND standard deviation increase with \(n\) for the sum

  • Create histograms showing how the sum’s distribution changes with \(n\)
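A possible sketch of these modifications (self-contained; parameter values from Question 1, with \(n = 5\) shown):

```r
SRS <- 1500
n <- 5
alpha <- 0.5
beta <- 5

data.mat <- matrix(rgamma(SRS * n, shape = alpha, scale = beta), nrow = SRS)

# Step 4 with sum instead of mean
sums <- apply(data.mat, 1, sum)

# Theoretical parameters for the sample sum
theoretical_mean_sum <- n * alpha * beta            # n * mu
theoretical_sd_sum <- sqrt(alpha) * beta * sqrt(n)  # sigma * sqrt(n)
```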

Note

For the sample sum, the CLT tells us that \(S_n \overset{\sim}{\approx} N(n \cdot \mu, \sigma \cdot \sqrt{n})\). Notice that both the mean AND the standard deviation increase with \(n\).

  3. Answer the following questions:

    1. How does increasing \(n\) affect the shape of the sampling distribution of \(S_n\)?

    2. For which \(n\) does the sampling distribution for the sample sum appear approximately Normal?

Important Distinction

Disclaimer: The Central Limit Theorem relies on certain conditions, specifically that the population has a finite mean and finite variance. The Gamma distribution used in this exercise satisfies these conditions, allowing the sample mean and sum to eventually follow a Normal distribution as \(n\) increases.

However, for highly skewed distributions, the required \(n\) for reasonable Normal approximation can be quite large. Additionally, the CLT does not hold for distributions like the Cauchy distribution, which lacks both a finite mean and finite variance. As an exercise, try repeating this experiment using the Cauchy distribution and observe whether the sample mean and sum approach Normality as \(n\) increases.

Note

For discrete distributions, the CLT still applies, but when approximating a discrete sampling distribution with a continuous Normal distribution, a continuity correction is sometimes necessary for greater accuracy. The correction involves adjusting probability bounds by 0.5 in either direction when using the Normal approximation. While not explored in this worksheet, it’s an important consideration when applying the CLT to discrete data.
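As a quick illustration of the correction (a Binomial example, separate from this worksheet's exercises): for \(X \sim \text{Binomial}(100, 0.5)\), the Normal approximation to \(P(X \leq 55)\) improves when the bound is shifted by 0.5.

```r
# X ~ Binomial(100, 0.5): mu = np = 50, sigma = sqrt(np(1 - p)) = 5
exact <- pbinom(55, size = 100, prob = 0.5)

no_correction <- pnorm(55, mean = 50, sd = 5)
with_correction <- pnorm(55.5, mean = 50, sd = 5)  # continuity correction

# The corrected approximation is noticeably closer to the exact value
```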

Part 2: Application of the CLT

Question 2: A medical company is developing an implantable drug delivery device that gradually releases medication into a patient’s bloodstream. The company is testing how long these devices function before they fail due to mechanical wear and depletion of the drug reservoir.

Based on historical data, the time until failure (in months) follows a Gamma distribution:

\[X \sim \text{Gamma}(\alpha = 3, \beta = 6)\]

To ensure patient safety, the company wants to assess the probability that a group of implants meets certain performance standards.

  1. A study collects a random sample of \(n = 45\) implants and records the time each device functions before failure.

    1. Conduct a quick check through simulation to determine whether the Normal approximation is reasonable for the distribution of \(\bar{X}\) with \(n = 45\). (Just change your settings from your previous simulations.)

    2. Determine the approximate distribution of \(\bar{X}\) using the CLT, including the parameters of the distribution.

    3. Compute the probability that the average life span of a sample of 45 implants exceeds 19 months.

    4. Check that these results hold empirically. Obtain 5000 SRS of size \(n = 45\) from the \(\text{Gamma}(\alpha = 3, \beta = 6)\) distribution and compute 5000 sample means \(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{5000}\). Determine what proportion of the sample means exceeds 19. Was this proportion close to your answer using the CLT approximation?

Guidance for Part 2a(iv)

What to modify from the tutorial:

  • Change SRS <- 1500 to SRS <- 5000

  • Use rgamma(5000 * 45, shape = 3, scale = 6)

  • After calculating sample means, use sum(sample_means > 19) / 5000 to find the proportion

  • Compare this empirical proportion to your theoretical answer from part (iii)
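Combining the guidance above, one possible sketch (object names are illustrative):

```r
SRS <- 5000
n <- 45

data.mat <- matrix(rgamma(SRS * n, shape = 3, scale = 6), nrow = SRS)
sample_means <- apply(data.mat, 1, mean)

# Proportion of the 5000 sample means exceeding 19 months
prop_over_19 <- sum(sample_means > 19) / SRS
```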

  2. Find the 90th percentile for the sample mean failure time. That is, determine \(x_{0.9}\) such that:

    \[P(\bar{X} \leq x_{0.9}) = 0.9\]

Part 3: Beyond Mean and Sum

Question 3: In the previous exercises, you examined how the Central Limit Theorem applies to the sample mean and sample sum. In this problem, you will investigate whether the CLT also applies to the sample variance \(S^2\).

  1. Generate samples from \(X \sim \text{Gamma}(\alpha = 0.5, \beta = 5)\) for different values of \(n\).

    1. Generate 1500 samples for different sizes of \(n\) starting with \(n = 2\).

    2. Compute the sample variance \(S^2\) for each sample.

    3. Create histograms and overlay Normal density curves.

    4. Repeat these steps increasing \(n\) by units of 50 and report what you observe.

Guidance for Part 3a

What to modify from the tutorial:

  • In Step 4, replace apply(data.mat, 1, mean) with apply(data.mat, 1, var)

  • Note: Theoretical mean of \(S^2\) is \(\sigma^2 = \alpha \cdot \beta^2\)

  • Test values: \(n = 2, 50, 100, 150, 200, 250\)

  • Observe how large \(n\) must be before the distribution looks Normal

  2. Does the sample variance appear to follow a Normal distribution as \(n\) increases? How does the required \(n\) for Normality compare to that of the sample mean?

Key Insight

While the Central Limit Theorem applies to many statistics beyond the sample mean, the rate of convergence varies significantly. The sample variance requires a much larger sample size to achieve approximate normality compared to the sample mean.

This is why statistical inference for variance often uses different distributions (like the chi-square distribution) rather than relying solely on the Normal approximation from CLT.
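To see why chi-square methods are used, recall the exact result for Normal populations: \((n-1)S^2/\sigma^2 \sim \chi^2_{n-1}\). A quick simulation check (an aside, assuming a standard Normal population):

```r
SRS <- 1500
n <- 10
sigma <- 1

data.mat <- matrix(rnorm(SRS * n, mean = 0, sd = sigma), nrow = SRS)
sample_vars <- apply(data.mat, 1, var)

# Scaled sample variances should follow chi-square with n - 1 = 9 df
scaled <- (n - 1) * sample_vars / sigma^2
mean(scaled)  # should be close to 9, the mean of chi-square(9)
```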

Important: Avoid Simple Rules of Thumb

You may encounter rules like “n ≥ 30 is sufficient for CLT” or “CLT works for any n > 50.” These oversimplifications can be dangerously misleading. The required sample size depends heavily on:

  • Population shape: Highly skewed distributions (like our Gamma(0.5, 5)) require much larger n than symmetric distributions

  • The statistic being studied: Variance converges far more slowly than the mean

  • Tail behavior: Heavy-tailed distributions need larger samples

  • Practical requirements: How accurate does your Normal approximation need to be?

Better Approach: Use Domain Knowledge and Simulation

Instead of blindly applying rules of thumb:

  1. Apply domain knowledge: What do you know about the phenomenon you’re studying? Income data tends to be right-skewed. Human heights are roughly symmetric. Waiting times are often exponentially distributed. Rare events have heavy tails. Use this contextual understanding to anticipate whether you’ll need larger or smaller samples.

  2. Consider the statistic: Is it a smooth function of means (fast convergence) or does it depend on extreme values (slow convergence)?

  3. Think about your application: Does your problem require accuracy in the tails of the distribution, or just the center?

  4. When in doubt, simulate: Run the simulation process you learned in this worksheet to empirically check whether the Normal approximation is reasonable for your specific situation. The code you wrote takes minutes to run and gives empirical evidence specific to your exact problem.

Developing this judgment through domain knowledge and simulation is far more valuable than memorizing arbitrary cutoffs. Statistical practice requires thoughtful analysis, not mechanical rule-following.

Part 4: Exploring CLT Generalizability with AI Assistance (Exploration)

Question 4: In this worksheet, you’ve explored how the CLT applies to the sample mean, sample sum, and sample variance. But what about other statistics? Does the CLT apply to the sample median? The sample maximum? The range?

In this exercise, you’ll use an AI assistant (like ChatGPT, Claude, or others) to investigate the general conditions under which the CLT applies to different sample statistics. This is an independent exploration activity - you won’t submit written answers, but you should run simulations and take notes for your own understanding.

Learning to Ask Good Questions

Working effectively with AI requires learning to craft precise, well-structured prompts. This skill is increasingly important in modern statistical practice. You’ll develop a multi-stage prompt strategy to deeply understand a statistical concept.

Step 1: Initial Exploration Prompt

Start with a broad question to understand the landscape:

Prompt 1:
"I'm learning about the Central Limit Theorem. I've already studied how it
applies to sample means, sums, and variance - I know the mean converges
fastest, and variance requires much larger sample sizes.

Does the CLT apply to other sample statistics like the median, maximum,
minimum, or range? Please provide a general overview of which statistics
have CLT-like behavior and which do not."

After receiving the response, read it carefully and consider:

  • Which statistics does the AI say have CLT-like behavior?

  • Which statistics does the AI say do NOT have standard CLT behavior?

  • Are there any terms or concepts mentioned that you don’t understand?

Step 2: Clarification and Deepening Prompt

Now ask for clarification on specific points:

Prompt 2:
"You mentioned that [specific statistic from the response] has/doesn't have
CLT-like behavior. Can you explain:
1. What are the mathematical conditions required for a statistic to satisfy
   the CLT?
2. Why specifically does [the statistic] meet or fail these conditions?
3. How does its convergence rate compare to the sample mean and variance?

Please include the technical terms but also provide intuitive explanations."

Pay attention to the mathematical conditions the AI identifies and why they matter.

Step 3: Verification Through Simulation Prompt

Now ask the AI to help you verify its claims:

Prompt 3:
"Help me design an R simulation to test whether the sample median from a
Gamma(0.5, 5) distribution follows a Normal distribution as sample size
increases. I already know how to simulate sample means using the matrix
and apply() approach from my tutorial. What modifications do I need to
make to test the median, and what should I expect to see?"

Based on the AI’s response, modify your tutorial code to test the sample median:

# Your simulation code here
# Test sample sizes: n = 5, 10, 20, 50, 100, 200

# Create histograms and Q-Q plots
# Compare convergence to what you saw with mean and variance

Step 4: Critical Analysis Prompt

Finally, critically evaluate what you’ve learned:

Prompt 4:
"I ran simulations for the sample median with n = 5, 10, 20, 50, 100, 200
from a Gamma(0.5, 5) distribution. [Describe what you observed - does it
look Normal? How large does n need to be?]

How does this compare to what you predicted? Where would the median rank
in terms of convergence speed compared to the mean and variance I already
studied? Are there any nuances about the median's convergence that I
should understand?"

Consider in your notes:

  • Does the sample median satisfy the CLT based on your simulations?

  • How do you rank convergence rates: sample mean, sample variance, and sample median?

  • What sample size is needed for the median to appear approximately Normal?

  • What does “asymptotic normality” mean and how does it relate to CLT?

Step 5: Synthesis and Generalization

Prompt 5:
"Based on our discussion, can you provide a general principle: What
characteristics make a sample statistic 'well-behaved' under the CLT
(fast convergence like the mean) versus 'poorly-behaved' (slow convergence
like variance, or no convergence at all)? Use specific examples to
illustrate, and explain where median-like statistics fit in this spectrum."

Synthesize what you’ve learned:

  • What makes a statistic converge quickly to normality?

  • What makes a statistic converge slowly or not at all?

  • How did your understanding evolve across the five prompts?

  • Did the AI make any claims inconsistent with your simulations?

Critical AI Literacy

Important Reminders:

  • AI assistants can make mistakes, especially with technical details

  • Always verify AI claims through simulation, mathematical derivation, or authoritative sources

  • Different AI systems may give different answers - this doesn’t mean one is “correct”

  • The goal is not to get “the answer” from AI, but to use AI as a tool for guided exploration

  • Your simulations and mathematical reasoning are the ultimate arbiters of truth

Extension Challenge

Try this same five-step process with another statistic you haven’t explored yet:

  • Sample standard deviation

  • Sample range (max - min)

  • Sample interquartile range

  • 75th percentile

  • Sample minimum or maximum

Compare convergence rates across all statistics you’ve explored. Can you develop a general ranking from fastest to slowest convergence?

Key Takeaways

Summary

  • The Central Limit Theorem states that sample means and sums approach Normal distributions as \(n\) increases, regardless of the population distribution (provided it has finite mean and variance)

  • Simulation methodology follows four key steps: (1) Generate SRS × n observations, (2) Organize into matrix with nrow = SRS, (3) Apply mean/sum function to rows, (4) Analyze resulting sampling distribution

  • The rate of convergence depends critically on the population’s shape - symmetric distributions converge quickly, while highly skewed distributions require much larger sample sizes

  • For sample means: \(\bar{X} \approx N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\) for large \(n\)

  • For sample sums: \(S_n \approx N(n\mu, \sigma\sqrt{n})\) for large \(n\)

  • The CLT requires finite mean and variance - it fails for heavy-tailed distributions like the Cauchy distribution

  • Different statistics (mean, variance, median, range) have vastly different convergence rates - the sample mean typically converges fastest

  • Simulation is an invaluable tool for verifying when CLT approximations are appropriate before applying them to real problems

  • The CLT is the foundation of statistical inference - it justifies confidence intervals, hypothesis tests, and many other procedures