This document outlines the objectives for Exam 1, covering the essential concepts from chapters 1 through 6.
Define and demonstrate knowledge of the three branches of statistics: descriptive statistics, probability, and inferential statistics.
Define and distinguish between a population and a sample, including their respective symbols: population parameters are denoted by Greek letters; sample statistics by Latin letters.
Determine whether a listing of objects refers to a population or a sample.
Identify situations that exemplify probability or inferential statistics.
Identify data as univariate, bivariate, or multivariate.
Recognize and classify variables as categorical/qualitative or numerical/quantitative.
Describe the shape of a distribution:
Interpret histograms to describe shape and identify outliers.
Given R output, identify the statistics: mean, median, variance, standard deviation, and quartiles.
Understand and state the formulas for the sample mean, sample variance, and sample standard deviation:
\[\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i\]
\[s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2\]
\[\text{Standard deviation} = \sqrt{\text{variance}}\]
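The course output shown elsewhere comes from R, but as a quick sketch the same formulas can be checked with Python's standard `statistics` module (the data values below are made up for illustration):

```python
# Hypothetical data; statistics.variance uses the same n - 1
# denominator as the sample-variance formula above.
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

xbar = statistics.mean(data)      # (1/n) * sum of x_i
s2 = statistics.variance(data)    # sum of (x_i - xbar)^2, divided by n - 1
s = statistics.stdev(data)        # square root of the variance

print(xbar, s2, s)
```

Here the mean is 5 and the sample variance is \(32/7\), so the standard deviation is \(\sqrt{32/7} \approx 2.14\).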
Calculate the Interquartile Range (IQR) and explain quartiles in non-mathematical terms. \[\text{IQR}=Q_3-Q_1\]
Write down the five-number summary from R output and interpret modified boxplots.
Using the five-number summary, identify the inner and outer fences. \[\text{IF}_{\text{L}}=Q_1-1.5\times\text{IQR},~~\text{IF}_{\text{H}}=Q_3+1.5\times\text{IQR}\] \[\text{OF}_{\text{L}}=Q_1-3\times\text{IQR},~~\text{OF}_{\text{H}}=Q_3+3\times\text{IQR}\]
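The fence calculations can be sketched in Python as below (made-up data; note that quantile conventions differ between packages, so `method="inclusive"` is used here to match the most common linear-interpolation convention, and the values may differ slightly from R's):

```python
# Inner/outer fences and the 1.5 IQR rule on hypothetical data.
import statistics

data = [5, 7, 8, 9, 10, 11, 12, 13, 40]   # 40 is a suspect point

q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)  # inner fences
outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)  # outer fences

# Points beyond the inner fences are flagged by the 1.5 IQR rule.
outliers = [x for x in data if x < inner[0] or x > inner[1]]
print(iqr, inner, outer, outliers)
```

For this data, \(Q_1 = 8\), \(Q_3 = 12\), IQR \(= 4\), and 40 falls beyond even the outer fence.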
Identify outlier points using the 1.5 IQR rule and evaluate whether they are “real”.
Draw/complete a modified boxplot from the five-number summary and the 1.5 IQR rule.
Interpret the results of a modified boxplot or side-by-side boxplots.
Decide on the appropriate measures of location and spread for given data.
Robust vs. non-robust summaries: the median and IQR are robust to outliers; the mean and standard deviation are not.
Write down the sample space for experiments and determine disjoint events.
Understand the frequentist interpretation of probability: \[P(E) = \lim_{n \to \infty} \frac{n(E)}{n}\] where \(n(E)\) is the number of times \(E\) occurs in \(n\) repetitions of the experiment.
State and check the axioms associated with a probability space \(\Omega\): \[\text{For any event } E\subseteq\Omega,~~~~ 0 \leq P(E) \leq 1\] \[P(\Omega) = 1\] \[\text{For any event } E\subseteq\Omega,~~~~ P(E)=\sum_{\omega\in E}P(\omega)\]
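Checking the axioms for a small finite sample space can be sketched directly (the two-coin-flip pmf below is a made-up example, not from the notes):

```python
# A finite probability space: two fair coin flips.
omega = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}

# Axiom checks: each probability lies in [0, 1], and P(Omega) = 1.
assert all(0 <= p <= 1 for p in omega.values())
assert abs(sum(omega.values()) - 1) < 1e-12

# P(E) is the sum of the outcome probabilities in E,
# e.g. E = "at least one head".
E = {"HH", "HT", "TH"}
p_E = sum(omega[w] for w in E)
print(p_E)
```

Here \(P(E) = 3/4\), as expected for "at least one head" in two fair flips.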
Calculate a probability using:
Use Venn diagrams to visualize and calculate probabilities.
Calculate probabilities using probability rules:
Independence
Bayes’ Rule:
Bayes’ Rule for 2 Events: \[P(A|B) = \frac{P(B|A)P(A)}{P(B | A)P(A)+P(B | A')P(A')}\]
General Bayes’ Rule for \(n\) Events: If \(A_1, ..., A_n\) are exhaustive and mutually exclusive events, \[P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B | A_j)P(A_j)}\]
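A worked two-event Bayes calculation, using made-up diagnostic-test numbers (prior, sensitivity, and false-positive rate are all hypothetical):

```python
# Bayes' rule for two events:
# A = "has condition", B = "test is positive".
p_A = 0.01               # prior, P(A)
p_B_given_A = 0.95       # sensitivity, P(B | A)
p_B_given_notA = 0.10    # false-positive rate, P(B | A')

numerator = p_B_given_A * p_A
denominator = numerator + p_B_given_notA * (1 - p_A)
p_A_given_B = numerator / denominator   # posterior, P(A | B)
print(p_A_given_B)
```

Even with a fairly accurate test, the posterior here is only about 0.088 because the condition is rare, which is exactly the kind of interpretation these problems ask for.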
Recognize the properties of a valid probability distribution for discrete variables: \(p(x) \geq 0\) for every \(x\), and \(\sum_x p(x) = 1\).
Calculate probabilities using a probability mass function (pmf).
Calculate the mean of a discrete random variable (Expected value): \[\text{E}(X) = \mu_X = \sum x \cdot p(x)\]
CDF for a discrete r.v. and its use for interval probabilities:
\[
F_X(x)=P(X\le x)=\sum_{t\le x} p_X(t),\qquad
P(a<X\le b)=F_X(b)-F_X(a).
\]
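The CDF definition and the interval formula can be sketched for a small pmf (the pmf values below are made up):

```python
# Discrete CDF built from a hypothetical pmf.
pmf = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.3}

def F(x):
    """F(x) = P(X <= x): sum of p(t) over support points t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

# Interval probability: P(a < X <= b) = F(b) - F(a).
p = F(3) - F(1)   # P(1 < X <= 3) = p(2) + p(3)
print(p)
```

Note the strict inequality on the left: \(P(1 < X \le 3)\) excludes \(p(1)\) but includes \(p(3)\), giving \(0.2 + 0.4 = 0.6\).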
Calculate the variance and standard deviation for a discrete random variable:
Variance: \[\text{Var}(X) = \sigma_X^2 = \text{E}[(X - \mu_X)^2] = \text{E}(X^2) - [\text{E}(X)]^2\]
Standard deviation: \[\sigma_X = \sqrt{\text{Var}(X)}\]
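These discrete-variable formulas, including the computational shortcut for the variance, can be sketched as follows (hypothetical pmf):

```python
# Mean, variance (via the shortcut), and sd of a discrete r.v.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mu = sum(x * p for x, p in pmf.items())        # E(X)
ex2 = sum(x**2 * p for x, p in pmf.items())    # E(X^2)
var = ex2 - mu**2                              # Var(X) = E(X^2) - mu^2
sd = var ** 0.5                                # sigma_X
print(mu, var, sd)
```

Here \(\text{E}(X) = 1.1\), \(\text{E}(X^2) = 1.7\), so \(\text{Var}(X) = 1.7 - 1.1^2 = 0.49\) and \(\sigma_X = 0.7\).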
LOTUS For any real valued function \(g(\cdot)\) and discrete random variable \(X\) \[\text{E}[g(X)]=\sum_x g(x)p_X(x)\]
Linearity of Expectation: For any two random variables \(X\) and \(Y\), and constants \(a\) and \(b\), \[ \text{E}(aX \pm bY) = a\text{E}(X) \pm b\text{E}(Y) \]
Variance of a linear function: For any random variable \(X\) and constants \(a\neq 0\) and \(b\), \[ \text{Var}(aX + b) = a^2\text{Var}(X) \] This shows that adding a constant \(b\) to a random variable does not change its variance, while multiplying by \(a\) scales the variance by \(a^2\).
Variance of the Sum/Difference of Two Independent Random Variables: If \(X\) and \(Y\) are independent, \[ \text{Var}(X \pm Y) = \text{Var}(X) + \text{Var}(Y) \]
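The linear-function rule can be verified numerically for a small pmf (made-up distribution, with \(a = 2\) and \(b = 3\)):

```python
# Numeric check of Var(aX + b) = a^2 * Var(X).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# pmf of Y = 2X + 3: same probabilities, transformed support.
pmf_y = {2 * x + 3: p for x, p in pmf.items()}

print(var(pmf), var(pmf_y))   # the second is 2^2 = 4 times the first
```

The shift by 3 contributes nothing to the variance; only the factor of 2 matters, and it enters squared.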
For a Binomial distribution, understand when it applies (BInS criteria) and how to calculate probabilities, expected values, and variances: \[P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}\] \[\text{E}(X) = np, ~~~\sigma_X = \sqrt{np(1-p)}\]
For a Poisson distribution, recognize when it applies and how to calculate probabilities, expected values, and variances: \[P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}\] \[\text{E}(X) = \lambda, ~~\sigma_X = \sqrt{\lambda}\]
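Both pmfs can be evaluated with nothing but the standard library; the parameter values below (\(n = 10\), \(p = 0.3\), \(\lambda = 2.5\)) are made up for illustration:

```python
# Binomial and Poisson pmf calculations.
from math import comb, exp, factorial, sqrt

n, p = 10, 0.3
x = 4
p_binom = comb(n, x) * p**x * (1 - p)**(n - x)   # P(X = 4), X ~ Bin(10, 0.3)
mean_binom, sd_binom = n * p, sqrt(n * p * (1 - p))

lam = 2.5
k = 3
p_pois = exp(-lam) * lam**k / factorial(k)       # P(X = 3), X ~ Poisson(2.5)
mean_pois, sd_pois = lam, sqrt(lam)

print(p_binom, p_pois)
```

In R these would be `dbinom(4, 10, 0.3)` and `dpois(3, 2.5)` respectively.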
Determine if a function is a legitimate density function and calculate the normalization constant if necessary.
Calculate probabilities for a continuous random variable using the density function: \[P(a < X < b) = \int_a^b f(x)dx\]
Calculate and use the cumulative distribution function (CDF): \[F(x) = P(X \leq x) = \int_{-\infty}^x f(t)dt\]
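As a sketch of the integral formulas, take the textbook-style density \(f(x) = 2x\) on \([0, 1]\) (an example chosen for illustration, not from the notes), where the exact answer is \(P(a < X < b) = b^2 - a^2\):

```python
# P(a < X < b) = integral of the density from a to b.
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

p = integrate(f, 0.2, 0.7)      # P(0.2 < X < 0.7)
print(p, 0.7**2 - 0.2**2)       # numeric approximation vs exact 0.45
```

On an exam these integrals are done by hand; the numeric check just confirms that integrating the density over \((a, b)\) recovers the interval probability.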
Piecewise PDFs/CDFs: split integrals at all boundaries and carry forward accumulated area when building \(F_X\).
Percentiles and median: the \(100p\)th percentile \(x_p\) solves \(F(x_p) = p\); the median is the 50th percentile (\(p = 0.5\)).
Mean (Expected Value): \[E(X) = \mu_X = \int_{-\infty}^{\infty} x \cdot f(x)dx\]
LOTUS (continuous):
\[
E[g(X)]=\int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx.
\]
Variance (continuous) and shortcut:
\[
\operatorname{Var}(X)=\int (x-\mu)^2 f_X(x)\,dx
=E[X^2]-\mu^2.
\]
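Using the same illustrative density \(f(x) = 2x\) on \([0, 1]\) (exact values \(\mu = 2/3\) and \(\operatorname{Var}(X) = 1/18\)), the continuous mean and variance shortcut can be checked numerically:

```python
# E(X) and Var(X) for a continuous r.v. by numerical integration.
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mu = integrate(lambda x: x * f(x), 0, 1)        # E(X)
ex2 = integrate(lambda x: x**2 * f(x), 0, 1)    # E(X^2), LOTUS with g(x) = x^2
var = ex2 - mu**2                               # shortcut: E(X^2) - mu^2
print(mu, var)
```

Here \(\text{E}(X^2) = 1/2\), so the shortcut gives \(\operatorname{Var}(X) = 1/2 - (2/3)^2 = 1/18\).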
(Note: the distributions above are written in shorthand notation. You need to recognize where the pdf is 0, where the cdf is 0, and where the cdf is 1.)