This document outlines the objectives for Exam 1, covering the essential concepts from chapters 1 through 6.
Define and demonstrate knowledge of the three branches of statistics: descriptive, probability, and inferential.
Define and distinguish between a population and a sample, including their respective symbols; population parameters are denoted by Greek letters, sample statistics by Latin letters.
Determine whether a listing of objects refers to a population or a sample.
Identify situations that exemplify probability or inferential statistics.
Identify data as univariate, bivariate, or multivariate.
Recognize and classify variables as categorical/qualitative or numerical/quantitative.
Describe the shape of a distribution:
Interpret histograms to describe shape and identify outliers.
Given R output, identify the statistics: mean, median, variance, standard deviation, and quartiles.
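For reference, a minimal sketch (with a made-up data vector) reproducing the statistics that appear in standard R output:

```r
# Made-up data vector for illustration
x <- c(4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 7.2)

summary(x)    # Min, 1st Qu., Median, Mean, 3rd Qu., Max
mean(x)       # sample mean
median(x)     # sample median
var(x)        # sample variance (divides by n - 1)
sd(x)         # sample standard deviation
quantile(x)   # quartiles: 0%, 25%, 50%, 75%, 100%
```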
Understand and state the formulas for sample mean and sample variance:
\[\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i\]
\[s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2\]
\[\text{Standard deviation} = \sqrt{\text{variance}}\]
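A short sketch (same idea, made-up data) checking that R's built-in functions agree with these formulas:

```r
x <- c(4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 7.2)   # made-up data
n <- length(x)

xbar <- sum(x) / n                    # sample mean by the formula
s2   <- sum((x - xbar)^2) / (n - 1)   # sample variance by the formula

all.equal(xbar, mean(x))              # TRUE
all.equal(s2, var(x))                 # TRUE
all.equal(sqrt(s2), sd(x))            # TRUE
```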
Calculate the Interquartile Range (IQR) and explain quartiles in non-mathematical terms. \[\text{IQR}=Q_3-Q_1\]
Write down the five-number summary from R output and interpret modified boxplots.
Using the five-number summary, identify the inner and outer fences. \[\text{IF}_{\text{L}}=Q_1-1.5\times\text{IQR},~~\text{IF}_{\text{H}}=Q_3+1.5\times\text{IQR}\] \[\text{OF}_{\text{L}}=Q_1-3\times\text{IQR},~~\text{OF}_{\text{H}}=Q_3+3\times\text{IQR}\]
Identify the points flagged by the 1.5 IQR rule (those plotted individually on a modified boxplot) and evaluate whether they are “real” outliers.
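A sketch of the fence calculations and the 1.5 IQR rule in R, using a made-up vector with one suspect value; note that quantile() and fivenum() can use slightly different quartile conventions:

```r
x <- c(4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 7.2, 15.3)   # 15.3 is a suspect value

q   <- quantile(x, c(0.25, 0.75))     # Q1 and Q3
iqr <- unname(q[2] - q[1])            # same value as IQR(x)

inner <- c(q[1] - 1.5 * iqr, q[2] + 1.5 * iqr)   # inner fences
outer <- c(q[1] - 3.0 * iqr, q[2] + 3.0 * iqr)   # outer fences

x[x < inner[1] | x > inner[2]]        # points flagged by the 1.5 IQR rule
```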
Draw/complete a modified boxplot from the five-number summary and the 1.5 IQR rule.
Interpret the results of a modified boxplot or side-by-side boxplots.
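A sketch of a modified boxplot and side-by-side boxplots in base R; the data and groups are made up:

```r
x <- c(4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 7.2, 15.3)   # made-up data

boxplot(x, horizontal = TRUE)   # modified boxplot: whiskers stop at the most
                                # extreme observations inside the inner fences
boxplot.stats(x)$out            # points plotted individually beyond the whiskers

# Side-by-side boxplots for two hypothetical groups
g <- rep(c("A", "B"), each = length(x))
y <- c(x, x + 2)
boxplot(y ~ g)
```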
Decide on the appropriate measures of location and spread for given data.
Write down the sample space for experiments and determine disjoint events.
Understand the frequentist interpretation of probability: \[P(E) = \lim_{n \to \infty} \frac{n(E)}{n}\]
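A minimal simulation sketch of this idea: the running relative frequency of heads in repeated fair-coin flips settles near \(P(E) = 0.5\).

```r
set.seed(1)
flips    <- sample(c(0, 1), size = 10000, replace = TRUE)   # 1 = event E occurs
rel_freq <- cumsum(flips) / seq_along(flips)                # n(E)/n after each trial
rel_freq[c(10, 100, 1000, 10000)]                           # should drift toward 0.5
```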
State and check the axioms associated with a probability space \(\Omega\): \[\text{For any event E}\subseteq\Omega,~~~~ 0 \leq P(E) \leq 1\] \[P(\Omega) = 1\] \[\text{For any event E}\subseteq\Omega,P(E)=\sum_{\omega\in \text{E}}P(\omega)\]
Calculate a probability using:
Use Venn diagrams to visualize and calculate probabilities.
Calculate probabilities using probability rules:
Independence: \(P(A \cap B) = P(A)\,P(B)\)
Bayes’ Rule:
Bayes’ Rule for 2 Events: \[P(A|B) = \frac{P(B|A)P(A)}{P(B | A)P(A)+P(B | A')P(A')}\]
General Bayes’ Rule for \(n\) Events: If \(A_1, ..., A_n\) are exhaustive and mutually exclusive events, \[P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B | A_j)P(A_j)}\]
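A sketch of the two-event form with assumed probabilities (illustrative values, not course-given):

```r
# Assumed probabilities, for illustration only
p_A      <- 0.01    # P(A)
p_B_A    <- 0.95    # P(B | A)
p_B_notA <- 0.10    # P(B | A')

# Bayes' rule for two events
p_A_B <- (p_B_A * p_A) / (p_B_A * p_A + p_B_notA * (1 - p_A))
p_A_B               # roughly 0.088
```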
Recognize the properties of a valid probability distribution for discrete variables: \[0 \leq p(x) \leq 1 \text{ for all } x,~~~~\sum_x p(x) = 1\]
Calculate probabilities using a probability mass function (pmf).
Calculate the mean of a discrete random variable (Expected value): \[\text{E}(X) = \mu_X = \sum x \cdot p(x)\]
Calculate the variance and standard deviation for a discrete random variable:
Variance: \[\text{Var}(X) = \sigma_X^2 = \text{E}[(X - \mu_X)^2] = \text{E}(X^2) - [\text{E}(X)]^2\]
Standard deviation: \[\sigma_X = \sqrt{\text{Var}(X)}\]
LOTUS (Law of the Unconscious Statistician): For any real-valued function \(g(\cdot)\) and discrete random variable \(X\), \[\text{E}[g(X)]=\sum_x g(x)p_X(x)\]
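A sketch applying these formulas, including LOTUS with \(g(x) = x^2\), to a small made-up pmf:

```r
x  <- c(0, 1, 2, 3)
px <- c(0.1, 0.4, 0.3, 0.2)   # made-up pmf; sums to 1

mu   <- sum(x * px)           # E(X)
ex2  <- sum(x^2 * px)         # E(X^2) by LOTUS with g(x) = x^2
varx <- ex2 - mu^2            # Var(X) = E(X^2) - [E(X)]^2
sdx  <- sqrt(varx)            # standard deviation
c(mean = mu, var = varx, sd = sdx)
```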
Linearity of Expectation: For any two random variables \(X\) and \(Y\), and constants \(a\) and \(b\), \[ \text{E}(aX \pm bY) = a\text{E}(X) \pm b\text{E}(Y) \]
Variance of a Linear function: For any random variable \(X\) and constants \(a\neq 0\) and \(b\), \[ \text{Var}(aX + b) = a^2\text{Var}(X) \] This shows that adding a constant \(b\) to a random variable does not change its variance, while multiplying by \(a\) scales the variance by \(a^2\).
Variance of the Sum/Difference of Two Independent Random Variables: If \(X\) and \(Y\) are independent, \[ \text{Var}(X \pm Y) = \text{Var}(X) + \text{Var}(Y) \]
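A quick simulation sketch (hypothetical independent draws) checking these variance rules numerically:

```r
set.seed(2)
x <- rnorm(1e5, mean = 0, sd = 2)   # Var(X) = 4
y <- rpois(1e5, lambda = 3)         # Var(Y) = 3, generated independently of X

var(x + y)        # close to 4 + 3 = 7
var(x - y)        # also close to 7
var(5 * x + 2)    # close to 25 * Var(X) = 100
```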
For a Binomial distribution, understand when it applies (BInS criteria) and how to calculate probabilities, expected values, and variances: \[P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}\] \[\text{E}(X) = np, ~~~\sigma_X = \sqrt{np(1-p)}\]
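A sketch of binomial calculations in R with assumed values of \(n\) and \(p\):

```r
n <- 10; p <- 0.3                    # assumed parameters

dbinom(4, size = n, prob = p)        # P(X = 4)
pbinom(4, size = n, prob = p)        # P(X <= 4)
1 - pbinom(3, size = n, prob = p)    # P(X >= 4)

n * p                                # E(X)
sqrt(n * p * (1 - p))                # standard deviation
```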
For a Poisson distribution, recognize when it applies and how to calculate probabilities, expected values, and variances: \[P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}\] \[\text{E}(X) = \lambda, ~~\sigma_X = \sqrt{\lambda}\]
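A sketch of the corresponding Poisson calculations with an assumed \(\lambda\):

```r
lambda <- 2.5            # assumed rate

dpois(3, lambda)         # P(X = 3)
ppois(3, lambda)         # P(X <= 3)
1 - ppois(3, lambda)     # P(X > 3)

lambda                   # E(X)
sqrt(lambda)             # standard deviation
```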
Determine if a function is a legitimate density function and calculate the normalization constant if necessary.
Calculate probabilities for a continuous random variable using the density function: \[P(a < X < b) = \int_a^b f(x)dx\]
Calculate and use the cumulative distribution function (CDF): \[F(x) = P(X \leq x) = \int_{-\infty}^x f(t)dt\]
Percentiles and Median: find the (100p)th percentile by solving \(F(x) = p\) for \(x\); the median is the 50th percentile, i.e., the solution of \(F(x) = 0.5\).
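A sketch with a made-up density \(f(x) = 3x^2\) on \((0, 1)\), illustrating the normalization check, a probability, the cdf, and the median:

```r
f <- function(x) 3 * x^2                         # density on (0, 1); 0 elsewhere

integrate(f, 0, 1)$value                         # equals 1, so f is a legitimate density
integrate(f, 0.2, 0.5)$value                     # P(0.2 < X < 0.5)

cdf <- function(x) x^3                           # F(x) on (0, 1); 0 below, 1 above
uniroot(function(x) cdf(x) - 0.5, c(0, 1))$root  # median: solves F(x) = 0.5
```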
(Note: The distributions have been written in shorthand notation. You need to recognize where the pdf is 0, where the cdf is 0, and where the cdf is 1.)