.. _final-spring2026:

Final Exam — Spring 2026: Worked Solutions
===========================================

.. admonition:: Exam Information
   :class: info

   | **Course:** STAT 350 — Introduction to Statistics
   | **Semester:** Spring 2026
   | **Total Points:** 150 + 15 (Extra Credit) = 165
   | **Time Allowed:** 120 minutes
   | **Coverage:** Cumulative (Chapters 1–13); primary emphasis on Chapters 12–13, with Chapters 1–7 weighted more heavily than Chapters 8–11 among the earlier material

   .. list-table::
      :header-rows: 1
      :widths: 50 25 25

      * - Problem
        - Total Possible
        - Topic
      * - Problem 1 (True/False, 2 pts each)
        - 20
        - Linear Transformations, Independence, Poisson, Uniform, PDF Interpretation, ANOVA Methods, F-test, Regression Units, CI Duality, Normality Assumption
      * - Problem 2 (Multiple Choice, 3 pts each)
        - 18
        - Venn Diagrams, Exponential Memoryless, Conditional Normal, Bonferroni, ANOVA Assumptions, Scatter Plots/Correlation
      * - Problem 3
        - 24
        - Piecewise PDF/CDF, Variance of Transformation
      * - Problem 4
        - 26
        - Total Probability, Bayes' Rule, Binomial Distribution
      * - Problem 5
        - 40
        - One-Way ANOVA, Tukey HSD
      * - Problem 6
        - 37
        - Simple Linear Regression, LINE Assumptions, Prediction, Confidence Interval
      * - **Total**
        - **150** (+ 15 Extra Credit)
        -

   | `Exam PDF <https://treese41528.github.io/STAT350/Exam_Resources/Final/Final_Exam_SPRING_2026.pdf>`__
   | `Solution Key PDF <https://treese41528.github.io/STAT350/Exam_Resources/Final/Final_Exam_SPRING_2026_Solution.pdf>`__

----

Problem 1 — True/False (20 points, 2 pts each)
-----------------------------------------------

.. admonition:: Question 1.1  (2 pts)
   :class: note

   A sensor records temperatures in Celsius. A data analyst converts every observation to Fahrenheit using :math:`F = 1.8C + 32`.

   The sample standard deviation of the Fahrenheit data is exactly 1.8 times the sample standard deviation of the Celsius data.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      For a linear transformation :math:`Y = aX + b`, the standard deviation transforms as :math:`s_Y = |a|\,s_X`. Here :math:`a = 1.8` and :math:`b = 32`, so :math:`s_F = 1.8\,s_C`. The additive constant does not affect spread.
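
      As a quick numerical check, the sketch below applies the conversion to a few made-up Celsius readings (the values are hypothetical, chosen only for illustration) and compares the two standard deviations.

      .. code-block:: r

         # Hypothetical Celsius readings, for illustration only
         celsius <- c(18.5, 21.0, 19.2, 23.4, 20.1)
         fahrenheit <- 1.8 * celsius + 32

         # The ratio of standard deviations should equal |a| = 1.8
         sd_ratio <- sd(fahrenheit) / sd(celsius)
         all.equal(sd_ratio, 1.8)
         # [1] TRUE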

.. admonition:: Question 1.2  (2 pts)
   :class: note

   A mechanical engineer classifies each manufactured part as exactly one of three grades: :math:`A` (premium), :math:`B` (standard), or :math:`C` (substandard). The historical probabilities of classification into each grade are :math:`P(A) = 0.3`, :math:`P(B) = 0.6`, and :math:`P(C) = 0.1`.

   Events :math:`A` and :math:`C` are **dependent**.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      Since each part receives exactly one grade, :math:`A` and :math:`C` are mutually exclusive: :math:`P(A \cap C) = 0`. For independence we would need :math:`P(A \cap C) = P(A)\,P(C) = 0.3 \times 0.1 = 0.03 \neq 0`. Because the product rule fails, :math:`A` and :math:`C` are dependent.

.. admonition:: Question 1.3  (2 pts)
   :class: note

   During an NFL season, a sports analytics team tracks all reported injuries sustained by a single team per game quarter, including those not immediately apparent to viewers such as minor strains and aggravations recorded on the official injury report. It has been historically observed that the team averages approximately 1.4 reported injuries per quarter across all games. However, detailed records reveal that injury rates in the 4th quarter are consistently higher than in the 1st quarter, as cumulative fatigue increases injury risk throughout a game.

   A single :math:`\text{Poisson}(\lambda = 1.4)` model applied uniformly across all four quarters would **violate** an assumption of the Poisson process.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      A Poisson process requires a **constant rate** :math:`\lambda` over the interval of interest. If the injury rate is higher in the 4th quarter than in the 1st quarter, the rate is not constant across quarters. Applying a single :math:`\lambda = 1.4` uniformly ignores this non-constant rate and violates the homogeneity assumption.

.. admonition:: Question 1.4  (2 pts)
   :class: note

   Two CNC machines produce bolts whose diameters (in mm) follow continuous uniform distributions. Machine A produces bolts with diameters following :math:`\text{Uniform}(9.5, 10.5)` and Machine B produces bolts with diameters following :math:`\text{Uniform}(9.8, 10.8)`.

   The probability that Machine A produces a bolt with a diameter between 9.8 and 10.0 is the same as the probability that Machine B does.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      For a :math:`\text{Uniform}(a, b)` distribution, :math:`P(c < X < d) = (d - c)/(b - a)` when :math:`[c, d] \subseteq [a, b]`.

      .. math::

         P(9.8 < X_A < 10.0) = \frac{10.0 - 9.8}{10.5 - 9.5} = \frac{0.2}{1.0} = 0.2

      .. math::

         P(9.8 < X_B < 10.0) = \frac{10.0 - 9.8}{10.8 - 9.8} = \frac{0.2}{1.0} = 0.2

      Both probabilities equal 0.2.
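
      The same two probabilities can be checked with ``punif``. This is only a verification sketch of the hand calculation; the names ``p_a`` and ``p_b`` are illustrative.

      .. code-block:: r

         # Machine A: Uniform(9.5, 10.5)
         p_a <- punif(10.0, min = 9.5, max = 10.5) - punif(9.8, min = 9.5, max = 10.5)
         # Machine B: Uniform(9.8, 10.8)
         p_b <- punif(10.0, min = 9.8, max = 10.8) - punif(9.8, min = 9.8, max = 10.8)

         # Both differences should be 0.2 up to floating-point rounding
         c(A = p_a, B = p_b)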

.. admonition:: Question 1.5  (2 pts)
   :class: note

   A quality engineer models the lifespan of a sensor component using a continuous distribution. She computes :math:`f_X(500) = 0.003`, where :math:`f_X` is the PDF of the lifespan in hours.

   Therefore the engineer is certain that a lifespan of 500 hours is very rare.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      The value :math:`f_X(500) = 0.003` is a **probability density**, not a probability. For a continuous random variable, :math:`P(X = 500) = 0` regardless of the PDF value. Density values can exceed 1 or be very small without directly indicating how "rare" a specific value is. Only the **area under the PDF** (an integral) gives a probability.
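
      To see why a density value is not a probability, consider an illustrative distribution that is not part of the exam, :math:`\text{Uniform}(0, 0.5)`. Its density exceeds 1 everywhere on its support, while probabilities still come from areas.

      .. code-block:: r

         # Density of Uniform(0, 0.5) at 0.25 is 2, so it cannot be a probability
         dens <- dunif(0.25, min = 0, max = 0.5)

         # A probability is an area under the PDF, obtained here from the CDF
         area <- punif(0.30, min = 0, max = 0.5) - punif(0.20, min = 0, max = 0.5)

         c(density = dens, interval_prob = area)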

.. admonition:: Question 1.6  (2 pts)
   :class: note

   A **one-way ANOVA** with **5 groups** produces a **significant** :math:`F`-**test**.

   If the researcher wants to compare all 10 possible pairs of means, Dunnett's method is more appropriate than Tukey's method.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      Dunnett's method is designed for comparing each treatment group to a **single control group**, not for all pairwise comparisons. With 5 groups and :math:`\binom{5}{2} = 10` pairwise comparisons, **Tukey's HSD** is the appropriate method. Dunnett's would only cover :math:`k - 1 = 4` comparisons (each treatment vs. the control).

.. admonition:: Question 1.7  (2 pts)
   :class: note

   In a **one-way ANOVA**, if the **between-group variability** is large relative to the **within-group variability**,

   then the :math:`F`-test statistic will tend to be large, giving more evidence against the null hypothesis that all population means are equal.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      The ANOVA :math:`F`-statistic is :math:`F = \text{MSA}/\text{MSE}`, where MSA captures between-group variability and MSE captures within-group variability. When between-group variability is large relative to within-group variability, :math:`F` is large, pushing the :math:`p`-value toward zero and providing stronger evidence against :math:`H_0`.

.. admonition:: Question 1.8  (2 pts)
   :class: note

   A biostatistician plans to fit a simple linear regression line to predict adult males' height from their femur length. Both variables are measured in inches in the original data. Before the regression line is fitted, both variables are converted to millimeters for wider use in medical applications.

   The :math:`p`-value of a regression slope remains constant even after the unit change.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      Both :math:`X` (femur length) and :math:`Y` (height) are converted by the same factor: :math:`1\;\text{inch} = 25.4\;\text{mm}`. The slope transforms as:

      .. math::

         b_{1,\text{mm}} = \frac{\Delta Y_{\text{mm}}}{\Delta X_{\text{mm}}} = \frac{25.4\,\Delta Y_{\text{in}}}{25.4\,\Delta X_{\text{in}}} = b_{1,\text{in}}

      Since the slope estimate is unchanged, its standard error, :math:`t`-statistic, and :math:`p`-value are all unchanged.
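
      The invariance can also be checked by simulation. The sketch below uses simulated data (the sample size, coefficients, and noise level are assumptions made for illustration, not exam values) and compares the slope row of the coefficient table before and after the unit change.

      .. code-block:: r

         set.seed(350)
         # Simulated (hypothetical) measurements in inches
         femur_in  <- rnorm(30, mean = 19, sd = 1)
         height_in <- 25 + 2.3 * femur_in + rnorm(30, sd = 1.5)

         # Same data converted to millimeters
         femur_mm  <- 25.4 * femur_in
         height_mm <- 25.4 * height_in

         # Slope row: estimate, standard error, t value, p-value
         slope_row <- function(fit) summary(fit)$coefficients[2, ]
         in_fit <- slope_row(lm(height_in ~ femur_in))
         mm_fit <- slope_row(lm(height_mm ~ femur_mm))

         # Should be TRUE: all four quantities agree up to rounding error
         all.equal(unname(in_fit), unname(mm_fit))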

.. admonition:: Question 1.9  (2 pts)
   :class: note

   Suppose that :math:`(-3, 4)` is a **95% confidence interval** for :math:`\beta_0` in a simple linear regression. For some constant :math:`c`, we test :math:`H_0\colon \beta_0 = c` against :math:`H_a\colon \beta_0 \neq c` at :math:`\alpha = 0.05`.

   We reject the null hypothesis if :math:`c` is within the confidence interval.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      The CI–test duality says: at the same confidence level and significance level, we **fail to reject** :math:`H_0\colon \beta_0 = c` when :math:`c` falls **inside** the confidence interval. We **reject** :math:`H_0` when :math:`c` falls **outside** the interval. The statement reverses the direction.

.. admonition:: Question 1.10  (2 pts)
   :class: note

   A materials engineer fits a simple linear regression to predict tensile strength (:math:`Y`) from carbon content (:math:`X`) in steel alloys. Before checking residuals, the engineer examines the distribution of :math:`Y` values alone and finds them to be strongly right-skewed.

   The normality assumption of the simple linear regression model is violated.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      The normality assumption in simple linear regression applies to the **error terms**, which are assessed through the residuals, not to the marginal distribution of :math:`Y`. Even if :math:`Y` is right-skewed marginally, the conditional distribution :math:`Y \mid X = x` (and therefore the errors) can still be normal. The skewness in :math:`Y` may simply reflect the distribution of :math:`X` values. Normality must be assessed from a histogram or normal Q–Q plot of the residuals, not from the marginal distribution of :math:`Y`.
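
      A small simulation illustrates the point. In this sketch the predictor is right-skewed and the errors are normal by construction; all numbers are hypothetical, and ``skewness`` is a simple helper defined here rather than a base R function.

      .. code-block:: r

         set.seed(350)
         # Moment-based sample skewness (helper defined for this example)
         skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

         x <- rexp(500, rate = 1)        # right-skewed predictor
         y <- 10 + 5 * x + rnorm(500)    # errors are normal by construction
         fit <- lm(y ~ x)

         skew_y     <- skewness(y)           # expected to be clearly positive
         skew_resid <- skewness(resid(fit))  # expected to be near zero
         c(Y = skew_y, residuals = skew_resid)

      The marginal distribution of ``y`` inherits the skew of ``x``, while the residuals should look roughly symmetric.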

----

Problem 2 — Multiple Choice (18 points, 3 pts each)
----------------------------------------------------

.. admonition:: Question 2.1  (3 pts)
   :class: note

   A software quality team reviews 200 applications for two categories of defects before deployment. Let :math:`M = \{\text{has memory leak issues}\}` and :math:`B = \{\text{has performance bottleneck issues}\}`.

   The Venn diagram shows the number of applications in each region.

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/Exams/FinalExam/SPRING2026/venn_q2_1.png
      :alt: Venn diagram of software defect classification (n = 200). M (Memory leaks) circle contains 40 in M-only, 20 in the intersection with B. B (Bottlenecks) circle contains 30 in B-only. Outside both circles: 110 (Neither).
      :align: center
      :width: 60%

   Which of the following counts is computed correctly? (:math:`|\cdot|` denotes size of set)

   - (A) :math:`|M' \cap B'| = 110`
   - (B) :math:`|M' \cap B| = 180`
   - (C) :math:`|M \cap B'| = 30`
   - (D) :math:`|M \cup B'| = 150`

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (A)**

      From the Venn diagram: :math:`|M| = 40 + 20 = 60`, :math:`|B| = 20 + 30 = 50`, :math:`|M \cap B| = 20`, neither :math:`= 110`.

      - **(A)** :math:`|M' \cap B'|` = applications with neither defect :math:`= 110`. **CORRECT.**
      - **(B)** :math:`|M' \cap B|` = applications with bottleneck only :math:`= 30 \neq 180`.
      - **(C)** :math:`|M \cap B'|` = applications with memory leak only :math:`= 40 \neq 30`.
      - **(D)** :math:`|M \cup B'| = |M| + |B'| - |M \cap B'| = 60 + 150 - 40 = 170 \neq 150`.
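
      The four counts can be tabulated directly from the region sizes in the diagram. The variable names in this sketch are illustrative.

      .. code-block:: r

         # Region counts read from the Venn diagram
         m_only <- 40; both <- 20; b_only <- 30; neither <- 110

         counts <- c(
           A = neither,                   # neither defect
           B = b_only,                    # bottleneck only
           C = m_only,                    # memory leak only
           D = m_only + both + neither    # everything except bottleneck only
         )
         counts
         #   A   B   C   D
         # 110  30  40 170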

.. admonition:: Question 2.2  (3 pts)
   :class: note

   A semiconductor fabrication line experiences random equipment faults. The time (in hours) between faults follows an Exponential distribution with rate :math:`\lambda = 0.5` per hour. The line has been running fault-free for at least 6 hours.

   What is the probability it continues to run fault-free for at least two more hours?

   - (A) 0.0183
   - (B) 0.0498
   - (C) 0.1353
   - (D) 0.3679
   - (E) 0.6321

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (D)**

      By the **memoryless property** of the Exponential distribution:

      .. math::

         P(T > 8 \mid T > 6) = P(T > 2) = e^{-0.5 \times 2} = e^{-1} \approx 0.3679

      .. code-block:: r

         pexp(2, rate = 0.5, lower.tail = FALSE)
         # [1] 0.3678794

.. admonition:: Question 2.3  (3 pts)
   :class: note

   The response time (in milliseconds) of a web application follows a Normal distribution with :math:`\mu = 250` and :math:`\sigma = 20`. A request is classified as "slow" if it takes more than 230 ms.

   Given that a request is slow, what is the probability it takes more than 280 ms? Each option is written as an R expression, with ``/`` denoting division.

   - (A) ``pnorm(280, mean = 250, sd = 20, lower.tail = FALSE)``
   - (B) ``pnorm(280, mean = 250, sd = 20, lower.tail = FALSE) / pnorm(230, mean = 250, sd = 20, lower.tail = TRUE)``
   - (C) ``(pnorm(280, mean = 250, sd = 20, lower.tail = TRUE) - pnorm(230, mean = 250, sd = 20, lower.tail = TRUE)) / pnorm(230, mean = 250, sd = 20, lower.tail = FALSE)``
   - (D) ``pnorm(280, mean = 250, sd = 20, lower.tail = FALSE) / pnorm(230, mean = 250, sd = 20, lower.tail = FALSE)``

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (D)**

      We need :math:`P(X > 280 \mid X > 230)`. By the definition of conditional probability:

      .. math::

         P(X > 280 \mid X > 230) = \frac{P(X > 280 \cap X > 230)}{P(X > 230)} = \frac{P(X > 280)}{P(X > 230)}

      since :math:`\{X > 280\} \subseteq \{X > 230\}`. In R:

      .. code-block:: r

         pnorm(280, 250, 20, lower.tail = FALSE) / pnorm(230, 250, 20, lower.tail = FALSE)
         # [1] 0.07941

.. admonition:: Question 2.4  (3 pts)
   :class: note

   A one-way ANOVA with **4 groups** is significant, and the researcher wants to compare **all possible pairs** of means using **Bonferroni**. What is the Bonferroni-adjusted significance level for each comparison if the family-wise error rate is set to 0.05?

   - (A) 0.0500
   - (B) 0.0250
   - (C) 0.0125
   - (D) 0.0083
   - (E) 0.0050

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (D)**

      With :math:`k = 4` groups, the number of pairwise comparisons is :math:`C = \binom{4}{2} = 6`. The Bonferroni-adjusted significance level is:

      .. math::

         \alpha_{\text{adj}} = \frac{\alpha}{C} = \frac{0.05}{6} \approx 0.0083
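
      The same arithmetic in R, as a short check:

      .. code-block:: r

         k <- 4
         n_pairs <- choose(k, 2)       # 6 pairwise comparisons
         alpha_adj <- 0.05 / n_pairs   # Bonferroni-adjusted level per comparison
         round(alpha_adj, 4)
         # [1] 0.0083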

.. admonition:: Question 2.5  (3 pts)
   :class: note

   Which of the following is **not required** for a traditional one-way ANOVA?

   - (A) Independent random samples from each population
   - (B) Equal population variances across groups
   - (C) Each of the :math:`k` populations is normally distributed, or sample means are approximately normally distributed
   - (D) Equal sample sizes in all groups

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (D)**

      The three assumptions for one-way ANOVA are: (1) **independence** — samples are independent random samples from their respective populations; (2) **normality** — each population is normally distributed, or sample sizes are large enough for the CLT to apply; and (3) **equal variances** (homoscedasticity) — the population variances are equal across all groups. **Equal sample sizes** are not required, although balanced designs are preferred for robustness and power.

.. admonition:: Question 2.6  (3 pts)
   :class: note

   A dataset consists of one response variable and four explanatory variables (Variables 1–4). For each explanatory variable, a scatter plot is drawn against the response variable. Select the two explanatory variables whose sample correlation coefficient :math:`r` with the response variable is closest to zero.

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/Exams/FinalExam/SPRING2026/scatter_q2_6.png
      :alt: Four scatter plots of explanatory variables (X1–X4) against response Y. Variable 1 shows a strong positive linear trend. Variable 2 shows a strong positive linear trend with moderate spread. Variable 3 shows a U-shaped (quadratic) pattern with no linear trend. Variable 4 shows a scattered cloud with no clear linear pattern.
      :align: center
      :width: 90%

   - (A) Variables 1 & 2
   - (B) Variables 2 & 3
   - (C) Variables 2 & 4
   - (D) Variables 3 & 4
   - (E) None — all four variables have strong sample correlations with the response variable.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (D)**

      The sample correlation coefficient :math:`r` measures the **strength and direction of a linear relationship**:

      - **Variable 1:** Strong positive linear trend → :math:`|r|` is high.
      - **Variable 2:** Strong positive linear trend with moderate spread → :math:`|r|` is high.
      - **Variable 3:** U-shaped (quadratic) pattern — a strong *nonlinear* relationship, but virtually no *linear* trend → :math:`|r| \approx 0`.
      - **Variable 4:** Scattered cloud with no clear pattern → :math:`|r| \approx 0`.

      Variables 3 and 4 have sample correlations closest to zero.
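
      The Variable 3 situation can be mimicked with made-up data: a perfect quadratic relationship on a grid symmetric about zero has essentially no linear correlation. The grid below is hypothetical and is not the exam data.

      .. code-block:: r

         x <- seq(-3, 3, by = 0.1)   # symmetric grid (hypothetical)
         y <- x^2                    # exact U-shaped relationship
         r <- cor(x, y)              # zero up to floating-point error
         round(r, 6)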

----

.. admonition:: Problem 3 Setup
   :class: important

   A utility company, Earl Energy, is known for long customer service wait times. Let :math:`X` denote the waiting time (in hours) until a customer is connected to the next available representative. The probability density function (pdf) and cumulative distribution function (cdf) of :math:`X` are given below.

   .. math::

      f_X(x) = \begin{cases}
         0, & x < 0 \\[4pt]
         \dfrac{1}{2}\,x^2\,e^{-x}, & x \geq 0
      \end{cases}

   .. math::

      F_X(x) = \begin{cases}
         0, & x < 0 \\[4pt]
         1 - \dfrac{1}{2}\,e^{-x}\,(x^2 + 2x + 2), & x \geq 0
      \end{cases}

Problem 3 — Piecewise PDF/CDF (24 points)
-------------------------------------------

.. admonition:: Question 3a  (10 pts)
   :class: note

   What is the probability that a customer waits for more than 30 minutes?

   .. dropdown:: Solution
      :class-container: sd-border-success

      30 minutes = 0.5 hours. We need :math:`P(X > 0.5) = 1 - F_X(0.5)`:

      .. math::

         P\!\left(X > \tfrac{1}{2}\right) = 1 - F_X\!\left(\tfrac{1}{2}\right) = 1 - \left[1 - \tfrac{1}{2}\,e^{-1/2}\left(\left(\tfrac{1}{2}\right)^2 + 2\!\cdot\!\tfrac{1}{2} + 2\right)\right]

      .. math::

         = \tfrac{1}{2}\,e^{-1/2}\left(\tfrac{1}{4} + 1 + 2\right) = \tfrac{1}{2}\,e^{-1/2} \cdot \tfrac{13}{4} = e^{-1/2} \cdot \tfrac{13}{8} \approx \boxed{0.9856}

      .. code-block:: r

         1 - (1 - 0.5 * exp(-0.5) * (0.5^2 + 2*0.5 + 2))
         # [1] 0.9856123
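
      As a cross-check, the given PDF has the form of a Gamma density with shape 3 and rate 1, since :math:`\Gamma(3) = 2`. The exam does not state this, so treat it as an observation; under that reading the same tail probability is available from ``pgamma``.

      .. code-block:: r

         # Upper-tail probability of Gamma(shape = 3, rate = 1) at 0.5 hours
         p_tail <- pgamma(0.5, shape = 3, rate = 1, lower.tail = FALSE)
         round(p_tail, 4)
         # [1] 0.9856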

.. admonition:: Question 3b  (14 pts)
   :class: note

   Find the variance of a rate :math:`\dfrac{1}{X}` given that :math:`E\!\left[\dfrac{1}{X}\right] = 0.5`.

   .. dropdown:: Solution
      :class-container: sd-border-success

      Using :math:`\text{Var}(1/X) = E[1/X^2] - \left(E[1/X]\right)^2`:

      .. math::

         E\!\left[\frac{1}{X^2}\right] = \int_0^{\infty} \frac{1}{x^2} \cdot \frac{1}{2}\,x^2\,e^{-x}\,dx = \frac{1}{2}\int_0^{\infty} e^{-x}\,dx = \frac{1}{2}

      Therefore:

      .. math::

         \text{Var}\!\left(\frac{1}{X}\right) = E\!\left[\frac{1}{X^2}\right] - \left(E\!\left[\frac{1}{X}\right]\right)^2 = \frac{1}{2} - \left(\frac{1}{2}\right)^2 = \frac{1}{2} - \frac{1}{4} = \boxed{\frac{1}{4} = 0.25}
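
      Both expectations can be verified numerically with ``integrate``. In this sketch the integrands are written after cancelling powers of :math:`x` against the PDF, which avoids evaluating :math:`0/0` near the origin.

      .. code-block:: r

         # E[1/X]   = integral of (1/x)   * (1/2) x^2 e^{-x} = (1/2) x e^{-x}
         e_inv    <- integrate(function(x) 0.5 * x * exp(-x), lower = 0, upper = Inf)$value
         # E[1/X^2] = integral of (1/x^2) * (1/2) x^2 e^{-x} = (1/2) e^{-x}
         e_inv_sq <- integrate(function(x) 0.5 * exp(-x), lower = 0, upper = Inf)$value

         var_inv <- e_inv_sq - e_inv^2
         round(c(e_inv, e_inv_sq, var_inv), 4)
         # [1] 0.50 0.50 0.25

      The first integral also confirms the value :math:`E[1/X] = 0.5` supplied in the question.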

----

.. admonition:: Problem 4 Setup
   :class: important

   A software team uses Claude Code to assist with code commits. Each commit is independently classified as either routine or novel. A **routine commit** is routed to **Configuration A**, and a **novel commit** is routed to **Configuration B**. Each commit **independently** has a **20% probability of being novel (Configuration B)** and an **80% probability of being routine (Configuration A)**.

   During a particular week, the team makes **25** commits. Each commit is automatically and independently tested for **bugs**. The probability that a **Configuration A commit** contains a **bug** is **0.05**, and the probability that a **Configuration B commit** contains a **bug** is **0.30**.

Problem 4 — Total Probability, Bayes' Rule, Binomial (26 points)
-----------------------------------------------------------------

.. admonition:: Question 4a  (10 pts)
   :class: note

   A **single commit** is selected at random from the week's 25 commits. What is the probability that it contains a bug?

   .. dropdown:: Solution
      :class-container: sd-border-success

      By the **Law of Total Probability**:

      .. math::

         P(\text{Bug}) = P(\text{Bug} \mid A)\,P(A) + P(\text{Bug} \mid B)\,P(B)

      .. math::

         = 0.05 \times 0.8 + 0.30 \times 0.2 = 0.04 + 0.06 = \boxed{0.1}

.. admonition:: Question 4b  (10 pts)
   :class: note

   A **single commit** from the week is found to contain a bug. What is the probability it was handled by **Configuration B**?

   .. dropdown:: Solution
      :class-container: sd-border-success

      By **Bayes' Rule**:

      .. math::

         P(B \mid \text{Bug}) = \frac{P(\text{Bug} \mid B)\,P(B)}{P(\text{Bug})} = \frac{0.30 \times 0.2}{0.1} = \frac{0.06}{0.1} = \boxed{0.6}
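
      The total-probability and Bayes calculations can be done in one pass with named vectors. The names ``prior`` and ``bug_given`` in this sketch are illustrative.

      .. code-block:: r

         prior     <- c(A = 0.80, B = 0.20)   # P(configuration)
         bug_given <- c(A = 0.05, B = 0.30)   # P(Bug | configuration)

         p_bug     <- sum(bug_given * prior)      # law of total probability
         posterior <- bug_given * prior / p_bug   # Bayes' rule, one entry per configuration

         p_bug
         # [1] 0.1
         posterior
         #   A   B
         # 0.4 0.6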

.. admonition:: Question 4c  (6 pts)
   :class: note

   Find the expected number and standard deviation of bugs found in **Configuration B** during the week.

   .. dropdown:: Solution
      :class-container: sd-border-success

      Let :math:`X_B` denote the number of bugs found in Configuration B during the week. Each of the 25 commits independently has probability :math:`P(\text{Bug} \cap B) = P(\text{Bug} \mid B)\,P(B) = 0.30 \times 0.20 = 0.06` of being a Configuration B bug. Therefore:

      .. math::

         X_B \sim \text{Binomial}(n = 25,\; p = 0.06)

      .. math::

         E[X_B] = np = 25 \times 0.06 = \boxed{1.5}

      .. math::

         \text{Var}(X_B) = np(1-p) = 25 \times 0.06 \times 0.94 = 1.41

      .. math::

         \text{SD}(X_B) = \sqrt{1.41} \approx \boxed{1.1874}
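
      The binomial mean and standard deviation follow directly from :math:`n` and :math:`p`; a short check:

      .. code-block:: r

         n <- 25
         p <- 0.30 * 0.20                  # P(Bug and Configuration B) = 0.06
         mean_b <- n * p                   # expected number of Configuration B bugs
         sd_b   <- sqrt(n * p * (1 - p))   # binomial standard deviation
         round(c(mean = mean_b, sd = sd_b), 4)
         #   mean     sd
         # 1.5000 1.1874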

----

.. admonition:: Problem 5 Setup
   :class: important

   An agronomist wants to compare the average plant height increase (in cm) produced by **four fertilizers**. A random sample of plants was assigned to each fertilizer treatment. The summary information is given below.

   .. list-table::
      :header-rows: 1
      :widths: 25 15 20 20

      * - Group
        - :math:`n_i`
        - :math:`\bar{x}_i`
        - :math:`s_i`
      * - Fertilizer 1
        - 10
        - 17.80
        - 2.05
      * - Fertilizer 2
        - 10
        - 20.70
        - 2.31
      * - Fertilizer 3
        - 10
        - 19.25
        - 2.18
      * - Fertilizer 4
        - 10
        - 24.00
        - 2.42

Problem 5 — One-Way ANOVA, Tukey HSD (40 points)
--------------------------------------------------

.. admonition:: Question 5a  (2 pts)
   :class: note

   Using the summary statistics, assess whether the equal variance (homogeneity of variance) assumption appears reasonable. Show your work and state your conclusion clearly. For the rest of the problem, assume all other ANOVA assumptions are satisfied.

   .. dropdown:: Solution
      :class-container: sd-border-success

      Check the ratio of largest to smallest sample standard deviations:

      .. math::

         \frac{s_{\max}}{s_{\min}} = \frac{2.42}{2.05} \approx 1.1805 < 2

      By the rule of thumb, the equal variance assumption is reasonable.
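
      The rule-of-thumb check in R, using the four sample standard deviations from the table:

      .. code-block:: r

         s <- c(2.05, 2.31, 2.18, 2.42)   # sample SDs for Fertilizers 1-4
         ratio <- max(s) / min(s)
         round(ratio, 4)
         # [1] 1.1805
         ratio < 2
         # [1] TRUE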

.. admonition:: Question 5b  (14 pts)
   :class: note

   Complete the ANOVA table below. The Factor Sum of Squares is 211.2687, the Error Sum of Squares is 181.3266, and the :math:`p`-value is :math:`3.351353 \times 10^{-6}`.

   .. dropdown:: Solution
      :class-container: sd-border-success

      With :math:`k = 4` groups and :math:`N = 40` total observations:

      .. math::

         \bar{\bar{x}} = \frac{10(17.80) + 10(20.70) + 10(19.25) + 10(24.00)}{40} = \frac{817.5}{40} = 20.4375

      **Degrees of freedom:**

      - Factor: :math:`k - 1 = 3`
      - Error: :math:`N - k = 36`
      - Total: :math:`N - 1 = 39`

      **Mean Squares:**

      .. math::

         \text{MSA} = \frac{\text{SSA}}{k-1} = \frac{211.2687}{3} = 70.4229

      .. math::

         \text{MSE} = \frac{\text{SSE}}{N-k} = \frac{181.3266}{36} = 5.0369

      **F-statistic:**

      .. math::

         F = \frac{\text{MSA}}{\text{MSE}} = \frac{70.4229}{5.0369} = 13.9815

      .. list-table:: ANOVA Table
         :header-rows: 1
         :widths: 18 15 18 16 16 17

         * - Source
           - df
           - Sum of Squares
           - Mean Square
           - :math:`F`
           - :math:`\Pr(>F)`
         * - Factor
           - 3
           - 211.2687
           - 70.4229
           - 13.9815
           - :math:`3.35 \times 10^{-6}`
         * - Error
           - 36
           - 181.3266
           - 5.0369
           -
           -
         * - Total
           - 39
           - 392.5953
           -
           -
           -
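
      The table entries can be reproduced from the summary statistics alone. The sketch below recomputes the sums of squares, mean squares, :math:`F`-statistic, and :math:`p`-value; the variable names are illustrative and are not part of the exam.

      .. code-block:: r

         n    <- c(10, 10, 10, 10)
         xbar <- c(17.80, 20.70, 19.25, 24.00)
         s    <- c(2.05, 2.31, 2.18, 2.42)

         k <- length(n)                          # number of groups
         N <- sum(n)                             # total sample size
         grand_mean <- sum(n * xbar) / N

         ssa <- sum(n * (xbar - grand_mean)^2)   # factor (between-group) sum of squares
         sse <- sum((n - 1) * s^2)               # error (within-group) sum of squares
         msa <- ssa / (k - 1)
         mse <- sse / (N - k)
         f_ts <- msa / mse
         p_value <- pf(f_ts, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

         round(c(SSA = ssa, SSE = sse, MSA = msa, MSE = mse, F = f_ts), 4)
         p_value

      The results should agree with the table above up to rounding in the last digit.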

.. admonition:: Question 5c  (4 pts)
   :class: note

   Provide the first two steps of the four-step one-way ANOVA hypothesis testing procedure.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Step 1 — Identify and describe the parameter(s):**

      Let :math:`\mu_1, \mu_2, \mu_3, \mu_4` denote the true mean height increase (in cm) of plants treated with Fertilizers 1, 2, 3, and 4, respectively.

      **Step 2 — Define the hypotheses:**

      .. math::

         H_0\colon \mu_1 = \mu_2 = \mu_3 = \mu_4

      .. math::

         H_a\colon \text{At least one mean is different}

.. admonition:: Question 5d  (3 pts)
   :class: note

   Which of the following R code statements returns the correct :math:`p`-value?

   - (A) ``pf(F_ts/2, df1=4, df2 = 36, lower.tail = FALSE)``
   - (B) ``pf(F_ts, df1=3, df2 = 36, lower.tail = FALSE)``
   - (C) ``pf(F_ts, df1=4, df2 = 37, lower.tail = TRUE)``
   - (D) ``pf(F_ts, df1=3, df2 = 36, lower.tail = TRUE)``
   - (E) ``2*pf(F_ts, df1=4, df2 = 40, lower.tail = FALSE)``

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (B)**

      The ANOVA :math:`F`-test is always **upper-tailed**. The :math:`p`-value is :math:`P(F > F_{\text{TS}})` with numerator df :math:`= k - 1 = 3` and denominator df :math:`= N - k = 36`:

      .. code-block:: r

         pf(F_ts, df1 = 3, df2 = 36, lower.tail = FALSE)
         # [1] 3.351353e-06

.. admonition:: Question 5e  (8 pts)
   :class: note

   The calculated :math:`p`-value is :math:`3.35 \times 10^{-6}`. At a significance level of :math:`\alpha = 0.05`, state your formal decision and conclusion in the context of the problem.

   .. dropdown:: Solution
      :class-container: sd-border-success

      Since the :math:`p`-value :math:`= 3.35 \times 10^{-6} < 0.05 = \alpha`, we have evidence to reject :math:`H_0`.

      The data **does** give **strong** support (:math:`p`-value :math:`= 3.35 \times 10^{-6}`) to the claim that **at least one** of the fertilizers produces a **true mean height increase** (in centimeters) that differs from **at least one other**.

.. admonition:: Question 5f  (4 pts)
   :class: note

   Based on your conclusion in part (e), is it appropriate to proceed to pairwise comparisons such as Tukey's HSD? Briefly explain.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Yes.** Since we rejected the null hypothesis, we have evidence that at least one pair of means differs. To determine **which** means are different, a post-hoc analysis such as Tukey's HSD should be conducted.

.. admonition:: Question 5g  (5 pts)
   :class: note

   The following Tukey HSD results were obtained. Construct a graphical display based on these results, and briefly state which fertilizer appears to have the largest population mean plant height increase and provide justification.

   .. list-table::
      :header-rows: 1
      :widths: 35 15 15 15 15

      * - Comparison
        - diff
        - lwr
        - upr
        - p adj
      * - Fertilizer 2 − Fertilizer 1
        - 2.9000
        - 0.1969
        - 5.6031
        - 0.0306
      * - Fertilizer 3 − Fertilizer 1
        - 1.4500
        - −1.2531
        - 4.1531
        - 0.4807
      * - Fertilizer 4 − Fertilizer 1
        - 6.2000
        - 3.4969
        - 8.9031
        - 0.0000
      * - Fertilizer 3 − Fertilizer 2
        - −1.4500
        - −4.1531
        - 1.2531
        - 0.4807
      * - Fertilizer 4 − Fertilizer 2
        - 3.3000
        - 0.5969
        - 6.0031
        - 0.0118
      * - Fertilizer 4 − Fertilizer 3
        - 4.7500
        - 2.0469
        - 7.4531
        - 0.0002

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Significant pairs** (p adj < 0.05): (2−1), (4−1), (4−2), (4−3).

      **Non-significant pairs** (p adj ≥ 0.05): (3−1) with p = 0.4807, and (3−2) with p = 0.4807.

      **Underline display** (groups ordered by sample mean; groups connected by the same underline are **not** significantly different):

      .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/Exams/FinalExam/SPRING2026/tukey_display.png
         :alt: Tukey HSD underline display. Four group means in ascending order — Fertilizer 1 (17.80), Fertilizer 3 (19.25), Fertilizer 2 (20.70), Fertilizer 4 (24.00). One underline connects Fertilizer 1 and Fertilizer 3 (p adj = 0.4807). A second underline connects Fertilizer 3 and Fertilizer 2 (p adj = 0.4807). Fertilizer 4 has no underline connecting it to any other group.
         :align: center
         :width: 70%

      Two overlapping underlines: one connecting Fertilizer 1–Fertilizer 3, and a second connecting Fertilizer 3–Fertilizer 2. Fertilizer 4 stands alone — significantly different from all others.

      **Fertilizer 4** appears to have the **largest** population mean height increase (:math:`\bar{x}_4 = 24.00`), and it is significantly higher than every other fertilizer at the 5% family-wise level.
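
      For a balanced design, every Tukey interval has the same half-width, :math:`q_{0.95;\,k,\,N-k}\sqrt{\text{MSE}/n}`. The sketch below recomputes it with ``qtukey`` from the ANOVA quantities in part (b); it is a check on the tabled limits, not the code used to produce them.

      .. code-block:: r

         mse <- 181.3266 / 36   # error mean square from the ANOVA table
         n_per_group <- 10
         k <- 4
         df_error <- 36

         # Studentized range critical value and common half-width
         q_crit <- qtukey(0.95, nmeans = k, df = df_error)
         half_width <- q_crit * sqrt(mse / n_per_group)

         # Interval for Fertilizer 4 minus Fertilizer 3
         diff_43 <- 24.00 - 19.25
         c(lwr = diff_43 - half_width, upr = diff_43 + half_width)

      The limits should land close to the tabled values 2.0469 and 7.4531.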

----

.. admonition:: Problem 6 Setup
   :class: important

   A driving school wants to estimate the monthly car insurance premium for teenage drivers who are at least 16 but under 20 years old (all with minimum-coverage policies). They randomly selected 100 teen drivers and recorded each driver's monthly premium (in dollars) and age (in years) at enrollment. Preliminary analysis indicates a linear relationship between monthly premium (:math:`y`) and age (:math:`x`). The school plans to fit a simple linear regression model to provide statistical estimates of monthly premiums based on age.

   .. list-table::
      :widths: 33 33 34

      * - :math:`S_{xx} = 63.797`
        - :math:`S_{xy} = -538.3375`
        - :math:`S_{yy} = 24069`
      * - :math:`\bar{x} = 17.5535`
        - :math:`\bar{y} = 204.2009`
        - :math:`n = 100`

Problem 6 — Simple Linear Regression (37 points)
--------------------------------------------------

.. admonition:: Question 6a  (10 pts)
   :class: note

   The simple linear regression model requires four assumptions. Not all assumptions are needed at every stage of the analysis pipeline.

   i.   State the four assumptions.
   ii.  For each assumption, identify the stage at which it is first required: model fitting/estimation, statistical inference, or prediction intervals.
   iii. Explain why prediction intervals are not robust to the violation of the assumption identified in ii.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **i. The four assumptions (LINE):**

      - **Linearity:** The true relationship between :math:`x` and :math:`y` is linear, meaning :math:`E[Y \mid x] = \beta_0 + \beta_1 x`. Needed for model fitting/estimation.
      - **Independence:** The error terms :math:`\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n` are independent of one another. Needed for proper statistical inference and prediction intervals.
      - **Normality:** The error terms :math:`\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n` are normally distributed with zero mean. Needed for prediction intervals, though standard statistical inference (CIs and tests for :math:`\beta_0, \beta_1`) can rely on the CLT.
      - **Equal Variance (Homoscedasticity):** The variance of the errors is constant across all values of :math:`x`. Needed for statistical inference and prediction intervals.

      **ii. Stage identification:**

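      Each assumption is first required at the following stage:

      - **Linearity:** model fitting/estimation.
      - **Independence:** statistical inference.
      - **Equal Variance:** statistical inference.
      - **Normality:** prediction intervals. With a reasonably large sample, inference on :math:`\beta_0` and :math:`\beta_1` is fairly robust to non-normal errors through the CLT, so normality is the assumption whose violation most directly impacts **prediction intervals**.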

      **iii. Why prediction intervals are not robust to normality violations:**

      A **confidence interval** for the **mean response** (:math:`E[Y \mid x_0]`) benefits from the CLT: the slope and intercept estimates are weighted sums of all :math:`n` observations, so their sampling distributions become approximately normal even when the errors are not, provided :math:`n` is reasonably large.

      A **prediction interval**, however, must account for the variability of a **single future observation** :math:`Y_0 = \beta_0 + \beta_1 x_0 + \varepsilon_0` as well as that of the estimate of the mean response. The interval's coverage depends directly on the distribution of that individual error term :math:`\varepsilon_0` — there is no averaging and no CLT to rescue us. If the errors are skewed or heavy-tailed, the interval endpoints (constructed assuming normality) will be in the wrong places, and the stated coverage probability (e.g., 95%) will be incorrect.
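
      One way to see this is to split the prediction error into its two sources. This is a sketch in the notation of this solution, with :math:`b_0` and :math:`b_1` denoting the least-squares estimates and :math:`\hat{Y}_0 = b_0 + b_1 x_0`:

      .. math::

         Y_0 - \hat{Y}_0 = \underbrace{\varepsilon_0}_{\text{single new error}} - \underbrace{\bigl[(b_0 - \beta_0) + (b_1 - \beta_1)\,x_0\bigr]}_{\text{estimation error}}

      The estimation error is a weighted sum of the :math:`n` sample errors, so the CLT pushes it toward normality as :math:`n` grows. The term :math:`\varepsilon_0` is a single draw, so it keeps whatever shape the error distribution has, regardless of sample size.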

.. admonition:: Question 6b  (8 pts)
   :class: note

   Assuming all assumptions are met, compute the slope :math:`b_1` and the intercept :math:`b_0`. Write the fitted regression line :math:`\hat{y}`.

   .. dropdown:: Solution
      :class-container: sd-border-success

      .. math::

         b_1 = \frac{S_{xy}}{S_{xx}} = \frac{-538.3375}{63.797} = -8.4383

      .. math::

         b_0 = \bar{y} - b_1\,\bar{x} = 204.2009 - (-8.4383)(17.5535) = 352.3226

      **Regression line:**

      .. math::

         \hat{y} = 352.3226 - 8.4383\,x
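
      As a quick arithmetic check, the following Python sketch recomputes both estimates from the summary statistics in the problem setup. The variable names are illustrative only. The slope is rounded to four decimals before the intercept is computed, mirroring the hand calculation above; carrying the unrounded slope would shift the intercept slightly in the fourth decimal place.

      .. code-block:: python

         # Summary statistics from the Problem 6 setup
         s_xx = 63.797
         s_xy = -538.3375
         x_bar = 17.5535
         y_bar = 204.2009

         # Least-squares slope, rounded to four decimals as in the hand calculation
         b1 = round(s_xy / s_xx, 4)

         # Least-squares intercept: the fitted line passes through (x_bar, y_bar)
         b0 = round(y_bar - b1 * x_bar, 4)

         print(f"y_hat = {b0} + ({b1}) x")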

.. admonition:: Question 6c  (8 pts)
   :class: note

   Predict the monthly premium for a 17-year-old teen and a 13-year-old teen, respectively. Discuss the statistical validity of these predictions.

   .. dropdown:: Solution
      :class-container: sd-border-success

      Substitute :math:`x = 17` and :math:`x = 13` into the fitted regression line:

      .. math::

         \hat{y}_{17} = 352.3226 - 8.4383 \times 17 = 352.3226 - 143.4511 = 208.8715

      .. math::

         \hat{y}_{13} = 352.3226 - 8.4383 \times 13 = 352.3226 - 109.6979 = 242.6247

      The prediction for a **17-year-old** is statistically valid, because 17 falls within the range of observed ages (16 to under 20). This is an **interpolation**.

      The prediction for a **13-year-old** is **not** statistically valid, because 13 falls outside the observed age range. This is an **extrapolation**, and the linear model may not accurately reflect premiums for ages not represented in the data.
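
      The two predictions and the range check can be sketched in Python as follows. The function names are hypothetical, and the bounds 16 and 20 are taken from the study's stated target population, since the observed minimum and maximum ages are not reported.

      .. code-block:: python

         # Fitted line from part 6b (coefficients rounded to four decimals)
         b0, b1 = 352.3226, -8.4383

         # Age range covered by the sample: at least 16 but under 20 (assumed bounds)
         AGE_MIN, AGE_MAX = 16, 20

         def predict_premium(age):
             """Return the fitted monthly premium (dollars) at the given age."""
             return b0 + b1 * age

         def is_interpolation(age):
             """True when the age lies inside the sampled range [16, 20)."""
             return AGE_MIN <= age < AGE_MAX

         for age in (17, 13):
             label = "interpolation" if is_interpolation(age) else "extrapolation"
             print(f"age {age}: y_hat = {predict_premium(age):.4f} ({label})")

      The model still returns a number for age 13, so the check on the range has to be made by the analyst; the fitted line itself gives no warning.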

.. admonition:: Question 6d  (8 pts)
   :class: note

   Construct a 95% confidence interval for the mean monthly premium of all 17-year-old drivers. Use the R output below, along with the summary statistics from the problem introduction and your fitted regression model.

   ::

      Residual standard error: 14.12 on 98 degrees of freedom
      Multiple R-squared: 0.1887, Adjusted R-squared: 0.1804
      F-statistic: 22.8 on 1 and 98 DF, p-value: 6.296e-06

   .. list-table::
      :widths: 50 50

      * - ``qt(0.025, 98, lower.tail=FALSE)``
          ``[1] 1.984467``
        - ``qf(0.025, 1, 98, lower.tail=FALSE)``
          ``[1] 5.181823``
      * - ``qt(0.05, 98, lower.tail=FALSE)``
          ``[1] 1.660551``
        - ``qf(0.05, 1, 98, lower.tail=FALSE)``
          ``[1] 3.938111``

   .. dropdown:: Solution
      :class-container: sd-border-success

      Compute the **standard error of the mean prediction** at :math:`x^* = 17`:

      .. math::

         SE(\hat{y}_{17}) = s\,\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 14.12\,\sqrt{\frac{1}{100} + \frac{(17 - 17.5535)^2}{63.797}}

      .. math::

         = 14.12\,\sqrt{0.01 + 0.004802} = 14.12 \times 0.121664 \approx 1.7179

      The 95% CI is :math:`\hat{y}_{17} \pm t_{0.025,\,df=98} \times SE(\hat{y}_{17})`:

      .. math::

         208.8715 \pm 1.984467 \times 1.7179 = 208.8715 \pm 3.4091

      .. math::

         \approx \boxed{(205.46,\; 212.28)}
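
      The interval can be reproduced with a short Python sketch that uses only the standard library. The critical value is copied from the R output in the question instead of being recomputed, and the variable names are illustrative.

      .. code-block:: python

         import math

         # Values taken from the problem setup, the R output, and part 6c
         n = 100
         x_bar = 17.5535
         s_xx = 63.797
         s = 14.12            # residual standard error
         t_crit = 1.984467    # qt(0.025, 98, lower.tail=FALSE)
         y_hat_17 = 208.8715  # fitted mean premium at age 17
         x_star = 17

         # Standard error of the estimated mean response at x_star
         se_mean = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)

         # Two-sided 95% confidence interval for the mean response
         margin = t_crit * se_mean
         lower, upper = y_hat_17 - margin, y_hat_17 + margin

         print(f"SE = {se_mean:.4f}, CI = ({lower:.2f}, {upper:.2f})")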

.. admonition:: Question 6e  (3 pts)
   :class: note

   Which of the following statements is reasonable regarding an interval estimate for a **new response** :math:`y^*` at :math:`x = x^*`?

   - (A) The confidence interval for :math:`y^*` becomes wider if :math:`x^*` moves farther away from the sample mean :math:`\bar{x}`.
   - (B) The confidence interval for :math:`y^*` becomes narrower if :math:`x^*` moves farther away from the sample mean :math:`\bar{x}`.
   - (C) The prediction interval for :math:`y^*` becomes wider if :math:`x^*` moves farther away from the sample mean :math:`\bar{x}`.
   - (D) The prediction interval for :math:`y^*` becomes narrower if :math:`x^*` moves farther away from the sample mean :math:`\bar{x}`.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (C)**

      Both confidence intervals (for the mean response) and prediction intervals (for a new observation) contain the term :math:`(x^* - \bar{x})^2 / S_{xx}` inside the square root of their standard error formulas. As :math:`x^*` moves farther from :math:`\bar{x}`, this term increases, making the interval wider. The question asks about a **new response**, which uses a **prediction interval**. Option (C) correctly describes this behavior. Option (A) describes the same widening but names the confidence interval, which targets the mean response and not a single new observation.
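
      The widening can be illustrated numerically with the values from this problem. The sketch below assumes the standard textbook form of the prediction standard error, :math:`s\sqrt{1 + 1/n + (x^* - \bar{x})^2 / S_{xx}}`, which is not written out in this solution; the function names are hypothetical.

      .. code-block:: python

         import math

         # Values from the problem setup and the R output in part 6d
         n, x_bar, s_xx, s = 100, 17.5535, 63.797, 14.12

         def se_mean_response(x_star):
             """Standard error for the estimated mean response at x_star."""
             return s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)

         def se_prediction(x_star):
             """Standard error for predicting one new response at x_star."""
             return s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / s_xx)

         # Move x_star progressively farther from x_bar
         for x_star in (17.5535, 18.5, 19.5):
             print(f"x* = {x_star}: SE_mean = {se_mean_response(x_star):.3f}, "
                   f"SE_pred = {se_prediction(x_star):.3f}")

      Both standard errors grow with the distance from :math:`\bar{x}`, and the prediction standard error is always the larger of the two because of the extra 1 under the square root.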