.. _exam1-fall2024:

Exam 1 — Fall 2024: Fully Worked Solutions
============================================================

- `Exam PDF <https://treese41528.github.io/STAT350/Exam_Resources/Exam1/Exam_1_Fall_2024.pdf>`_
- `Solution PDF <https://treese41528.github.io/STAT350/Exam_Resources/Exam1/Exam_1_Fall_2024_Key.pdf>`_

The questions below reproduce the Fall 2024 Exam 1 in full accessible text. Each problem is
followed by a complete worked solution. Point values reflect the actual exam.

.. list-table:: Point Summary
   :widths: 40 30 30
   :header-rows: 1

   * - Section
     - Format
     - Points
   * - Problem 1 — True/False
     - 6 questions × 2 pts
     - 12
   * - Problem 2 — Multiple Choice
     - 5 questions × 3 pts
     - 15
   * - Problem 3 — Free Response
     - 4 parts
     - 20
   * - Problem 4 — Free Response
     - 5 parts
     - 26
   * - Problem 5 — Free Response
     - 4 parts
     - 32
   * - **Total**
     -
     - **105**

----

Problem 1: True/False  (12 points, 2 points each)
------------------------------------------------------------------

Indicate the correct answer by completely filling in the appropriate circle. If you indicate
your answer by any other way, you may be marked incorrect.

----

.. admonition:: Question 1.1  (2 pts)
   :class: note

   Employees in a certain UPS branch collected the types of mail customers brought for a
   month. They plan to present the data appropriately to the manager and discuss how to
   utilize empty space efficiently.

   **T or F:** A histogram is appropriate to use because the variable is categorical.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      The variable "type of mail" is **categorical** (qualitative). Histograms are designed
      for **quantitative** (numerical) data — they display the distribution of a numerical
      variable by grouping values into bins. For categorical data, the appropriate displays
      are bar charts or pie charts, which show frequencies or proportions for each category.
      The statement is **FALSE**.

----

.. admonition:: Question 1.2  (2 pts)
   :class: note

   A hardware manufacturer is about to ship 20,000 of its products to a client. To estimate
   the defect rate of this shipment, they randomly selected 100 products for a last-minute
   inspection. For each product, they assign a value of 0 if the product is good and 1 if it
   is defective. The defect rate is then calculated as the average of these 0's and 1's.

   **T or F:** If the company had the resources to inspect all 20,000 products, the defect
   rate calculated using all 20,000 products would represent a sample statistic.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      If all 20,000 products in the shipment were inspected, the defect rate would be
      computed from the **entire population** of interest (the shipment). A quantity computed
      from the entire population is a **population parameter**, not a sample statistic. A
      sample statistic is computed from a subset (sample) of the population. The statement is
      **FALSE**.

----

.. admonition:: Question 1.3  (2 pts)
   :class: note

   Suppose the number of visitors to a mall follows a Poisson distribution with an average
   rate of 45 visitors per 30 minutes.

   **T or F:** In this mall, the variance in the number of visitors arriving between 2:00 PM
   and 3:00 PM is equal to the variance in the number of visitors arriving between 3:00 PM
   and 5:00 PM.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      For a Poisson process with rate :math:`\lambda` visitors per 30 minutes, the number of
      visitors in a time interval of length :math:`t` (in 30-minute units) follows
      :math:`\text{Poisson}(\lambda t)`, which has variance :math:`\lambda t`.

      - **2:00 PM to 3:00 PM** is 1 hour = 2 thirty-minute periods:
        :math:`\text{Var} = 45 \times 2 = 90` visitors².

      - **3:00 PM to 5:00 PM** is 2 hours = 4 thirty-minute periods:
        :math:`\text{Var} = 45 \times 4 = 180` visitors².

      The variances are not equal. The statement is **FALSE**.
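
      The same comparison can be sketched in a few lines of Python, assuming the arrival
      rate stays constant through the afternoon. The helper name ``poisson_variance`` is
      illustrative and not part of the exam:

      .. code-block:: python

         RATE_PER_30_MIN = 45  # average visitors per 30-minute period (from the problem)

         def poisson_variance(minutes: float) -> float:
             """Variance of the visitor count over an interval of the given length."""
             periods = minutes / 30  # number of 30-minute periods in the interval
             return RATE_PER_30_MIN * periods  # a Poisson variance equals its mean

         var_2_to_3 = poisson_variance(60)   # 2:00 PM to 3:00 PM
         var_3_to_5 = poisson_variance(120)  # 3:00 PM to 5:00 PM
         print(var_2_to_3, var_3_to_5, var_2_to_3 == var_3_to_5)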

----

.. admonition:: Question 1.4  (2 pts)
   :class: note

   Let :math:`V` be a random variable with a probability density function :math:`f_V(v)`
   that is nonzero only on the interval :math:`[-5, -2)`. Let :math:`F_V(\cdot)` denote the
   cumulative distribution function (CDF) of :math:`V`.

   **T or F:** Then, :math:`F_V(c) = 1` holds for any :math:`c > 0`.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      The support of :math:`V` is entirely contained in :math:`[-5, -2)`. For any
      :math:`c > 0`, since :math:`0 > -2`, the value :math:`c` lies strictly to the right of
      the entire support. The CDF at :math:`c` accumulates all probability mass in
      :math:`(-\infty, c]`, which includes the entire support :math:`[-5, -2)`.
      Therefore :math:`F_V(c) = 1` for all :math:`c > 0`. The statement is **TRUE**.

----

.. admonition:: Question 1.5  (2 pts)
   :class: note

   A student scored 85 on two different math exams. For Exam 1, the mean score is 75 with
   a standard deviation of 5, and for Exam 2, the mean score is 70 with a standard
   deviation of 10.

   **T or F:** The student performed better on Exam 1 compared to Exam 2.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: TRUE**

      Compute the z-score for each exam to compare relative performance:

      .. math::

         z_{\text{Exam 1}} = \frac{85 - 75}{5} = \frac{10}{5} = 2.00,

      .. math::

         z_{\text{Exam 2}} = \frac{85 - 70}{10} = \frac{15}{10} = 1.50.

      Since :math:`z_{\text{Exam 1}} = 2.00 > 1.50 = z_{\text{Exam 2}}`, the student scored
      2 standard deviations above the mean on Exam 1 but only 1.5 standard deviations above
      the mean on Exam 2. The student performed relatively better on Exam 1. The statement is
      **TRUE**.
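
      As a quick check, the two standardized scores can be computed with a small helper.
      The function name ``z_score`` is an illustrative choice:

      .. code-block:: python

         def z_score(x: float, mean: float, sd: float) -> float:
             """Number of standard deviations that x lies above the mean."""
             return (x - mean) / sd

         z_exam1 = z_score(85, mean=75, sd=5)
         z_exam2 = z_score(85, mean=70, sd=10)
         print(z_exam1, z_exam2, z_exam1 > z_exam2)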

----

.. admonition:: Question 1.6  (2 pts)
   :class: note

   For the figure below,

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/Exams/Exam1/FALL2024/image1.png
      :alt: Two overlapping normal distribution curves on the same axis spanning 0 to 100. The blue curve is tall and narrow, centered near x = 50 with a small standard deviation. The red curve is short and wide, also centered near x = 50 but with a much larger standard deviation. Both curves are symmetric and unimodal.
      :align: center
      :width: 55%

   **T or F:** the blue normal distribution has more area underneath its curve than the red
   normal distribution does.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: FALSE**

      Every normal distribution, regardless of its mean :math:`\mu` or standard deviation
      :math:`\sigma`, has **total area equal to 1** under its curve. This is a fundamental
      property of all probability density functions. The blue curve is taller and narrower
      (smaller :math:`\sigma`) while the red curve is shorter and wider (larger
      :math:`\sigma`), but both enclose exactly the same total area of 1. The statement is
      **FALSE**.


----

Problem 2: Multiple Choice  (15 points, 3 points each)
------------------------------------------------------------------

Indicate the correct answer by completely filling in the appropriate circle. If you indicate
your answer by any other way, you may be marked incorrect. For each question, there is only
one correct option letter choice.

----

.. admonition:: Question 2.1  (3 pts)
   :class: note

   The number of customers arriving at a UPS branch during working hours follows a Poisson
   distribution with an average rate of **4 customers per hour**. Let :math:`X` denote the
   number of customers arriving between **9:00 AM and 10:00 AM** and let :math:`Y` denote
   the number of customers arriving between **10:30 AM and 12:00 PM**.

   What is the **conditional probability** that exactly **3 customers** arrive between
   **10:30 AM and 12:00 PM**, given that **6 customers** arrived between **9:00 AM and
   10:00 AM**?

   - **(A)** :math:`P(Y = 3 \mid X = 6) = 0`
   - **(B)** :math:`P(Y = 3 \mid X = 6) = 0.0093`
   - **(C)** :math:`P(Y = 3 \mid X = 6) = 0.0892`
   - **(D)** :math:`P(Y = 3 \mid X = 6) = 0.1954`
   - **(E)** :math:`P(Y = 3 \mid X = 6) = 0.8564`

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (C)**

      The intervals 9:00–10:00 AM and 10:30 AM–12:00 PM do not overlap. For a Poisson
      process, arrivals in non-overlapping intervals are **independent**. Therefore:

      .. math::

         P(Y = 3 \mid X = 6) = P(Y = 3).

      The interval 10:30 AM–12:00 PM is **1.5 hours** long. With a rate of 4 customers
      per hour:

      .. math::

         Y \sim \text{Poisson}(\lambda = 4 \times 1.5) = \text{Poisson}(6).

      .. math::

         P(Y = 3) = \frac{e^{-6} \cdot 6^3}{3!} = \frac{e^{-6} \cdot 216}{6}
                  = 36\,e^{-6} \approx 36 \times 0.002479 \approx \boxed{0.0892}.

      The answer is **(C)**.
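
      A minimal numerical check of this value, using only the standard library. The
      helper ``poisson_pmf`` is a hypothetical name introduced for illustration:

      .. code-block:: python

         import math

         def poisson_pmf(k: int, mean: float) -> float:
             """P(N = k) for a Poisson random variable N with the given mean."""
             return math.exp(-mean) * mean**k / math.factorial(k)

         rate_per_hour = 4   # customers per hour (from the problem)
         hours = 1.5         # 10:30 AM to 12:00 PM
         p_y3 = poisson_pmf(3, rate_per_hour * hours)
         print(round(p_y3, 4))

      Because the two intervals do not overlap, the observed value :math:`X = 6` never
      enters the computation.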

----

.. admonition:: Question 2.2  (3 pts)
   :class: note

   The time between customer arrivals at the same UPS facility follows an exponential
   distribution with an average of 15 minutes between customer arrivals. Let :math:`T`
   denote the time between customer arrivals. If no customer has arrived in the last 20
   minutes, what is the probability that the next customer arrives after waiting more than
   15 additional minutes?

   - **(A)** :math:`P(T > 35 \mid T > 20) = 0`
   - **(B)** :math:`P(T > 35 \mid T > 20) = 0.097`
   - **(C)** :math:`P(T > 35 \mid T > 20) = 0.2636`
   - **(D)** :math:`P(T > 35 \mid T > 20) = 0.3679`
   - **(E)** :math:`P(T > 35 \mid T > 20) = 0.6321`

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (D)**

      The Exponential distribution has the **memoryless property**:

      .. math::

         P(T > s + t \mid T > s) = P(T > t) \quad \text{for all } s, t > 0.

      Applying this with :math:`s = 20` and :math:`t = 15`:

      .. math::

         P(T > 35 \mid T > 20) = P(T > 15).

      Since :math:`T \sim \text{Exponential}\!\left(\lambda = \dfrac{1}{15}\right)`:

      .. math::

         P(T > 15) = e^{-15/15} = e^{-1} \approx \boxed{0.3679}.

      The answer is **(D)**.
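
      The memoryless property can also be confirmed numerically by computing the
      conditional probability directly from the survival function
      :math:`P(T > t) = e^{-t/15}`. This is a sketch; the names ``MEAN_WAIT`` and
      ``survival`` are illustrative:

      .. code-block:: python

         import math

         MEAN_WAIT = 15.0  # average minutes between arrivals (from the problem)

         def survival(t: float) -> float:
             """P(T > t) for an Exponential distribution with mean MEAN_WAIT."""
             return math.exp(-t / MEAN_WAIT)

         conditional = survival(35) / survival(20)  # P(T > 35 | T > 20)
         unconditional = survival(15)               # P(T > 15)
         print(round(conditional, 4), round(unconditional, 4))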

----

.. admonition:: Question 2.3  (3 pts)
   :class: note

   Suppose :math:`X \sim \text{Binomial}(n = 10,\; p = 0.1)` and
   :math:`Y \sim \text{Binomial}(n = 10,\; p = 0.9)`.

   Which statement is **not always true** about :math:`X` and :math:`Y`?

   - **(A)** The mode of :math:`X` is less than the mode of :math:`Y`.
   - **(B)** :math:`\text{SD}(X) - |\sqrt{\text{Var}(Y)}| = 0`
   - **(C)** :math:`P(X = 1 \cap Y = 8) = 0.1943`
   - **(D)** :math:`E[X^2] = (10)(0.1)(0.9) + [(10)(0.1)]^2`
   - **(E)** :math:`P(X = 1) = P(Y = 9)`

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (C)**

      Evaluate each option:

      **(A) Always true.** For :math:`X \sim \text{Bin}(10, 0.1)`, the mode is :math:`\lfloor (n+1)p \rfloor = \lfloor 1.1 \rfloor = 1`. For :math:`Y \sim \text{Bin}(10, 0.9)`, the mode is :math:`\lfloor (n+1)p \rfloor = \lfloor 9.9 \rfloor = 9`. Both modes are unique because :math:`(n+1)p` is not an integer in either case, and :math:`1 < 9`.

      **(B) Always true.** :math:`\text{SD}(X) = \sqrt{np(1-p)} = \sqrt{10(0.1)(0.9)} = \sqrt{0.9}`. Similarly :math:`\sqrt{\text{Var}(Y)} = \sqrt{10(0.9)(0.1)} = \sqrt{0.9}`. Their difference is 0.

      **(C) Not always true.** The problem does not state that :math:`X` and :math:`Y` are independent, so the joint probability is not determined by the two marginal distributions. If :math:`X` and :math:`Y` are independent, :math:`P(X=1 \cap Y=8) = P(X=1) \cdot P(Y=8) \approx 0.3874 \times 0.1937 \approx 0.0750 \neq 0.1943`. In fact, no dependence structure gives 0.1943, because :math:`P(X=1 \cap Y=8) \leq P(Y=8) \approx 0.1937 < 0.1943`.

      **(D) Always true.** Using :math:`E[X^2] = \text{Var}(X) + (E[X])^2 = np(1-p) + (np)^2 = (10)(0.1)(0.9) + [(10)(0.1)]^2`.

      **(E) Always true.** By symmetry of the Binomial: :math:`P(X = k) = P(Y = n-k)`, so :math:`P(X=1) = P(Y=9)`.

      The answer is **(C)**.
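
      The quantities appearing in the five options can be evaluated numerically. The
      sketch below uses only the standard library; ``binom_pmf`` and the variable names
      are illustrative choices, and the independence case is labeled as an assumption:

      .. code-block:: python

         import math

         def binom_pmf(k: int, n: int, p: float) -> float:
             """P(K = k) for K ~ Binomial(n, p)."""
             return math.comb(n, k) * p**k * (1 - p) ** (n - k)

         n = 10
         pmf_x = [binom_pmf(k, n, 0.1) for k in range(n + 1)]
         pmf_y = [binom_pmf(k, n, 0.9) for k in range(n + 1)]

         # (A) modes: the value of k with the largest probability
         mode_x = max(range(n + 1), key=lambda k: pmf_x[k])
         mode_y = max(range(n + 1), key=lambda k: pmf_y[k])

         # (B) standard deviations
         sd_x = math.sqrt(n * 0.1 * 0.9)
         sd_y = math.sqrt(n * 0.9 * 0.1)

         # (C) joint probability IF X and Y were independent (an assumption),
         # and the largest value it could take under any dependence
         product_if_independent = pmf_x[1] * pmf_y[8]
         joint_upper_bound = min(pmf_x[1], pmf_y[8])

         # (D) second moment of X computed directly from the pmf
         second_moment_x = sum(k**2 * pmf_x[k] for k in range(n + 1))

         print(mode_x, mode_y)
         print(round(sd_x - sd_y, 12))
         print(round(product_if_independent, 4), round(joint_upper_bound, 4))
         print(round(second_moment_x, 4))
         print(round(pmf_x[1], 4), round(pmf_y[9], 4))  # (E)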

----

.. admonition:: Question 2.4  (3 pts)
   :class: note

   Suppose :math:`X` is a random variable with :math:`E[e^X] = 2` and
   :math:`\text{Var}(e^X) = 5`, and :math:`Y` is a random variable independent of
   :math:`X`, satisfying :math:`E(Y) = -10`, :math:`\text{Var}(Y) = 3`. What is
   :math:`E\!\left[(e^X - 3Y)^2\right]`?

   - **(A)** 1056
   - **(B)** 240
   - **(C)** 1024
   - **(D)** -752
   - **(E)** None of the above

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (A)**

      Expand the square:

      .. math::

         E\!\left[(e^X - 3Y)^2\right]
         = E[e^{2X}] - 6\,E[e^X Y] + 9\,E[Y^2].

      **Find each term:**

      :math:`E[e^{2X}]`: using :math:`\text{Var}(e^X) = E[e^{2X}] - (E[e^X])^2`:

      .. math::

         E[e^{2X}] = \text{Var}(e^X) + (E[e^X])^2 = 5 + 4 = 9.

      :math:`E[e^X Y]`: since :math:`X` and :math:`Y` are independent:

      .. math::

         E[e^X Y] = E[e^X] \cdot E[Y] = 2 \times (-10) = -20.

      :math:`E[Y^2]`: using :math:`\text{Var}(Y) = E[Y^2] - (E[Y])^2`:

      .. math::

         E[Y^2] = \text{Var}(Y) + (E[Y])^2 = 3 + 100 = 103.

      **Combine:**

      .. math::

         E\!\left[(e^X - 3Y)^2\right] = 9 - 6(-20) + 9(103) = 9 + 120 + 927 = \boxed{1056}.

      The answer is **(A)**.
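
      The arithmetic can be double-checked in Python, together with an alternative
      route through :math:`W = e^X - 3Y` using :math:`E[W^2] = \text{Var}(W) + (E[W])^2`.
      The variable names are illustrative:

      .. code-block:: python

         mean_ex, var_ex = 2, 5    # E[e^X] and Var(e^X) from the problem
         mean_y, var_y = -10, 3    # E[Y] and Var(Y) from the problem

         # Route 1: expand the square term by term
         second_moment_ex = var_ex + mean_ex**2  # E[e^{2X}]
         cross_moment = mean_ex * mean_y         # E[e^X Y], using independence
         second_moment_y = var_y + mean_y**2     # E[Y^2]
         expanded = second_moment_ex - 6 * cross_moment + 9 * second_moment_y

         # Route 2: W = e^X - 3Y, then E[W^2] = Var(W) + (E[W])^2
         mean_w = mean_ex - 3 * mean_y
         var_w = var_ex + 9 * var_y              # independence removes the covariance
         via_variance = var_w + mean_w**2

         print(expanded, via_variance)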

----

.. admonition:: Question 2.5  (3 pts)
   :class: note

   The figure below shows the shape of the distribution for two continuous random variables
   :math:`X` and :math:`Y`.

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/Exams/Exam1/FALL2024/image2.png
      :alt: Two side-by-side histograms. Left panel titled "Distribution of X": right-skewed histogram with the tallest bars on the left side and a long right tail; y-axis labeled Probability up to 1.2. Right panel titled "Distribution of Y": roughly symmetric, bell-shaped histogram; y-axis labeled Probability up to 0.08.
      :align: center
      :width: 80%

   Which of the following statements is TRUE about the random variable :math:`X`?

   - **(A)** The mean is a better measure of central tendency than median.
   - **(B)** The distance between :math:`Q_3` and the median is narrower than the distance
     between :math:`Q_1` and the median.
   - **(C)** IQR is a robust (resistant) measure of the spread.
   - **(D)** The distribution is negatively skewed with one peak.
   - **(E)** The mode will have the largest value among all the measures of central tendency.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (C)**

      The distribution of :math:`X` is **right-skewed** (positively skewed) with a long
      right tail. Evaluate each option:

      **(A) FALSE.** For a right-skewed distribution, the mean is pulled toward the long
      right tail and is not resistant to extreme values. The **median** is a better (more
      resistant) measure of central tendency than the mean for skewed data.

      **(B) FALSE.** For a right-skewed distribution, the bulk of the data is concentrated
      on the left, so the right half of the box (:math:`Q_3` to median) is typically *wider*
      than the left half (:math:`Q_1` to median). The distance from :math:`Q_3` to the
      median is **not** narrower.

      **(C) TRUE.** The IQR is based on the middle 50% of the data and is not affected by
      extreme values or outliers in the tails. It is a **robust (resistant) measure of
      spread**, regardless of the shape of the distribution.

      **(D) FALSE.** The histogram of :math:`X` shows a right tail (positively skewed), not
      negatively skewed.

      **(E) FALSE.** For a right-skewed distribution, the ordering of measures of central
      tendency is Mode < Median < Mean. The mode has the **smallest** value, not the largest.

      The answer is **(C)**.


----

Free Response Questions 3–5
------------------------------------------------------------------

Show all work, clearly label your answers, and use four decimal places.

Problem 3  (20 points)
------------------------------------------------------------------

.. admonition:: Problem 3 Setup
   :class: important

   The stated speed limit on I-65 is 65 mph. The speeds of vehicles along a certain stretch
   of I-65 follow an approximately normal distribution with a mean of 71 mph and a standard
   deviation of 8 mph.

   Let :math:`V` denote the speed of a random vehicle on I-65.

   .. math::

      V \sim \text{Normal}(\mu = 71,\; \sigma = 8).

----

.. admonition:: Question 3a  (2 pts)
   :class: note

   What is the probability that the speed of a vehicle on this stretch of I-65 is below
   :math:`\mu + 3\sigma`?

   .. dropdown:: Solution
      :class-container: sd-border-success

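      By the Empirical Rule, about 99.7% of a normal distribution lies within
      :math:`3\sigma` of the mean, leaving about 0.3% split equally between the two tails
      (0.15% each). Therefore:

      .. math::

         P(V < \mu + 3\sigma) \approx 0.997 + 0.0015 = \boxed{0.9985}.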
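
      For comparison, the exact normal probability can be computed with
      ``statistics.NormalDist`` from the Python standard library. This is only a check
      on the Empirical Rule value; the exact result is about 0.9987, which the rule
      approximates as 0.9985:

      .. code-block:: python

         from statistics import NormalDist

         speed = NormalDist(mu=71, sigma=8)    # V ~ Normal(71, 8) from the setup
         upper = speed.mean + 3 * speed.stdev  # mu + 3*sigma, in mph
         p_exact = speed.cdf(upper)            # P(V < mu + 3*sigma)
         print(upper, round(p_exact, 4))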

----

.. admonition:: Question 3b  (2 pts)
   :class: note

   Calculate the z-score for the stated speed limit of 65 mph.

   .. dropdown:: Solution
      :class-container: sd-border-success

      .. math::

         z = \frac{x - \mu}{\sigma} = \frac{65 - 71}{8} = \boxed{-0.75}.

----

.. admonition:: Question 3c  (8 pts)
   :class: note

   What is the probability that a vehicle's speed is between 61 mph and 71 mph on this
   stretch of I-65?

   .. dropdown:: Solution
      :class-container: sd-border-success

      .. math::

         P(61 < V < 71) = P\!\left(\frac{61 - 71}{8} < Z < \frac{71 - 71}{8}\right)
                        = P(-1.25 < Z < 0).

      Using the z-table and symmetry:

      .. math::

         P(-1.25 < Z < 0) = 0.5 - \Phi(-1.25) = 0.5 - 0.1056 = \boxed{0.3944}.
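
      The table lookup can be verified with ``statistics.NormalDist``. This is a sketch;
      the variable names are illustrative:

      .. code-block:: python

         from statistics import NormalDist

         speed = NormalDist(mu=71, sigma=8)  # V ~ Normal(71, 8) from the setup

         # P(61 < V < 71) as a difference of two CDF values
         p_between = speed.cdf(71) - speed.cdf(61)
         print(round(p_between, 4))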

----

.. admonition:: Question 3d  (8 pts)
   :class: note

   State patrol officers will issue radar tickets to vehicles whose speeds are in the top 4%
   of this distribution. What is the speed cutoff for issuing tickets?

   .. dropdown:: Solution
      :class-container: sd-border-success

      The top 4% corresponds to the **96th percentile**.

      From the z-table: :math:`\Phi(1.75) = 0.9599 \approx 0.96`, so :math:`z = 1.75`.

      Transform to the distribution of car speeds on I-65:

      .. math::

         v_{0.96} = \mu + z \times \sigma = 71 + 1.75 \times 8 = 71 + 14 = \boxed{85 \text{ mph}}.

      The cutoff for the top 4% of vehicle speeds on I-65 is **85 miles per hour**.
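
      The percentile can also be found without a z-table by inverting the CDF. The
      sketch below uses ``NormalDist.inv_cdf``; because it does not round :math:`z` to
      1.75, the result differs from 85 mph by a small fraction of a mile per hour:

      .. code-block:: python

         from statistics import NormalDist

         speed = NormalDist(mu=71, sigma=8)  # V ~ Normal(71, 8) from the setup

         cutoff = speed.inv_cdf(0.96)        # 96th percentile of vehicle speeds
         print(round(cutoff, 2))
         print(round(1 - speed.cdf(cutoff), 4))  # fraction of vehicles above the cutoff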


----

Problem 4  (26 points)
------------------------------------------------------------------

.. admonition:: Problem 4 Setup
   :class: important

   Kristin, a data science major, is working on a term project to build a predictive model
   that can classify images of handwritten digits (0–4).

   .. figure:: https://yjjpfnblgtrogqvcjaon.supabase.co/storage/v1/object/public/stat-350-assets/images/Exams/Exam1/FALL2024/image3.png
      :alt: Six example images of handwritten digits. Top row: label=0 showing a handwritten zero, label=4 showing a handwritten four, label=1 showing a handwritten one. Bottom row: label=1, label=3, label=1.
      :align: center
      :width: 65%

   She has a dataset containing 1600 images, each displaying a single digit. Kristin divided
   the dataset into a **training set** of **1000 images** and a **test set** of **600
   images**. The training set is used to teach the model, while the test set is used to
   evaluate its performance.

   After training, Kristin used the test set to create a **confusion matrix**, which shows
   the number of **correctly** and **incorrectly classified images**. In the matrix below,
   **rows** indicate the **actual labels (ground truth)**, and **columns represent** the
   **predicted labels** made by the **model**:

   .. flat-table:: Confusion Matrix
      :header-rows: 2
      :widths: 15 13 12 12 12 12 12 12

      * - :rspan:`1` True Label
        - :cspan:`6` **Predicted Label**
      * - **Digits**
        - **0**
        - **1**
        - **2**
        - **3**
        - **4**
        - **Total**
      * - **0**
        - 107
        - 0
        - 0
        - 1
        - 8
        - 116
      * - **1**
        - 0
        - 117
        - 1
        - 0
        - 4
        - 122
      * - **2**
        - 0
        - 4
        - 92
        - 11
        - 1
        - 108
      * - **3**
        - 3
        - 1
        - 15
        - 112
        - 1
        - 132
      * - **4**
        - 4
        - 0
        - 0
        - 4
        - 114
        - 122
      * - **Total**
        - 114
        - 122
        - 108
        - 128
        - 128
        - **600**

   **Reading the Table:** The cell with the value 117 (True Label '1', Predicted Label '1')
   indicates that the model correctly predicted the digit '1' for 117 images whose True
   Label is '1'. This number represents the model's accurate classifications for the digit
   '1' in the test set.

   All questions below refer to the data presented in the confusion matrix (table).

----

.. admonition:: Question 4a  (3 pts)
   :class: note

   Define the events:

   - :math:`E_1 = \{\text{true label is 4}\}`
   - :math:`E_2 = \{\text{true label is 1 or 2}\}`
   - :math:`E_3 = \{\text{predicted label is 0}\}`

   Which of the following statements is TRUE?

   - **(A)** Two events :math:`E_1` and :math:`E_3` are mutually exclusive.
   - **(B)** :math:`P(E_1 \cap E_3) = P(E_1)\,P(E_3)`.
   - **(C)** Two events :math:`E_1` and :math:`E_2` are disjoint.
   - **(D)** :math:`P(E_2 \cup E_3) > P(E_2) + P(E_3)`.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Answer: (C)**

      Check each statement using the confusion matrix:

      **(A) FALSE.** :math:`E_1 \cap E_3` = {true label is 4 AND predicted label is 0}.
      From the matrix, 4 images have true label 4 and were predicted as 0. So
      :math:`P(E_1 \cap E_3) = 4/600 \neq 0`. They are **not** mutually exclusive.

      **(B) FALSE.** :math:`P(E_1) = 122/600`, :math:`P(E_3) = 114/600`,
      :math:`P(E_1 \cap E_3) = 4/600`. Check:
      :math:`P(E_1)P(E_3) = (122/600)(114/600) = 13908/360000 \approx 0.0386`,
      but :math:`P(E_1 \cap E_3) = 4/600 \approx 0.0067`. Not equal.

      **(C) TRUE.** :math:`E_1` = {true label is 4} and :math:`E_2` = {true label is 1 or
      2}. An image cannot simultaneously have true label 4 and true label 1 or 2.
      Therefore :math:`E_1 \cap E_2 = \emptyset` and the events are **disjoint**.

      **(D) FALSE.** By the inclusion-exclusion principle:
      :math:`P(E_2 \cup E_3) = P(E_2) + P(E_3) - P(E_2 \cap E_3)`.
      Since :math:`P(E_2 \cap E_3) \geq 0`, we always have
      :math:`P(E_2 \cup E_3) \leq P(E_2) + P(E_3)`.
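
      The four checks can be reproduced from the confusion matrix. In this sketch the
      matrix is stored as a list of rows (true labels 0 to 4) with the Total row and
      column omitted; all variable names are illustrative:

      .. code-block:: python

         # Rows: true labels 0-4. Columns: predicted labels 0-4.
         confusion = [
             [107, 0, 0, 1, 8],
             [0, 117, 1, 0, 4],
             [0, 4, 92, 11, 1],
             [3, 1, 15, 112, 1],
             [4, 0, 0, 4, 114],
         ]
         total = sum(sum(row) for row in confusion)

         p_e1 = sum(confusion[4]) / total                          # true label is 4
         p_e2 = (sum(confusion[1]) + sum(confusion[2])) / total    # true label is 1 or 2
         p_e3 = sum(row[0] for row in confusion) / total           # predicted label is 0

         p_e1_and_e3 = confusion[4][0] / total
         p_e2_and_e3 = (confusion[1][0] + confusion[2][0]) / total
         p_e2_or_e3 = p_e2 + p_e3 - p_e2_and_e3                    # inclusion-exclusion

         print("(A) mutually exclusive:", p_e1_and_e3 == 0)
         print("(B) product vs joint:", round(p_e1 * p_e3, 4), round(p_e1_and_e3, 4))
         print("(D) union exceeds sum:", p_e2_or_e3 > p_e2 + p_e3)

      Option (C) needs no computation: each image has exactly one true label, so no row
      of the matrix can belong to both :math:`E_1` and :math:`E_2`.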

----

The following events are used in Questions 4b–4e. Kristin wants to know if the model
performs better than random guessing at classifying images of the digit three.

- :math:`T_3 = \{\text{true label is 3}\}`
- :math:`P_3 = \{\text{predicted label is 3}\}`

----

.. admonition:: Question 4b  (5 pts)
   :class: note

   What is the probability that a randomly selected image has the true label three?

   .. dropdown:: Solution
      :class-container: sd-border-success

      The row for true label 3 sums to 132 (its entry in the Total column), out of 600
      images in total:

      .. math::

         P(T_3) = \frac{132}{600} = \boxed{0.22}.

----

.. admonition:: Question 4c  (5 pts)
   :class: note

   What is the probability that a randomly selected image is predicted to be three?

   .. dropdown:: Solution
      :class-container: sd-border-success

      The column for predicted label 3 sums to 128 (its entry in the Total row), out of
      600 images in total:

      .. math::

         P(P_3) = \frac{128}{600} = \boxed{0.2133}.

----

.. admonition:: Question 4d  (8 pts)
   :class: note

   What is the probability that an image of digit three is correctly predicted to be three?

   .. dropdown:: Solution
      :class-container: sd-border-success

      This is the conditional probability :math:`P(P_3 \mid T_3)`. Of the 132 images with
      true label 3, the model correctly predicted 112 as 3:

      .. math::

         P(P_3 \mid T_3) = \frac{112}{132} = \boxed{0.8485}.

----

.. admonition:: Question 4e  (5 pts)
   :class: note

   Are the events :math:`T_3` and :math:`P_3` independent? State your answer and provide a
   mathematical justification.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **No, they are not independent**, as the conditional probability does not equal the
      unconditional probability:

      .. math::

         P(P_3 \mid T_3) = \frac{112}{132} = 0.8485 \neq 0.2133 = \frac{128}{600} = P(P_3).

      Since :math:`P(P_3 \mid T_3) \neq P(P_3)`, the events :math:`T_3` and :math:`P_3` are
      **not independent**.
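
      The probabilities from Questions 4b through 4e can be recomputed directly from the
      confusion matrix. This is a sketch with illustrative variable names; the matrix
      omits the Total row and column:

      .. code-block:: python

         # Rows: true labels 0-4. Columns: predicted labels 0-4.
         confusion = [
             [107, 0, 0, 1, 8],
             [0, 117, 1, 0, 4],
             [0, 4, 92, 11, 1],
             [3, 1, 15, 112, 1],
             [4, 0, 0, 4, 114],
         ]
         total = sum(sum(row) for row in confusion)

         true_3 = sum(confusion[3])                 # images whose true label is 3
         pred_3 = sum(row[3] for row in confusion)  # images predicted as 3
         both_3 = confusion[3][3]                   # true label 3 AND predicted 3

         p_t3 = true_3 / total                      # Question 4b
         p_p3 = pred_3 / total                      # Question 4c
         p_p3_given_t3 = both_3 / true_3            # Question 4d

         # Question 4e: independence would require these two values to match
         print(round(p_t3, 4), round(p_p3, 4), round(p_p3_given_t3, 4))
         print("independent:", abs(p_p3_given_t3 - p_p3) < 1e-12)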


----

Problem 5  (32 points)
------------------------------------------------------------------

.. admonition:: Problem 5 Setup
   :class: important

   Robust-ish Devices Inc. manufactures devices whose lifetimes are divided into three
   distinct phases: early failure, stable operation, and wear-out.

   - **Phase 1 (Early Failure):** During the first year (:math:`0 \leq x \leq 1`), the
     device has a constant likelihood of failing due to manufacturing defects, meaning the
     probability density function (pdf) for the device's **lifetime** is **constant** in
     this interval.

   - **Phase 2 (Stable Operation):** After surviving the early failure phase, the device
     operates reliably with virtually no chance of failure for the next 4 years
     (:math:`1 < x \leq 5`), meaning the pdf is zero during this phase, as the device is
     highly reliable.

   - **Phase 3 (Wear Out):** Beyond 5 years (:math:`x > 5`), the device enters a wear-out
     phase where the likelihood of failure increases over time. The **lifetime** is modeled
     by an exponentially decaying function, meaning the chance of the device surviving much
     longer decreases, and the risk of failure increases as the device ages.

   The probability density function for :math:`X` (the lifetime of the device) is given by
   the following piecewise function:

   .. math::

      f_X(x) = \begin{cases}
         1 - e^{-5/16}               & 0 \leq x \leq 1 \\[4pt]
         \dfrac{1}{16}\,e^{-x/16}    & x \geq 5 \\[4pt]
         0                           & \text{otherwise}
      \end{cases}

----

.. admonition:: Question 5a  (10 pts)
   :class: note

   Verify that :math:`f_X(x)` is a valid probability density function.

   .. dropdown:: Solution
      :class-container: sd-border-success

      **Axiom 1:** :math:`f_X(x) \geq 0` for all :math:`x`: the pdf is a positive constant
      (:math:`1 - e^{-5/16} \approx 0.2684`) over :math:`0 \leq x \leq 1`, a positive
      exponentially decaying function over :math:`x \geq 5`, and 0 everywhere else.

      **Axiom 2:**

      .. math::

         \int_{-\infty}^{\infty} f_X(x)\,dx
         = \int_0^1 \!\left(1 - e^{-5/16}\right)dx
           + \int_5^{\infty} \frac{1}{16}\,e^{-x/16}\,dx.

      .. math::

         = \left(1 - e^{-5/16}\right)
           - e^{-x/16}\Big|_5^{\infty}
         = \left(1 - e^{-5/16}\right) + e^{-5/16} = \boxed{1}. \checkmark
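
      The total area can also be confirmed numerically. The sketch below applies a
      simple midpoint rule and truncates the infinite tail at :math:`x = 805`, an
      arbitrary cutoff beyond which the remaining area is negligible. The function
      names are illustrative:

      .. code-block:: python

         import math

         def pdf(x: float) -> float:
             """Piecewise density of the device lifetime X (in years)."""
             if 0 <= x <= 1:
                 return 1 - math.exp(-5 / 16)
             if x >= 5:
                 return math.exp(-x / 16) / 16
             return 0.0

         def midpoint_integral(f, a: float, b: float, steps: int) -> float:
             """Approximate the integral of f over [a, b] with the midpoint rule."""
             width = (b - a) / steps
             return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

         early_area = midpoint_integral(pdf, 0, 1, 1_000)      # Phase 1
         tail_area = midpoint_integral(pdf, 5, 805, 200_000)   # Phase 3, truncated
         total_area = early_area + tail_area
         print(round(early_area, 4), round(tail_area, 4), round(total_area, 6))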

----

The cumulative distribution function is partially given below:

.. math::

   F_X(x) = \begin{cases}
      0                                       & x \leq 0 \\[4pt]
      \left(1 - e^{-5/16}\right) x            & 0 \leq x \leq 1 \\[4pt]
      1 - e^{-5/16}                           & 1 \leq x \leq 5 \\[4pt]
      [\text{Unknown}]                        & x \geq 5
   \end{cases}

----

.. admonition:: Question 5b  (10 pts)
   :class: note

   Determine the missing value of the cumulative distribution function (CDF) :math:`F_X(x)`,
   which is partially given above.

   .. dropdown:: Solution
      :class-container: sd-border-success

      For :math:`x \geq 5`, integrate from 5 to :math:`x` and add the accumulated area
      through the stable phase:

      .. math::

         F_X(x) = \left(1 - e^{-5/16}\right) + \int_5^x \frac{1}{16}\,e^{-t/16}\,dt.

      .. math::

         = \left(1 - e^{-5/16}\right) - e^{-t/16}\Big|_5^x.

      .. math::

         = \left(1 - e^{-5/16}\right) - e^{-x/16} + e^{-5/16} = \boxed{1 - e^{-x/16}}.
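
      One way to sanity-check the result is to code the full CDF and confirm that it is
      continuous at :math:`x = 5` and approaches 1 for large :math:`x`. This is a
      sketch; ``PLATEAU`` and ``cdf`` are illustrative names:

      .. code-block:: python

         import math

         PLATEAU = 1 - math.exp(-5 / 16)  # value of F_X on the stable phase [1, 5]

         def cdf(x: float) -> float:
             """Cumulative distribution function of the device lifetime X."""
             if x <= 0:
                 return 0.0
             if x <= 1:
                 return PLATEAU * x
             if x <= 5:
                 return PLATEAU
             return 1 - math.exp(-x / 16)

         print(round(cdf(5), 4), round(cdf(5.000001), 4))  # no jump at x = 5
         print(round(cdf(1000), 6))                        # approaches 1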

----

.. admonition:: Question 5c  (4 pts)
   :class: note

   Determine the probability that the device lasts longer than 1 year.

   .. dropdown:: Solution
      :class-container: sd-border-success

      .. math::

         P(X > 1) = 1 - F_X(1) = 1 - \left(1 - e^{-5/16}\right) = e^{-5/16} \approx \boxed{0.7316}.

----

.. admonition:: Question 5d  (8 pts)
   :class: note

   Find the 25th percentile for the lifetime of devices manufactured by Robust-ish Devices
   Inc.

   .. dropdown:: Solution
      :class-container: sd-border-success

      The 25th percentile :math:`x^*` satisfies :math:`F_X(x^*) = 0.25`.

      First determine which region contains :math:`x^*`. The CDF reaches
      :math:`F_X(1) = 1 - e^{-5/16} \approx 0.2684` at :math:`x = 1`, and remains at
      0.2684 until :math:`x = 5`. Since :math:`0.25 < 0.2684`, the 25th percentile falls
      in the region :math:`[0, 1)`.

      Solve :math:`F_X(x^*) = 0.25` for the region :math:`[0, 1)`:

      .. math::

         \left(1 - e^{-5/16}\right) x^* = 0.25.

      .. math::

         x^* = \frac{0.25}{1 - e^{-5/16}} \approx \frac{0.25}{0.26838} \approx \boxed{0.9315 \text{ years}}.

      The 25th percentile of lifetime is **0.9315 years**.
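
      The same steps generalize to any percentile by inverting the CDF piece by piece.
      The sketch below is one possible implementation; ``quantile`` is a hypothetical
      helper, and returning :math:`p / (1 - e^{-5/16})` when :math:`p` equals the
      plateau value is a convention chosen here (the smallest :math:`x` with
      :math:`F_X(x) = p`):

      .. code-block:: python

         import math

         PLATEAU = 1 - math.exp(-5 / 16)  # F_X(1), the CDF value on [1, 5]

         def quantile(p: float) -> float:
             """Smallest x with F_X(x) = p, for 0 < p < 1."""
             if not 0 < p < 1:
                 raise ValueError("p must be strictly between 0 and 1")
             if p <= PLATEAU:
                 # Linear piece on [0, 1]: solve PLATEAU * x = p
                 return p / PLATEAU
             # Exponential piece on x >= 5: solve 1 - exp(-x / 16) = p
             return -16 * math.log(1 - p)

         q25 = quantile(0.25)
         print(round(q25, 4))

      For probabilities above the plateau value, the function switches to the wear-out
      branch, so ``quantile(0.5)`` returns a lifetime greater than 5 years.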