Exam 1 — Fall 2025: Fully Worked Solutions
The questions below reproduce the Fall 2025 Exam 1 in full accessible text. Each problem is followed by a complete worked solution. Point values reflect the actual exam.
| Section | Format | Points |
|---|---|---|
| Problem 1: True/False | 6 questions × 2 pts | 12 |
| Problem 2: Multiple Choice | 5 questions × 3 pts | 15 |
| Problem 3: Free Response | 4 parts | 23 |
| Problem 4: Free Response | 5 parts | 27 |
| Problem 5: Free Response | 4 parts | 28 |
| Total | | 105 |
Problem 1: True/False (12 points, 2 points each)
Indicate the correct answer by completely filling in the appropriate circle. If you indicate your answer by any other way, you may be marked incorrect.
Question 1.1 (2 pts)
Let \(X\) and \(Y\) be two discrete random variables with supports \(x \in \{0, 1, 2\}\) and \(y \in \{1, 2, 3\}\).
T or F: If \(P(X=1,\; Y=1) = P(X=1)\cdot P(Y=1)\), then this implies that \(X\) and \(Y\) are independent random variables.
Solution
Answer: FALSE
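Independence requires that the product rule holds for every pair of values in the joint support:
\[P(X = x,\; Y = y) = P(X = x)\cdot P(Y = y) \quad \text{for all } x \in \{0, 1, 2\},\; y \in \{1, 2, 3\}.\]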
The condition given — \(P(X=1, Y=1) = P(X=1)\cdot P(Y=1)\) — verifies the product rule for only one of the \(3 \times 3 = 9\) required pairs. All nine must hold simultaneously for independence to be established. Satisfying just one tells us nothing about the remaining eight pairs.
The statement is FALSE.
Question 1.2 (2 pts)
For three events \(A\), \(B\), and \(C\) from the same sample space \(\Omega\), it is known that \(C \neq \emptyset\) and \(P(A \cap B) > 0\).
T or F: If \(P(A \cap B \cap C) = 0\), it must follow that \(P(A \cap C) = P(B \cap C) = 0\).
Solution
Answer: FALSE
Just because no outcome belongs to all three sets simultaneously does not mean that \(A\) and \(C\), or \(B\) and \(C\), cannot overlap pairwise.
Counterexample: Let \(\Omega = \{1, 2, 3, 4\}\) with each outcome equally likely (probability 1/4). Define:
Check the conditions:
But:
So \(P(A \cap B \cap C) = 0\) does not force \(P(A \cap C) = 0\). The statement is FALSE.
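As a quick check of the argument, the sketch below verifies one possible counterexample on \(\Omega = \{1, 2, 3, 4\}\). The particular sets `A`, `B`, and `C` are hypothetical choices made for illustration and are not necessarily the ones used in the original key.

```python
from fractions import Fraction

# Sample space with four equally likely outcomes (from the solution text).
omega = {1, 2, 3, 4}

# Hypothetical sets chosen for illustration; the original key may use others.
A = {1, 2}
B = {2, 3}
C = {1, 3}


def prob(event):
    """Probability of an event when every outcome in omega is equally likely."""
    return Fraction(len(event & omega), len(omega))


p_ab = prob(A & B)        # A and B share outcome 2
p_abc = prob(A & B & C)   # no outcome lies in all three sets
p_ac = prob(A & C)        # A and C share outcome 1
p_bc = prob(B & C)        # B and C share outcome 3

print(p_ab, p_abc, p_ac, p_bc)
```

With these sets, the triple intersection has probability zero while both pairwise intersections with \(C\) have positive probability.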
Question 1.3 (2 pts)
Let \(X\) be a continuous random variable with a PDF \(f_X(x)\) defined over the symmetric interval \([-c, c]\) for some constant \(c > 0\).
T or F: If the PDF is an even function (meaning \(f_X(-x) = f_X(x)\) for all \(x\)), then this implies that the 50th percentile (median) of the distribution must be 0.
Solution
Answer: TRUE
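If \(f_X\) is an even function on \([-c, c]\), the distribution is symmetric about zero. Evaluate the CDF at \(x = 0\):
\[F_X(0) = \int_{-c}^{0} f_X(x)\,dx.\]
By the even-function property \(f_X(-x) = f_X(x)\), the substitution \(u = -x\) gives:
\[\int_{-c}^{0} f_X(x)\,dx = \int_{0}^{c} f_X(u)\,du.\]
Since the total area equals 1:
\[\int_{-c}^{0} f_X(x)\,dx + \int_{0}^{c} f_X(x)\,dx = 2\,F_X(0) = 1 \quad\Longrightarrow\quad F_X(0) = 0.5.\]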
Because \(F_X(0) = 0.5\), the median is 0. The statement is TRUE.
Question 1.4 (2 pts)
A sensor reports a raw error measurement \(X\) in millivolts. Historical data show a bias in these measurements with \(E[X] = -2\) (millivolts) and \(\text{SD}(X) = 5\) (millivolts).
T or F: For the transformed score \(Y = 5X + 35\), it follows that \(E[Y] = \text{SD}(Y)\).
Solution
Answer: TRUE
Apply the linear transformation rules.
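Expected value:
\[E[Y] = E[5X + 35] = 5\,E[X] + 35 = 5(-2) + 35 = 25.\]
Standard deviation:
\[\text{SD}(Y) = |5| \cdot \text{SD}(X) = 5 \cdot 5 = 25.\]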
Since \(E[Y] = 25 = \text{SD}(Y)\), the statement is TRUE.
Question 1.5 (2 pts)
In Normal distribution word problems, we distinguish “forward” and “backward” problems.
T or F: “Backward” problems solve for an \(x\)-value given a probability, while “forward” problems solve for a probability given an \(x\)-value.
Solution
Answer: TRUE
This is the standard terminology used in Normal distribution calculations:
A forward problem starts with an \(x\)-value (or range of \(x\)-values) and asks for a probability. The workflow is: standardize to obtain a \(z\)-score, then look up the probability in the \(z\)-table.
A backward problem starts with a probability (a percentile) and asks for the corresponding \(x\)-value. The workflow is: find the \(z\)-score from the \(z\)-table, then reverse-transform: \(x = \mu + z\sigma\).
The statement is TRUE.
Question 1.6 (2 pts)
In a class of 60 students, the instructor randomly selects 20 homework submissions without replacement to audit for possible AI policy violations. Each audited paper is labeled probable violation or no obvious violation. Let \(X\) be the number of audited papers with probable violations.
T or F: Then \(X\) is a Binomial random variable because the process satisfies the BINS conditions.
Solution
Answer: FALSE
Recall the four BINS conditions required for a Binomial distribution:
B — Binary outcomes (probable violation / no obvious violation ✓)
I — Independent trials
N — Fixed number of trials (n = 20 ✓)
S — Same probability of success on every trial
Sampling without replacement from a finite population of 60 violates both the I and S conditions. After each paper is selected, the composition of the remaining pool changes, so the probability of drawing a violation changes from draw to draw, and the draws are not independent.
Since the I (independent trials) and S (same probability) conditions are both violated, \(X\) is not a Binomial random variable. The statement is FALSE.
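To make the dependence concrete, the sketch below compares the chance of a violation on the second audited paper given each possible result of the first draw. The class size 60 is from the problem. The number of violating papers `K = 12` is a hypothetical value chosen only for illustration, since the problem does not state it.

```python
from fractions import Fraction

N = 60   # class size (from the problem)
K = 12   # hypothetical number of papers with probable violations (assumed)

# Chance that the first audited paper shows a probable violation.
p_first = Fraction(K, N)

# Chance that the second audited paper shows a violation, given the first result.
p_second_given_violation = Fraction(K - 1, N - 1)
p_second_given_clean = Fraction(K, N - 1)

print(p_first, p_second_given_violation, p_second_given_clean)
```

The two conditional probabilities differ, so the outcome of one draw changes the chances on the next, which is exactly what independent trials rule out.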
Problem 2: Multiple Choice (15 points, 3 points each)
Indicate the correct answer by completely filling in the appropriate circle. If you indicate your answer by any other way, you may be marked incorrect. For each question, there is only one correct option letter choice unless specified.
Question 2.1 (3 pts)
A researcher randomly selected 100 graduate students and surveyed their daily expenses on eating out. The collected data are visualized in the boxplot below.
The researcher received an email from one of the participants, stating that they had misreported the amount, and the corrected amount should be 27.24 instead of 37.24. After this correction, which two of the graphical components of the boxplot remain unchanged?
(A) Third Quartile, \(Q_3\)
(B) Sample Mean, \(\bar{x}\)
(C) Upper Whisker
(D) The number of explicit points
(E) The maximum value
Solution
Answer: (A) and (C)
The key question is whether 27.24 remains an outlier after the correction.
From the boxplot, approximate values are \(Q_1 \approx 3\) and \(Q_3 \approx 15\), giving \(IQR \approx 12\). The upper inner fence is therefore
\[Q_3 + 1.5 \cdot IQR \approx 15 + 1.5(12) = 33.\]
Since \(37.24 > 33\), the original value was an outlier (explicit point). ✓
After correction: \(27.24 < 33\), so 27.24 is no longer an outlier — it falls within the upper fence and is absorbed into the regular data.
Now evaluate each component:
(A) Q₃: \(Q_3\) depends only on the ordered values around the 75th-percentile position. The original value 37.24 was an upper outlier, far above \(Q_3 \approx 15\), and the corrected value 27.24 is still above \(Q_3\). No observation crosses the quartile position, so \(Q_3\) is unchanged. ✓
(B) Sample Mean: The mean decreases by \((37.24 - 27.24)/100 = 0.10\). Changes. ✗
(C) Upper Whisker: The upper whisker endpoint is the largest observation that falls at or below the upper inner fence (the largest non-outlier). Originally 37.24 was the sole outlier, so the whisker endpoint was the next-largest value (\(\approx 31\)). After the correction, 27.24 is no longer an outlier, but \(27.24 < 31\), so the largest non-outlier is still \(\approx 31\). The upper whisker is unchanged. ✓
(D) Number of explicit points: Before: 1 outlier (37.24). After: 0 outliers (27.24 is now within the fence). Changes. ✗
(E) Maximum value: The overall maximum was 37.24. After the correction, 27.24 lies below the previous upper whisker endpoint (\(\approx 31\)), so that endpoint becomes the new maximum. Changes. ✗
The two unchanged components are (A) Q₃ and (C) Upper Whisker.
Question 2.2 (3 pts)
At a large clinic, women’s heights are well modeled by a Normal distribution with mean 165 cm and standard deviation 7 cm. Identify the false statement from A–E.
(A) The mean, median, and mode are all 165 cm.
(B) The 25th and 75th percentiles are equidistant from 165 cm.
(C) Converting heights from centimeters to inches does not change anyone’s z-score or percentile.
(D) About 68% of women have heights within 7 cm of 165 cm.
(E) Every woman’s height lies within 3 standard deviations of 165 cm.
Solution
Answer: (E)
Evaluate each statement:
(A) TRUE. For any symmetric, unimodal distribution — and the Normal distribution is both — the mean, median, and mode coincide at the center \(\mu = 165\) cm.
(B) TRUE. By symmetry of the Normal distribution about \(\mu = 165\) cm, \(Q_1 = 165 - z_{0.75} \times 7\) and \(Q_3 = 165 + z_{0.75} \times 7\). From the z-table, \(z_{0.75} \approx 0.67\), so both quartiles are approximately \(0.67 \times 7 = 4.69\) cm from 165 cm, and they are equidistant.
(C) TRUE. A linear unit conversion \(h_{\text{in}} = h_{\text{cm}} / 2.54\) scales the observation, the mean, and the standard deviation by the same factor, so the factor cancels in \(z = (h - \mu)/\sigma\) and the z-score (and hence the percentile) is unchanged.
(D) TRUE. This is the 68% part of the empirical rule: approximately 68% of observations from a Normal distribution fall within \(\pm 1\) standard deviation of the mean, i.e., within 7 cm of 165 cm.
(E) FALSE. The Normal distribution has infinite support — it extends from \(-\infty\) to \(+\infty\). While approximately 99.7% of women have heights within 3 standard deviations of 165 cm (i.e., between 144 cm and 186 cm), there is no finite bound beyond which the probability is exactly zero. It is theoretically possible (with very small probability) for an observation to fall more than 3 SDs from the mean. The statement is FALSE.
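The size of the tail beyond 3 standard deviations can be computed directly. The sketch below uses Python's standard-library `statistics.NormalDist` with the mean and standard deviation from the question; the variable names are illustrative.

```python
from statistics import NormalDist

# Height model from the question: Normal with mean 165 cm and SD 7 cm.
heights = NormalDist(mu=165, sigma=7)

lower = 165 - 3 * 7   # 3 SDs below the mean
upper = 165 + 3 * 7   # 3 SDs above the mean

# Probability of a height within 3 SDs, and the leftover tail probability.
p_within = heights.cdf(upper) - heights.cdf(lower)
p_outside = 1 - p_within

print(lower, upper, round(p_within, 4), round(p_outside, 4))
```

The tail probability is small but strictly positive, which is why statement (E) cannot hold for every woman.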
Question 2.3 (3 pts)
A clinical trial enrolls 10 women, each receiving one dose of a new drug. Based on a pre-measured genotype, 4 participants have a 30% chance of response (type \(G_1\)) and 6 have a 70% chance (type \(G_2\)). Responses are independent across participants, and each participant’s outcome is either response or no response. Let \(X\) be the number who respond. Identify the correct statement from A–E.
(A) The number of trials is not a fixed constant.
(B) The trials are dependent.
(C) There are more than two possible outcomes on each trial.
(D) The probability of success is not the same for all trials.
(E) The trials are conducted without replacement from a small finite population.
Solution
Answer: (D)
This question tests the BINS conditions for a Binomial distribution. Only one of the four conditions fails here, and the correct answer identifies which.
(A) FALSE. The number of trials is fixed: \(n = 10\) women, each tested exactly once.
(B) FALSE. The problem explicitly states that responses are independent across participants.
(C) FALSE. Each outcome is binary: response or no response. There are exactly two possible outcomes per trial.
(D) TRUE. The 4 type-\(G_1\) participants each have probability 0.30 of responding, while the 6 type-\(G_2\) participants each have probability 0.70. Since \(0.30 \neq 0.70\), the probability of success varies across trials, violating the S (Same probability) condition of BINS. \(X\) is therefore not a Binomial random variable.
(E) FALSE. The participants are distinct individuals, not drawn from a shared pool without replacement; each is tested exactly once.
Question 2.4 (3 pts)
A new type of biodegradable plastic is developed, and its degradation time \(X\) (in years) is modeled by the probability density function
What is the expected lifetime of this plastic?
(A) 1 year
(B) 1.5 years
(C) 2.25 years
(D) 3 years
(E) 3.25 years
(F) 9 years
Solution
Answer: (C)
Step 1: Find k.
The PDF must integrate to 1:
Step 2: Compute \(E[X]\).
The answer is (C).
Question 2.5 (3 pts)
Let \(X \sim \text{Binomial}(n=2,\; p)\) with unknown \(p \in (0,1)\). Define a second random variable \(Y\) conditionally on \(X\) as follows:
\(P(Y = 0 \mid X = 0) = 1\)
\(P(Y = 1 \mid X = 1) = P(Y = 2 \mid X = 1) = \dfrac{1}{2}\)
\(Y \mid X = 2 \;\sim\; \text{Poisson}(\lambda)\)
Which expression equals \(P(Y = 0)\)?
(A) \((1-p)^2 + p^2 \cdot e^{-\lambda}\)
(B) \((1-p)^2 \cdot e^{-\lambda} + p^2\)
(C) \((1-p)^2 + 2p(1-p) + p^2 \cdot e^{-\lambda}\)
(D) \((1-p) \cdot p \cdot e^{-\lambda}\)
Solution
Answer: (A)
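Apply the Law of Total Probability, conditioning on \(X\):
\[P(Y = 0) = \sum_{x=0}^{2} P(Y = 0 \mid X = x)\,P(X = x).\]
Marginal probabilities of \(X\) (Binomial with \(n = 2\)):
\[P(X = 0) = (1-p)^2, \qquad P(X = 1) = 2p(1-p), \qquad P(X = 2) = p^2.\]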
Conditional probabilities of Y=0:
\(P(Y=0 \mid X=0) = 1\) (given directly).
\(P(Y=0 \mid X=1) = 0\), since given \(X=1\), \(Y\) takes values 1 or 2 each with probability \(\tfrac{1}{2}\) — it cannot equal 0.
\(P(Y=0 \mid X=2)\): given \(X=2\), \(Y \sim \text{Poisson}(\lambda)\), so \(P(Y=0 \mid X=2) = e^{-\lambda}\).
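Combining:
\[P(Y = 0) = 1 \cdot (1-p)^2 + 0 \cdot 2p(1-p) + e^{-\lambda} \cdot p^2 = (1-p)^2 + p^2 e^{-\lambda}.\]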
The answer is (A).
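The total-probability calculation can be checked numerically. The sketch below builds \(P(Y = 0)\) term by term from the conditional probabilities and compares it with option (A); the function names and the test values of `p` and `lam` are arbitrary illustrative choices.

```python
import math


def p_y0_total(p, lam):
    """P(Y = 0) via the Law of Total Probability, conditioning on X."""
    # Binomial(n=2, p) marginal probabilities of X.
    p_x = {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}
    # P(Y = 0 | X = x) for each x, as specified in the question.
    p_y0_given_x = {0: 1.0, 1: 0.0, 2: math.exp(-lam)}
    return sum(p_y0_given_x[x] * p_x[x] for x in p_x)


def p_y0_option_a(p, lam):
    """Closed form from answer choice (A)."""
    return (1 - p) ** 2 + p ** 2 * math.exp(-lam)


p, lam = 0.3, 2.0   # arbitrary illustrative values
print(p_y0_total(p, lam), p_y0_option_a(p, lam))
```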
Free Response Questions 3–5
Show all work, clearly label your answers, and use four decimal places.
Problem 3 (23 points)
Problem 3 Setup
Assume men’s college basketball game lengths \(X\) (minutes) are Normally distributed with mean 118 and standard deviation 10, where these values already account for fouls, timeouts, media breaks, and overtime.
Question 3a (6 pts)
Find the probability that a randomly selected game lasts within 1.5 standard deviations of the mean.
Solution
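\(X \sim N(\mu = 118,\;\sigma = 10)\). Standardize directly:
\[P(\mu - 1.5\sigma \le X \le \mu + 1.5\sigma) = P(103 \le X \le 133) = P(-1.5 \le Z \le 1.5).\]
Using symmetry of the standard normal:
\[P(-1.5 \le Z \le 1.5) = 2\,\Phi(1.5) - 1 = 2(0.9332) - 1 = 0.8664.\]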
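The same probability can be obtained without a z-table. This is a minimal sketch using the standard-library `statistics.NormalDist`, with the mean and standard deviation from the problem setup; the variable names are illustrative.

```python
from statistics import NormalDist

# Game-length model from the setup: Normal with mean 118 and SD 10 (minutes).
game = NormalDist(mu=118, sigma=10)

width = 1.5 * 10   # 1.5 standard deviations, in minutes
p_within = game.cdf(118 + width) - game.cdf(118 - width)

print(round(p_within, 4))
```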
Question 3b (10 pts)
Compute the interquartile range (IQR) for the population of men’s college basketball game lengths.
Solution
The IQR requires finding \(Q_1\) (25th percentile) and \(Q_3\) (75th percentile).
Step 1: Find the standard normal percentiles from the z-table.
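For the standard normal, the 25th and 75th percentiles are symmetric about 0. From the z-table:
\[z_{0.25} \approx -0.67, \qquad z_{0.75} \approx 0.67.\]
Step 2: Transform to the basketball game length distribution.
\[Q_1 = \mu + z_{0.25}\,\sigma = 118 - 0.67(10) = 111.3, \qquad Q_3 = \mu + z_{0.75}\,\sigma = 118 + 0.67(10) = 124.7.\]
Step 3: Compute the IQR.
\[IQR = Q_3 - Q_1 = 124.7 - 111.3 = 13.4 \text{ minutes}.\]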
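For comparison, the quartiles can also be computed from the exact inverse CDF rather than the two-decimal table value. The sketch below assumes the table value \(z_{0.75} \approx 0.67\) for the table-based figure and uses `statistics.NormalDist.inv_cdf` for the exact one.

```python
from statistics import NormalDist

# Game-length model from the setup: Normal with mean 118 and SD 10 (minutes).
game = NormalDist(mu=118, sigma=10)

# Quartiles from the exact inverse CDF.
q1 = game.inv_cdf(0.25)
q3 = game.inv_cdf(0.75)
iqr = q3 - q1

# IQR using the rounded z-table value (assumed to be 0.67).
z_table = 0.67
iqr_table = 2 * z_table * 10

print(round(q1, 4), round(q3, 4), round(iqr, 4), round(iqr_table, 4))
```

The small gap between the two IQR values comes only from rounding the z-score to two decimals.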
Question 3c (4 pts)
A random sample of 10 game times is given below. Compute the population-level inner fences and determine if any of these points fall outside the 1.5 IQR rule.
Solution
The population-level inner fences use the population \(Q_1\) and \(Q_3\) from Question 3b, not the sample quartiles.
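Inner fences, using \(Q_1 = 111.3\), \(Q_3 = 124.7\), and \(IQR = 13.4\):
\[\text{Lower fence} = Q_1 - 1.5 \cdot IQR = 111.3 - 20.1 = 91.2, \qquad \text{Upper fence} = Q_3 + 1.5 \cdot IQR = 124.7 + 20.1 = 144.8.\]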
Check each observation:
| Observation | Value (min) | Status |
|---|---|---|
| 1 | 88 | \(88 < 91.2\) → outside lower fence ✗ |
| 2 | 89 | \(89 < 91.2\) → outside lower fence ✗ |
| 3–10 | 108–134 | All between 91.2 and 144.8 → within fences ✓ |
Two points fall below the lower inner fence: 88 and 89.
Question 3d (3 pts)
Consider three distribution models used in this course: \(\text{Normal}(\mu, \sigma^2)\), \(\text{Exponential}(\lambda)\), and \(\text{Uniform}(a, b)\). The interquartile range (\(IQR = Q_3 - Q_1\)) measures the spread of the middle 50% of the distribution. In the statements below, “does not depend on the mean” means the IQR cannot be determined from the mean (\(E[X]\)) alone; “constant multiple of the mean” means \(IQR = k \cdot E[X]\) for some constant \(k\) that does not vary. (If needed use scratch space on pg 2 of exam.)
Which statement about the \(IQR\) and the mean (\(E[X]\)) is incorrect?
(A) For an Exponential distribution, the IQR is a constant multiple of the mean.
(B) For a Normal distribution, the IQR does not depend on the mean.
(C) For a Uniform distribution on \([0, b]\), the IQR equals the mean.
(D) For a Uniform distribution on \([a, b]\), the IQR is determined by the mean alone.
Solution
Answer: (D)
Verify each statement:
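(A) TRUE. For \(\text{Exponential}(\lambda)\), the quartiles are obtained from the CDF \(F(t) = 1 - e^{-\lambda t}\):
\[Q_1 = \frac{\ln(4/3)}{\lambda}, \qquad Q_3 = \frac{\ln 4}{\lambda}, \qquad IQR = Q_3 - Q_1 = \frac{\ln 3}{\lambda}.\]
Since \(E[X] = 1/\lambda\):
\[IQR = (\ln 3)\cdot E[X] \approx 1.0986 \cdot E[X].\]
So the IQR is a constant multiple (\(\ln 3\)) of the mean. TRUE.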
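(B) TRUE. For \(\text{Normal}(\mu, \sigma^2)\):
\[IQR = (\mu + z_{0.75}\,\sigma) - (\mu - z_{0.75}\,\sigma) = 2\,z_{0.75}\,\sigma \approx 1.34\,\sigma.\]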
This depends only on \(\sigma\), not on \(\mu = E[X]\). Knowing the mean tells you nothing about the IQR. TRUE.
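(C) TRUE. For \(\text{Uniform}(0, b)\):
\[Q_1 = \frac{b}{4}, \qquad Q_3 = \frac{3b}{4}, \qquad IQR = \frac{b}{2}, \qquad E[X] = \frac{b}{2}.\]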
Therefore \(IQR = E[X]\). TRUE.
(D) FALSE (the incorrect statement). For a general \(\text{Uniform}(a, b)\):
\[E[X] = \frac{a + b}{2}, \qquad IQR = \frac{b - a}{2}.\]
These are two independent quantities. For example, \(\text{Uniform}(0, 4)\) and \(\text{Uniform}(1, 3)\) both have \(E[X] = 2\), but their IQRs are 2 and 1 respectively. Knowing \(E[X]\) alone does not determine the IQR for a general \(\text{Uniform}(a, b)\). The statement is FALSE.
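The four claims can be spot-checked numerically from the quantile functions of each model. This is a sketch under the standard parameterizations (rate \(\lambda\) for the Exponential, endpoints \(a, b\) for the Uniform); the helper names are illustrative.

```python
import math
from statistics import NormalDist


def iqr_exponential(lam):
    """IQR of Exponential(lam) from its quantile function Q(p) = -ln(1 - p) / lam."""
    def quantile(p):
        return -math.log(1 - p) / lam
    return quantile(0.75) - quantile(0.25)


def iqr_normal(mu, sigma):
    """IQR of Normal(mu, sigma^2) from the inverse CDF."""
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(0.75) - dist.inv_cdf(0.25)


def iqr_uniform(a, b):
    """IQR of Uniform(a, b) from its quantile function Q(p) = a + p * (b - a)."""
    return (a + 0.75 * (b - a)) - (a + 0.25 * (b - a))


# (A) IQR / mean for the Exponential is ln 3 for any rate.
print(iqr_exponential(2.0) * 2.0, math.log(3))
# (B) Shifting the Normal mean leaves the IQR unchanged.
print(iqr_normal(0, 3), iqr_normal(100, 3))
# (C) Uniform(0, b): IQR equals the mean b / 2.
print(iqr_uniform(0, 6), 6 / 2)
# (D) Same mean 2, different IQRs.
print(iqr_uniform(0, 4), iqr_uniform(1, 3))
```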
Problem 4 (27 points)
Problem 4 Setup
Heekyung is training her cat, Meredith, to give a high-five. From her extensive experience, Meredith’s response depends on whether a treat is offered. Heekyung offers a treat 70% of the time. Meredith’s behavior on an attempt is exactly one of the following: high-five, ignore, or nag.
If a treat is offered:
Meredith gives a high-five with probability 0.8.
Meredith ignores Heekyung with probability 0.2.
If a treat is not offered:
Meredith gives a high-five with probability 0.35.
Meredith ignores Heekyung with probability 0.6.
Otherwise, Meredith Nags Heekyung.
Heekyung will restart the training next month if the probability that no treat was offered, given that a high-five occurred, is greater than one-half.
Question 4a (3 pts)
Are the two events {Treat is offered} and {Give a high-five} independent? State yes or no and provide mathematical justification.
Solution
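No, the events are not independent. Let \(T\) denote {Treat is offered} and \(H\) denote {Give a high-five}. If the events were independent, the conditional probability of a high-five would be the same with or without a treat. Instead,
\[P(H \mid T) = 0.8 \neq 0.35 = P(H \mid T^c),\]
that is, \(0.8 \neq 0.35\), so the probability that Meredith gives a high-five depends on whether a treat is offered.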
Question 4b (2 pts)
For each branch A, B, and C in the tree diagram below, write the probability statement and find its probability.
Solution
Branch A:
Branch B:
Branch C:
Question 4c (4 pts)
Determine the probability that Meredith Nags Heekyung.
Solution
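Apply the Law of Total Probability. Let \(T\) denote {Treat is offered} and \(N\) denote {Nag}. When a treat is offered, the high-five and ignore probabilities already sum to \(0.8 + 0.2 = 1\), so \(P(N \mid T) = 0\). When no treat is offered, \(P(N \mid T^c) = 1 - 0.35 - 0.6 = 0.05\).
\[P(N) = P(N \mid T)\,P(T) + P(N \mid T^c)\,P(T^c) = (0)(0.7) + (0.05)(0.3) = 0.0150.\]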
Question 4d (8 pts)
Determine the probability that Meredith gives a high-five.
Solution
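Apply the Law of Total Probability, with \(T\) denoting {Treat is offered} and \(H\) denoting {Give a high-five}:
\[P(H) = P(H \mid T)\,P(T) + P(H \mid T^c)\,P(T^c) = (0.8)(0.7) + (0.35)(0.3) = 0.56 + 0.105 = 0.6650.\]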
Question 4e (10 pts)
Based on your results, compute the probability that no treat was offered, given that a high-five occurred. According to these calculations and Heekyung’s initial no retraining requirements stated at the start of this question, should Heekyung restart the training next month?
Solution
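Apply Bayes’ Rule, with \(T\) denoting {Treat is offered} and \(H\) denoting {Give a high-five}:
\[P(T^c \mid H) = \frac{P(H \mid T^c)\,P(T^c)}{P(H)}.\]
Numerator:
\[P(H \mid T^c)\,P(T^c) = (0.35)(0.3) = 0.105.\]
Denominator (from Question 4d):
\[P(H) = 0.665.\]
Result:
\[P(T^c \mid H) = \frac{0.105}{0.665} \approx 0.1579.\]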
Decision: Do not restart training, as the probability is less than 0.5.
Grading note from instructors
The solution key notes: “Did not grade this part as we should have said will not restart training if probability is greater than 0.5.” The condition as stated in the problem (restart if \(P > 0.5\)) is confirmed as the intended interpretation. Since \(0.1579 < 0.5\), Heekyung does not restart training.
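The calculations in Questions 4c to 4e can be reproduced with exact fractions. The sketch below takes the nag probability in each branch to be whatever is left after high-five and ignore, as the setup describes; the dictionary and variable names are illustrative.

```python
from fractions import Fraction as F

# Probabilities from the problem setup.
p_treat = F(7, 10)
p_no_treat = 1 - p_treat
high = {"treat": F(8, 10), "no_treat": F(35, 100)}
ignore = {"treat": F(2, 10), "no_treat": F(6, 10)}

# Nag is the leftover probability in each branch.
nag = {branch: 1 - high[branch] - ignore[branch] for branch in high}

# Law of Total Probability (Questions 4c and 4d).
p_nag = nag["treat"] * p_treat + nag["no_treat"] * p_no_treat
p_high = high["treat"] * p_treat + high["no_treat"] * p_no_treat

# Bayes' Rule (Question 4e): P(no treat | high-five).
posterior = high["no_treat"] * p_no_treat / p_high

print(float(p_nag), float(p_high), round(float(posterior), 4))
```

Working with `Fraction` avoids floating-point noise in the leftover probabilities, so the posterior comes out as exactly 3/19.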
Problem 5 (28 points)
Problem 5 Setup
Purdue analyzed how long IT tickets take to resolve. Routine requests are handled in a quick “triage window” and are about equally likely to finish at any time during the first 10 minutes. If a ticket survives past 10 minutes, the remaining time decays exponentially. Let \(T\) be the resolution time (minutes) with PDF:
\[f_T(t) = \begin{cases} k, & 0 \le t \le 10 \\ k\,e^{-(t-10)}, & t > 10 \\ 0, & \text{otherwise} \end{cases}\]
where \(k > 0\) is to be determined.
Question 5a (10 pts)
Determine \(k\) so that \(f_T(t)\) is a valid probability density function.
Solution
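We need \(k > 0\): a negative \(k\) would give \(f_T(t) < 0\), and \(k = 0\) would give zero total area.
Solve for \(k\) such that \(\displaystyle\int_{-\infty}^{\infty} f_T(t)\,dt = 1\):
\[\int_{0}^{10} k\,dt + \int_{10}^{\infty} k\,e^{-(t-10)}\,dt = 10k + k = 11k = 1 \quad\Longrightarrow\quad k = \frac{1}{11} \approx 0.0909.\]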
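The cumulative distribution function for the resolution time \(T\) is given below, defined up to the normalizing constant \(k\) determined in part (a):
\[F_T(t) = \begin{cases} 0, & t < 0 \\ k\,t, & 0 \le t \le 10 \\ 1 - k\,e^{-(t-10)}, & t > 10 \end{cases}\]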
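The normalizing constant can be sanity-checked by numerical integration. The sketch below assumes the density shape implied by the CDF branches used in this problem (constant on \([0, 10]\), then \(e^{-(t-10)}\) decay); the function name, the truncation point, and the step count are illustrative choices.

```python
import math


def unnormalized_area(upper=60.0, steps=60_000):
    """Midpoint-rule area under the density shape with k = 1.

    Assumed shape: 1 on [0, 10], exp(-(t - 10)) for t > 10.
    The tail beyond `upper` is negligible (about exp(-50)).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        height = 1.0 if t <= 10 else math.exp(-(t - 10))
        total += height * h
    return total


area = unnormalized_area()
k = 1 / area   # the constant that makes the total area equal 1

print(round(area, 4), round(k, 4))
```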
Question 5b (6 pts)
What is the probability that a ticket is finished between 5 and 15 minutes? For your convenience, the CDF of the resolution time \(T\) is given above, defined up to the normalizing constant \(k\) determined in part (a).
Solution
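Substituting \(k = 1/11\) into the given CDF:
\[F_T(t) = \begin{cases} 0, & t < 0 \\ \dfrac{t}{11}, & 0 \le t \le 10 \\ 1 - \dfrac{1}{11}\,e^{-(t-10)}, & t > 10 \end{cases}\]
\(F_T(5)\): Since \(5 \in [0, 10]\), use \(F_T(t) = k \cdot t\):
\[F_T(5) = \frac{5}{11} \approx 0.4545.\]
\(F_T(15)\): Since \(15 > 10\), use \(F_T(t) = 1 - k\,e^{-(t-10)}\):
\[F_T(15) = 1 - \frac{1}{11}\,e^{-5} \approx 0.9994.\]
Result:
\[P(5 \le T \le 15) = F_T(15) - F_T(5) = \frac{6 - e^{-5}}{11} \approx 0.5448.\]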
Question 5c (6 pts)
Given that a ticket has been in the system for at least 5 minutes, determine the probability that the total time the ticket remains unresolved is at least 15 minutes.
Solution
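We want \(P(T \geq 15 \mid T \geq 5)\). Apply the definition of conditional probability, noting that \(\{T \geq 15\}\) is contained in \(\{T \geq 5\}\):
\[P(T \geq 15 \mid T \geq 5) = \frac{P(T \geq 15 \,\cap\, T \geq 5)}{P(T \geq 5)} = \frac{P(T \geq 15)}{P(T \geq 5)}.\]
\(P(T \geq 5)\):
\[P(T \geq 5) = 1 - F_T(5) = 1 - \frac{5}{11} = \frac{6}{11} \approx 0.5455.\]
\(P(T \geq 15)\):
\[P(T \geq 15) = 1 - F_T(15) = \frac{1}{11}\,e^{-5} \approx 0.0006.\]
Result:
\[P(T \geq 15 \mid T \geq 5) = \frac{e^{-5}/11}{6/11} = \frac{e^{-5}}{6} \approx 0.0011.\]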
Note
This is not the memoryless property (which would apply only in the purely exponential region). Here \(T \geq 5\) includes the uniform region, so the conditional probability must be computed directly via the CDF.
Question 5d (6 pts)
Only 5% of tickets take longer than \(t^*\) to solve. Determine \(t^*\).
Solution
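Solve for \(t^*\) such that \(P(T > t^*) = 0.05\). \(t^*\) must fall into region \(t > 10\) because \(F_T(10) = \dfrac{10}{11} \approx 0.9091\), which is less than 0.95. Using the branch \(F_T(t) = 1 - k\,e^{-(t-10)}\) with \(k = 1/11\):
\[1 - \frac{1}{11}\,e^{-(t^* - 10)} = 0.95 \quad\Longrightarrow\quad e^{-(t^* - 10)} = 0.55 \quad\Longrightarrow\quad t^* = 10 - \ln(0.55) \approx 10.5978 \text{ minutes}.\]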
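The answers to Questions 5b through 5d can be reproduced from the CDF alone. This is a minimal sketch assuming the two CDF branches quoted in the solution, \(F_T(t) = k\,t\) on \([0, 10]\) and \(F_T(t) = 1 - k\,e^{-(t-10)}\) for \(t > 10\), with \(k = 1/11\); the function and variable names are illustrative.

```python
import math

K = 1 / 11   # normalizing constant from part (a)


def cdf(t):
    """CDF of the resolution time T, using the branches quoted in the solution."""
    if t < 0:
        return 0.0
    if t <= 10:
        return K * t
    return 1 - K * math.exp(-(t - 10))


# 5b: P(5 <= T <= 15)
p_between = cdf(15) - cdf(5)

# 5c: P(T >= 15 | T >= 5)
p_cond = (1 - cdf(15)) / (1 - cdf(5))

# 5d: solve 1 - K * exp(-(t - 10)) = 0.95 for t in the region t > 10
t_star = 10 - math.log(0.05 / K)

print(round(p_between, 4), round(p_cond, 4), round(t_star, 4))
```

Evaluating `1 - cdf(t_star)` returns 0.05 up to floating-point error, which confirms that the 95th percentile lies just past the 10-minute triage window.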