STAT 350 — Exam 2 — Spring 2026

Exam Information

Course: STAT 350 — Introduction to Statistics
Semester: Spring 2026
Total Points: 105
Time Allowed: 60 minutes

Problem	Total Possible	Topic
Problem 1 (True/False, 2 pts each)	12	Experimental Design, Power, Confidence Intervals, CI–HT Duality
Problem 2 (Multiple Choice, 3 pts each)	15	CLT, Confidence Bounds, Experimental Design, Sampling Distributions, Power
Problem 3	26	CLT, Sampling Distribution of $\bar{X}$, Conditional Probability
Problem 4	26	Confidence Bound, t- vs. z-procedures, One-Sample z-test
Problem 5	26	Two-Sample Paired t-test
Total	105

Problem 1 — True/False (12 points, 2 points each)

Question 1.1 (2 pts)

An automotive engineer is studying how tire compound (soft, medium, hard) and suspension stiffness (low, high) affect braking distance. Because vehicle weight is known to influence braking performance, the engineer groups 30 test vehicles into five blocks based on weight class (subcompact, compact, midsize, full-size, SUV). Within each weight class, the 6 vehicles are randomly assigned so that each of the six treatment combinations is applied to exactly one vehicle. The braking distance (in meters) from 100 km/h is recorded.

True or False: In this randomized block design, the blocks are defined by all combinations of tire compound, suspension stiffness, and vehicle weight class, resulting in 30 total blocks.

Question 1.2 (2 pts)

In the context of hypothesis testing and confidence intervals, let $C$ denote the confidence level of an interval, $\alpha$ denote the complementary significance level (so that $C + \alpha = 1$), and $\beta$ denote the probability of a Type II error.

True or False: Since $C + \alpha = 1$ and $\text{power} + \beta = 1$, it follows that $C = \text{power}$.

Question 1.3 (2 pts)

An aerospace engineer constructs a confidence interval for the mean drag coefficient $\mu$ of a new wing design from a random sample of fixed size $n$.

True or False: The confidence level $C$ is one of the factors that determines the width of the confidence interval.

Question 1.4 (2 pts)

A quality engineer is planning a hypothesis test to detect whether the mean diameter of manufactured bolts has shifted from the specification value $\mu_0 = 10.00$ mm. She is evaluating the statistical power of the test at a specific alternative value $\mu_a$ that belongs to the alternative hypothesis.

True or False: The statistical power of a hypothesis test may decrease when the sample size $n$ is increased.

Question 1.5 (2 pts)

A mechanical engineer tests the fatigue life of $n = 40$ aluminum alloy specimens and constructs a 95% confidence interval for the true mean fatigue life $\mu$ (in cycles). The computed interval is $(22.5,\; 22.7)$.

True or False: It is incorrect to say that $22.5 \leq \mu \leq 22.7$ with 0.95 probability, since the inequality does not involve any random variables.

Question 1.6 (2 pts)

A researcher wishes to test $H_0\!: \mu \leq \mu_0$ versus $H_a\!: \mu > \mu_0$ at significance level $\alpha = 0.01$. Before conducting the one-sided test, she constructs a 99% two-sided confidence interval for $\mu$ from the same data and observes that $\mu_0$ falls inside the interval.

True or False: The researcher can conclude that the one-sided test at $\alpha = 0.01$ will fail to reject $H_0$.

Problem 2 — Multiple Choice (15 points, 3 points each)

Question 2.1 (3 pts)

A random sample of size $n$ is drawn from a population with mean $\mu$ and finite standard deviation $\sigma$. The population distribution is heavily right-skewed. Which of the following statements is FALSE?

(A) Regardless of sample size, \(E[\bar{X}] = \mu\).
(B) For sufficiently large \(n\), the sampling distribution of \(\bar{X}\) is approximately normal.
(C) As the sample size \(n\) becomes sufficiently large, the observations \(X_1, X_2, \ldots, X_n\) become approximately normal.
(D) The standard deviation of \(\bar{X}\) decreases as \(n\) increases, regardless of whether \(\bar{X}\) is approximately normal.
(E) A larger sample size \(n\) is needed for the CLT to provide a good approximation here than would be needed if the population were only mildly skewed.

Question 2.2 (3 pts)

A Purdue scouting intern for the Indianapolis Colts is evaluating 40-yard dash times from $n = 9$ prospective NFL running backs at the 2026 combine. Historical records indicate that 40-yard dash times for running backs are normally distributed with a known population standard deviation of $\sigma = 0.0853$ seconds. The intern’s sample yields $\bar{x} = 4.4639$ seconds. To establish the slowest acceptable mean sprint time for recruitment, the intern constructs a one-sided upper confidence bound at $\alpha = 0.03$. Which R expression gives the correct critical value?

(A) qnorm(0.03, lower.tail = FALSE)
(B) qt(0.03, df = 8, lower.tail = FALSE)
(C) qnorm(0.03/2, lower.tail = FALSE)
(D) qt(0.03/2, df = 8, lower.tail = FALSE)
(E) None of the above

Question 2.3 (3 pts)

A biomedical engineer wants to compare three physical therapy protocols for post-surgical knee recovery. She recruits 60 patients and knows from prior studies that age strongly affects recovery outcomes. She divides patients into four age groups (18–30, 31–45, 46–60, 61+), and within each age group, randomly assigns an equal number of patients to each of the three protocols. Recovery is measured by range of motion (in degrees) at 8 weeks. Which of the following statements is FALSE?

(A) The experimental design is a Randomized Block Design, with age group as the blocking variable.
(B) There are three treatments and four blocks in this experiment.
(C) The purpose of blocking by age is to reduce the variability arising from age differences, making it easier to detect treatment effects.
(D) The blocking ensures that the randomization of patients to protocols is no longer necessary within each age group.
(E) If age had no effect on recovery, a Completely Randomized Design would have been equally effective.

Question 2.4 (3 pts)

The table below summarizes the properties of two independent populations A and B.

	Population A	Population B
Distribution family	Normal	Normal
Mean	$\mu_A = \mu_B = \mu$	$\mu_A = \mu_B = \mu$
Standard Deviation	$\sigma_A = 4.5$	$\sigma_B = 3.9$
Sample size	$n_A = 44$	$n_B = 52$

Which of the following statements about the sampling distributions of $\bar{X}_A$ and $\bar{X}_B$ is FALSE?

(A) \(P(\bar{X}_A = 4.5) = P(\bar{X}_B = 3.9)\)
(B) The pdfs of \(\bar{X}_A\) and \(\bar{X}_B\) have the same value when evaluated at \(\mu\). That is, \(f_{\bar{X}_A}(\mu) = f_{\bar{X}_B}(\mu)\).
(C) \(P(\bar{X}_A \leq \mu - 1) > P(\bar{X}_B \leq \mu - 1)\)
(D) \(P\!\left(\bar{X}_A > \mu + \frac{4.5}{\sqrt{44}}\right) = P\!\left(\bar{X}_B > \mu + \frac{3.9}{\sqrt{52}}\right)\)

Question 2.5 (3 pts)

An industrial engineer tests whether a process improvement has increased the mean production rate above the current standard of $\mu_0 = 200$ units/hour. The population standard deviation is known to be $\sigma = 18$ units/hour. She collects a sample of $n = 36$ observations and conducts an upper-tailed $z$-test at $\alpha = 0.05$. The engineer wants to know the probability that this test will correctly detect an increase if the true mean has shifted to $\mu_a = 206$ units/hour. Which of the R expressions listed below computes this probability?

The following R outputs are provided:

> qnorm(0.05, lower.tail = FALSE)
[1] 1.644854
> qnorm(0.025, lower.tail = FALSE)
[1] 1.959964

(A) pnorm(204.9346, mean = 206, sd = 3, lower.tail = FALSE)
(B) pnorm(204.9346, mean = 206, sd = 18, lower.tail = FALSE)
(C) pnorm(205.8799, mean = 206, sd = 3, lower.tail = FALSE)
(D) pnorm(204.9346, mean = 200, sd = 3, lower.tail = FALSE)
(E) pnorm(204.9346, mean = 206, sd = 3, lower.tail = TRUE)

Problem 3 (26 points) — AirBuds Battery Life

Problem 3 Setup

PineApple is assessing the battery life consistency of their next-generation AirBuds. It is known that the battery life of a new left AirBud follows a fairly symmetric distribution with a true mean operation time of $\mu_L = 540$ minutes and a standard deviation of $\sigma_L = 81$ minutes.

The company will randomly select a batch of $n_L = 53$ left AirBuds from their manufacturing lines and record their battery lifetimes while playing the same audio on repeat. Let $\bar{X}_L$ denote the random variable representing the average battery life of such a randomly selected batch.

Question 3a (5 pts)

State the approximate distribution of $\bar{X}_L$. Include the name of its distribution family and its parameters (mean and standard error) in both symbolic and numerical forms.

Question 3b (3 pts)

What important theorem justifies your approximation in Part (a)?

R Output for Parts (c) and (d)

The following R calculations are provided. Default behavior is lower.tail = TRUE.

> pnorm(-2.2469)
[1] 0.0123
> pnorm(-2.1424)
[1] 0.0161
> pnorm(-1.8363)
[1] 0.0332
> pnorm(-1.7976)
[1] 0.0361
> pnorm(0.6121)
[1] 0.7298
> pnorm(0.8988)
[1] 0.8157
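
As a reminder of the convention behind these outputs, here is a minimal R sketch. The cutoffs -1 and 2 are arbitrary illustrative inputs (they are not exam values), and the variable names are hypothetical. It shows that `pnorm` returns a lower-tail area by default, that `lower.tail = FALSE` returns the complementary upper-tail area, and that the area between two cutoffs is a difference of two lower-tail areas.

```r
# Illustrative cutoffs only; these are NOT values from the exam.
z_lo <- -1
z_hi <- 2

# Default: lower-tail area P(Z <= z) for a standard normal Z.
lower_area <- pnorm(z_hi)

# lower.tail = FALSE returns the complementary upper-tail area P(Z > z).
upper_area <- pnorm(z_hi, lower.tail = FALSE)

# Symmetry of the standard normal: P(Z <= -z) equals P(Z > z).
mirror_area <- pnorm(-z_hi)

# Area between two cutoffs: difference of two lower-tail areas.
between_area <- pnorm(z_hi) - pnorm(z_lo)

print(c(lower = lower_area, upper = upper_area,
        mirror = mirror_area, between = between_area))
```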

Question 3c (6 pts)

What is the probability that the sample mean lifetime falls between 520 and 550 minutes?

Question 3d (12 pts)

Suppose the testing software is programmed to automatically discard any batch of 53 AirBuds whose sample mean is lower than 515 minutes. A batch is therefore retained and recorded only if its sample mean is at least 515 minutes. Given that a batch is successfully recorded, what is the probability that its sample mean falls between 520 and 550 minutes?

Problem 4 (26 points) — eBay Broken Laser Pointer

Problem 4 Setup

In an effort to consolidate the conceptual understanding of interval estimation (confidence intervals and bounds) in STAT 350, an interactive trivia-based classroom activity was implemented during the Fall semester. Students were asked to estimate the sale price (in US dollars) of the first item ever sold on eBay, which was a broken laser pointer. A random sample of $n = 16$ student point estimates yielded a sample mean of $\bar{x} = 13.22$ and a sample standard deviation of $s = 4.3$. Typically, this type of guessing data tends to be mildly skewed due to psychological anchoring in human decision-making.

Question 4a (2 pts)

Does the information provided above justify the construction of standard confidence intervals or bounds? Give one statistical justification for your answer.

Question 4b Setup

Suppose we assume the conditions are met to conduct statistical inference. The actual selling price of the broken laser pointer was \$14.83. Suspecting that students tend to underestimate, the instructor would like to establish, with 90% confidence, a confidence upper bound for the true mean of all student guesses across STAT 350 during the fall semester.

Question 4b.i (3 pts)

Which R output below provides the correct critical value for this computation?

(A) qnorm(0.05, lower.tail = FALSE) → 1.644854
(B) qt(0.1, df = 15, lower.tail = FALSE) → 1.340606
(C) qnorm(0.1, lower.tail = FALSE) → 1.281552
(D) qt(0.05, df = 15, lower.tail = FALSE) → 1.75305

Question 4b.ii (5 pts)

Compute and interpret the 90% confidence upper bound in the context of the problem.

Question 4c (2 pts)

In general, $t$-intervals/bounds are [______] their corresponding $z$-intervals/bounds (assuming the same confidence level, standard error, and sample size). Select the correct option.

(A) narrower than
(B) wider than
(C) the same width as

Question 4d Setup

Recently, a digital marketing analyst claimed that the true mean price for vintage RadioShack laser pointers is \$225. The prices of these vintage pointers are known to be normally distributed with a population variance of $\sigma^2 = 64$. A Purdue engineering historian believes the true mean price is different. He collects a random sample of $n = 12$ prices from e-commerce platforms, yielding a sample mean of $\bar{x} = 210$ and a sample standard deviation of $s = 15$. Use a significance level $\alpha = 0.02$ for all inference calculations below.

Question 4d.i (4 pts)

Provide the first two steps of the four-step hypothesis testing procedure.

Question 4d.ii (3 pts)

Compute the appropriate test statistic.

Question 4d.iii (3 pts)

Which of the following R outputs is appropriate for computing the $p$-value for this test?

(A) 2*pt(abs(t_ts), df=11, lower.tail=FALSE) → 0.005294732
(B) 2*pnorm(abs(z_ts), lower.tail=FALSE) → 8.292839e-11
(C) pnorm(abs(z_ts), lower.tail=FALSE) → 4.14642e-11
(D) pt(abs(t_ts), df=11, lower.tail=FALSE) → 0.002647366

Question 4d.iv (4 pts)

State your decision and provide a conclusion in the context of the problem.

Problem 5 (26 points) — Soybean Yield: Soil Amendments

Problem 5 Setup

Researchers at a Purdue agricultural extension are evaluating two soil amendments, the standard treatment (S) and a new organic treatment (N), to determine whether the new treatment increases soybean yield (measured in bushels per acre). They selected 18 farm plots in Tippecanoe County. Each plot is split in half: one half receives Amendment S and the other half receives Amendment N. After the growing season, soybean yield is recorded for each half-plot.

The researchers calculate the difference for each plot: $D = N - S$. They have verified that the distribution of these differences is approximately normal.

	Amendment N	Amendment S	$D = N - S$
$n$	18	18	18
Sample Mean	54.8	51.2	3.6
Sample Std Dev	7.3	6.9	5.4
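
The $D = N - S$ column above summarizes one difference per plot. A minimal R sketch of how such a column could be built is given here, under loud assumptions: the five yield values per amendment are hypothetical toy numbers (not the study data), and all variable names are invented for illustration. It shows why the mean of $D$ matches the difference of the two sample means, while the standard deviation of $D$ has to be computed from the differences themselves.

```r
# Hypothetical yields (bushels per acre) for 5 plots; NOT the study data.
yield_n <- c(55.1, 49.8, 60.2, 52.4, 57.0)  # half-plots given Amendment N
yield_s <- c(52.3, 50.1, 55.9, 49.0, 54.2)  # matching half-plots given Amendment S

# One difference per plot, in the order D = N - S.
d <- yield_n - yield_s

# The mean of the differences equals the difference of the two means.
mean_d <- mean(d)
means_agree <- isTRUE(all.equal(mean_d, mean(yield_n) - mean(yield_s)))

# The standard deviation of the differences is computed from d itself;
# it cannot be recovered from sd(yield_n) and sd(yield_s) alone.
sd_d <- sd(d)

print(c(mean_d = mean_d, sd_d = sd_d))
print(means_agree)
```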

Question 5a (2 pts)

Which testing procedure is appropriate for this experiment?

(A) Two-sample independent $t$-test
(B) Two-sample paired $t$-test

Question 5b (6 pts)

Explain what specific characteristic(s) in the experimental design motivated your choice of testing procedure in part (a).

Question 5c (4 pts)

Provide the first two steps of the four-step hypothesis testing procedure. Use a significance level of $\alpha = 0.03$.

Question 5d (6 pts)

Calculate the test statistic for this experiment. Show your work.

Question 5e (3 pts)

Select the most appropriate R command to compute the $p$-value for this specific test.

(A) pt(test_statistic, df = 17, lower.tail = TRUE)
(B) 2*pt(abs(test_statistic), df = 17, lower.tail = FALSE)
(C) pt(test_statistic, df = 17, lower.tail = FALSE)
(D) pt(test_statistic, df = 34, lower.tail = FALSE)
(E) pt(test_statistic, df = 30.27, lower.tail = FALSE)
(F) pnorm(test_statistic, lower.tail = FALSE)

Question 5f (5 pts)

The $p$-value for the correct test was found to be 0.0058. Using a significance level of $\alpha = 0.03$, state your formal decision and write a conclusion in the context of the problem.