13.3. Model Diagnostics and Statistical Inference
Having developed the simple linear regression model and methods for fitting it using least squares, we now face a critical question: how do we know if our model is appropriate for the data? Before conducting any statistical inference—hypothesis tests, confidence intervals, or predictions—we must first verify that our model assumptions are reasonable. Violating these assumptions can lead to invalid conclusions and unreliable inference procedures.
This chapter combines three essential components of regression analysis: diagnostic procedures for checking model assumptions, the F-test for overall model utility, and inference procedures for individual model parameters. Together, these tools provide a complete framework for validating and drawing conclusions from simple linear regression models.
Road Map 🧭
Problem we will solve – How to verify that regression model assumptions are satisfied, test whether our model provides useful information about the relationship between variables, and conduct formal inference about model parameters with appropriate uncertainty quantification
Tools we’ll learn – Residual plots and diagnostic graphics for assumption checking, F-test for overall model utility, t-tests and confidence intervals for slope and intercept parameters, and the mathematical relationship between different inference approaches
How it fits – This completes our regression toolkit by ensuring model validity before inference, testing overall model usefulness, and providing methods for parameter-specific conclusions—preparing us for prediction and more advanced regression techniques
13.3.1. The Critical Importance of Assumption Checking
Before conducting any statistical inference procedures, we must verify that our model assumptions are reasonable. If we have strong violations of these assumptions, our statistical inference procedures will not be accurate—we won’t be able to trust the results and won’t be able to convey the information we want about the relationship between our variables.
Review of Simple Linear Regression Assumptions

Fig. 13.31 The four fundamental assumptions of simple linear regression that must be verified before conducting inference
Our simple linear regression model \(Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i\) requires four key assumptions:
Assumption 1: Independence and Identical Distribution (IID)
The observed pairs \((x_i, y_i)\) for \(i \in \{1, 2, \ldots, n\}\) are collected so that the responses behave like a simple random sample at each fixed \(x_i\). This means:
We plan in advance which explanatory variable values \(x_1, x_2, \ldots, x_n\) to collect
We then measure the response as output for each fixed \(x_i\)
Each response \(y_i\) is independent of the others
The responses constitute a simple random sample for their respective \(x\) values
This assumption is primarily ensured through proper experimental design and data collection procedures. It’s difficult to verify statistically after data collection, so we must rely on understanding how the data was gathered.
Assumption 2: Linearity
The association between the explanatory variable and the response is, on average, linear. The mean response follows the straight line \(E[Y|X] = \beta_0 + \beta_1 X\). If this assumption is violated, using a linear model to describe a non-linear relationship will lead to poor fits and misleading conclusions.
Assumption 3: Normality
The error terms (and hence the response values) are normally distributed:
\[\varepsilon_i \sim N(0, \sigma^2)\]
This leads to the conditional distribution:
\[Y_i \mid X_i = x_i \;\sim\; N\!\left(\beta_0 + \beta_1 x_i,\; \sigma^2\right)\]
Assumption 4: Homoscedasticity (Equal Variance)
The error terms have constant variance \(\sigma^2\) across all values of \(X\). The spread of \(Y\) values around the regression line remains the same regardless of the \(X\) value. Violations of this assumption are called heteroscedasticity.
13.3.2. Diagnostic Tools: Scatter Plots and Residual Plots

Fig. 13.32 Using scatter plots to check linearity and constant variance assumptions
Scatter Plots for Initial Assessment
Scatter plots serve as our primary tool for initial assumption checking:
Linearity: Points should roughly follow a straight line pattern
Constant variance: The spread of points around the apparent trend should remain consistent across the range of \(X\) values
Outliers: Identify observations that don’t fit the general pattern
However, scatter plots cannot help us assess the normality assumption—we need additional tools for that.
The Power of Residual Plots

Fig. 13.33 Construction of residual plots from scatter plots, showing how residuals transform the regression analysis
Residual plots provide a more sensitive diagnostic tool than scatter plots alone. A residual plot displays the residuals \(e_i = y_i - \hat{y}_i\) on the vertical axis against the explanatory variable \(x_i\) on the horizontal axis.
Construction Process:
Fit the regression line to obtain predicted values \(\hat{y}_i\)
Calculate residuals: \(e_i = y_i - \hat{y}_i\) for each observation
Plot residuals (vertical axis) against \(x_i\) values (horizontal axis)
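In R, this construction takes only a few lines. The sketch below is minimal and assumes numeric vectors `x` and `y` already hold the data:
```r
# Fit the line; predicted values are available via fitted()
fit <- lm(y ~ x)

# Residuals e_i = y_i - y_hat_i
e <- residuals(fit)

# Plot residuals against the explanatory variable, with a reference line at zero
plot(x, e, xlab = "x", ylab = "Residual")
abline(h = 0, lty = 2)
```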
What Residual Plots Reveal:
The residual plot effectively rotates the fitted line so that it lies flat along the horizontal zero line and magnifies deviations from the fitted relationship. This makes patterns easier to detect than in the original scatter plot.
Applied Example: Blood Pressure Study Revisited

Fig. 13.34 Diagnostic analysis of the blood pressure treatment study showing scatter plot with fitted line
Let’s apply our diagnostic procedures to the blood pressure treatment study, where we examined the relationship between patient age and change in blood pressure after 24 hours of treatment.
Scatter Plot Assessment:
Looking at the scatter plot with the fitted line \(\hat{y} = 20.11 - 0.526x\), we can scan across the plot to assess the spread of points around the line. The linear relationship appears reasonable, though with only 11 observations, it’s challenging to definitively assess the constant variance assumption.
Residual Plot Analysis:

Fig. 13.35 Residual plot for the blood pressure data showing each patient’s individual residual
The residual plot for the blood pressure data shows each patient’s residual plotted against age. When we examine the spread across the range of ages:
Lower end of the age range: Limited observations make assessment difficult
Middle of the age range: Several observations with varied residuals
Upper end of the age range: Adequate spread above and below zero
The residual plot suggests potential minor violations of the constant variance assumption, but nothing strong enough to invalidate our analysis given the small sample size.
13.3.3. Recognizing Assumption Violations
Understanding what various patterns in residual plots indicate is crucial for proper model assessment.
Constant Variance Violations (Heteroscedasticity)

Fig. 13.36 Common patterns indicating violations of the constant variance assumption
Cone Pattern: As \(X\) increases, the residual errors become larger. This indicates that \(\text{Var}(\varepsilon_i)\) is an increasing function of \(X\).
Hourglass Pattern: For extreme values of \(X\) (both large and small), the spread is larger than in the middle range. Variance depends on \(X\) in a non-constant way.
Reverse Cone Pattern: As \(X\) increases, the residual errors become smaller. Again, variance is a function of \(X\) rather than constant.
These patterns indicate strong violations of the equal variance assumption, requiring more advanced techniques like weighted regression (beyond this course’s scope).
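To build intuition for what a cone pattern looks like, the short simulation below (all values are arbitrary, purely for illustration) generates errors whose standard deviation grows with \(x\), so the resulting residual plot fans out from left to right:
```r
set.seed(42)
x <- runif(100, min = 0, max = 10)
y <- 1 + 2 * x + rnorm(100, sd = 0.5 * x)   # error SD increases with x
fit <- lm(y ~ x)

plot(x, residuals(fit), ylab = "Residual")  # cone-shaped spread
abline(h = 0, lty = 2)
```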
Linearity Violations

Fig. 13.37 Residual plot patterns indicating violations of the linearity assumption
What We Want to See: Random scatter of points around the horizontal line at zero, with no discernible pattern. This indicates both linearity and constant variance assumptions are satisfied.
Curved Patterns: If residuals show systematic curved patterns, this suggests the true population relationship is non-linear. For example, if the true model should be \(Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon\) but we fit only \(Y = \beta_0 + \beta_1 X + \varepsilon\), the quadratic component gets absorbed into the residuals, creating a curved pattern.
Key Insight: Patterns in residual plots indicate that our linear model is missing important systematic relationships that exist in the data.
Assessing Normality of Residuals

Fig. 13.38 Methods for assessing normality of residuals using histograms and QQ plots
For normality assessment, we use the same tools we’ve employed throughout the course, but applied to the residuals \(e_1, e_2, \ldots, e_n\):
Histogram of Residuals: Should approximate a normal distribution shape, centered around zero with the characteristic bell curve.
QQ Plot of Residuals: Should show points following approximately a straight diagonal line from lower left to upper right. Systematic deviations from this line suggest departures from normality.
The residuals should behave like a random sample from \(N(0, \sigma^2)\), so standard normality assessment techniques apply directly.
Comprehensive Diagnostic Examples

Fig. 13.39 Multiple examples showing different combinations of assumption violations and their appearance in diagnostic plots
Example 1: Good Model Fit
Scatter plot shows clear linear trend with consistent spread
Residual plot shows random scatter with no patterns
Histogram of residuals approximates normal distribution
QQ plot shows points close to diagonal line
Example 2: Non-linearity Problem
Scatter plot shows slight curvature
Residual plot reveals systematic curved pattern
Normality plots may look reasonable since the issue is functional form, not error distribution
The lesson: visual inspection of multiple diagnostic plots provides complementary information about different aspects of model adequacy.
13.3.4. The F-Test for Model Utility
Once we’ve verified that our model assumptions are reasonably satisfied, we can proceed with statistical inference. The first question we typically ask is: “Does our simple linear regression model provide useful information about the relationship between our explanatory and response variables?”
Understanding the ANOVA Decomposition

Fig. 13.40 Complete ANOVA table for simple linear regression showing all components and formulas
The F-test for model utility builds on the ANOVA decomposition we developed in Chapter 13.2, but now we understand it in the context of hypothesis testing about model usefulness.
The Fundamental Identity:
\[\text{SST} = \text{SSR} + \text{SSE}, \qquad \sum_{i=1}^n (y_i - \bar{y})^2 = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n (y_i - \hat{y}_i)^2\]
This decomposes the total variability in our response variable into two meaningful components:

Fig. 13.41 Visual explanation of Sum of Squares Total as baseline variability ignoring the explanatory variable
Sum of Squares Total (SST):
\[\text{SST} = \sum_{i=1}^n (y_i - \bar{y})^2\]
SST measures how much the response values deviate from their overall mean, completely ignoring any information from the explanatory variable. If there were no explanatory variable, \(\bar{y}\) would be our best estimate for modeling the response.

Fig. 13.42 Visual explanation of Sum of Squares Regression as improvement over the baseline mean model
Sum of Squares Regression (SSR):
\[\text{SSR} = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2\]
SSR measures how much the fitted values deviate from the overall mean response. This quantifies the improvement we get by using the linear relationship instead of simply averaging all response values.
Model Utility Interpretation: If our model is useful, we want \(\hat{y}_i\) values to be different from \(\bar{y}\). If \(\hat{y}_i \approx \bar{y}\) for all observations, our explanatory variable provides no additional information beyond the overall mean.
Connection to the Slope: Recall that \(\text{SSR} = b_1^2 \sum_{i=1}^n (x_i - \bar{x})^2\). If the slope \(b_1 \to 0\), then \(\text{SSR} \to 0\), indicating no linear relationship.

Fig. 13.43 Visual explanation of Sum of Squares Error as unexplained variation after fitting the model
Sum of Squares Error (SSE):
\[\text{SSE} = \sum_{i=1}^n (y_i - \hat{y}_i)^2 = \sum_{i=1}^n e_i^2\]
SSE measures how much the observed values deviate from the fitted line—the unexplained variability. This is exactly what we minimized when fitting the least squares regression line.
What We Want: We want SSE to be small relative to SST, meaning our model explains most of the variation. If SSE ≈ SST, our model provides little improvement over simply using \(\bar{y}\).
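All three sums of squares can be computed directly from a fitted model. A minimal sketch, assuming a response vector `y`, an explanatory vector `x`, and `fit <- lm(y ~ x)`:
```r
SST <- sum((y - mean(y))^2)             # total variation around the mean
SSR <- sum((fitted(fit) - mean(y))^2)   # variation explained by the fitted line
SSE <- sum(residuals(fit)^2)            # unexplained variation
all.equal(SST, SSR + SSE)               # the fundamental identity holds up to rounding
```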
Understanding Degrees of Freedom: Multiple Perspectives

Fig. 13.44 Explanation of degrees of freedom concept and calculation for regression ANOVA
The concept of degrees of freedom is fundamental to understanding statistical inference, and there are several complementary ways to think about why we get \(n-2\) degrees of freedom for error in simple linear regression. Understanding these different perspectives will help you develop intuition for more complex statistical procedures.
The Information Lens: “Paying the Price of Estimation”
Every observation \(y_i\) initially contributes one independent piece of information about the population. However, when we estimate parameters from the data, we “use up” some of this information.
Think of it this way: we start with \(n\) independent pieces of random variation from our \(n\) observations. To estimate our intercept \(b_0\) and slope \(b_1\), we must impose constraints on the data that “consume” two pieces of this randomness:
Estimating \(b_0\) requires that \(\sum_{i=1}^n e_i = 0\) (residuals sum to zero)
Estimating \(b_1\) requires that \(\sum_{i=1}^n x_i e_i = 0\) (residuals are uncorrelated with X)
After paying this “price” of estimation, only \(n - 2\) independent pieces of information remain in the residuals.
Connection to Familiar Concepts: This matches the intuition you developed with one-sample t-tests, where estimating the sample mean \(\bar{x}\) used up 1 degree of freedom, leaving \(n-1\) degrees of freedom for the sample variance. Here, estimating two parameters uses up 2 degrees of freedom.
The Constraint Lens: “Equations the Data Must Satisfy”
When we fit \(Y_i = \hat{y}_i + e_i\) using least squares, we’re solving an optimization problem. The solution must satisfy exactly two linear constraints:
\[\sum_{i=1}^n e_i = 0 \qquad \text{and} \qquad \sum_{i=1}^n x_i e_i = 0\]
These aren’t just mathematical curiosities—they’re fundamental requirements that our residuals must satisfy. With \(n\) residual values but 2 constraints, only \(n-2\) residuals can vary independently. The remaining 2 are completely determined by these constraints.
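Both constraints are easy to verify numerically for any least squares fit; a quick sketch, again assuming `fit <- lm(y ~ x)`:
```r
e <- residuals(fit)
sum(e)       # constraint 1: essentially zero (up to floating point error)
sum(x * e)   # constraint 2: essentially zero as well
```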
General Pattern: In multiple regression with \(p\) explanatory variables plus an intercept, we estimate \(p+1\) parameters, creating \(p+1\) constraints on the residuals. This leaves \(n-(p+1)\) degrees of freedom for error.
The Geometric Lens: “Dimensions of Solution Spaces”
From a linear algebra perspective, we can think about the “space” of possible residual vectors. Our \(n\) observations define an \(n\)-dimensional space. When we fit the regression model, we project the data onto a 2-dimensional subspace (spanned by the intercept and slope).
The residuals live in the remaining \(n-2\) dimensional space—the part of the full \(n\)-dimensional space that’s orthogonal (perpendicular) to our fitted model. This geometric perspective shows that the dimension of the “leftover” space after fitting is exactly our error degrees of freedom.
The Distribution Lens: “Where the Chi-Square Comes From”
The degrees of freedom directly determine the shape of our sampling distributions. Under our normality assumptions:
\(\frac{\text{SSE}}{\sigma^2} \sim \chi^2_{n-2}\) (chi-square with \(n-2\) degrees of freedom)
This feeds into our t-statistics: \(t = \frac{b_1 - \beta_1}{SE(b_1)} \sim t_{n-2}\)
And our F-statistic: \(F = \frac{\text{MSR}}{\text{MSE}} \sim F_{1,n-2}\)
The \(n-2\) parameter isn’t arbitrary—it’s precisely the dimension of the space where our residuals can vary independently.
For Simple Linear Regression, Specifically:
df_regression = 1: We’re fitting one slope parameter (once the slope is chosen, the least squares intercept is pinned down because the fitted line must pass through \((\bar{x}, \bar{y})\))
df_error = n - 2: After estimating intercept and slope, \(n-2\) residuals remain free to vary
df_total = n - 1: Total variation around the overall mean \(\bar{y}\) has the familiar \(n-1\) degrees of freedom
The degrees of freedom always sum: \((n-1) = 1 + (n-2)\).
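In R, these degrees of freedom can be read off a fitted model directly; a brief sketch, assuming `fit <- lm(y ~ x)` with \(n\) observations:
```r
df.residual(fit)     # error degrees of freedom: n - 2
anova(fit)[["Df"]]   # c(1, n - 2): regression and error df, which sum to n - 1
```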
Why Multiple Perspectives Matter
Different situations call for different ways of thinking about degrees of freedom:
The information lens helps with intuitive understanding and connects to familiar concepts
The constraint lens is useful when working with model equations and understanding why certain relationships hold
The geometric lens becomes powerful in multiple regression and advanced modeling
The distribution lens is essential for understanding test statistics and p-values
As you encounter more complex statistical procedures—ANOVA with multiple factors, multiple regression, mixed effects models—you’ll find that some of these perspectives provide clearer insight than others. Having multiple ways to think about the same concept makes you a more flexible and intuitive statistical thinker.
Conducting the F-Test: Step-by-Step Process

Fig. 13.45 Complete four-step process for conducting the F-test for model utility
Step 1: Parameter of Interest (Can be skipped)
For the model utility test, we’re not focusing on a specific parameter but rather on the overall usefulness of the linear relationship. We can skip the explicit parameter statement.
Step 2: Hypotheses
\(H_0\): There is no linear association between \(X\) and \(Y\) (the model is not useful; equivalently, \(\beta_1 = 0\))
\(H_a\): There is a linear association between \(X\) and \(Y\) (the model is useful; equivalently, \(\beta_1 \neq 0\))
Important: Always state these hypotheses in the context of your specific problem, replacing “X” and “Y” with the actual variable names and context.
Step 3: Test Statistic and P-value
\[F = \frac{\text{MSR}}{\text{MSE}} = \frac{\text{SSR}/1}{\text{SSE}/(n-2)}\]
Degrees of freedom: \(df_1 = 1\), \(df_2 = n-2\)
P-value calculation in R:
```r
p_value <- pf(F_statistic, df1 = 1, df2 = n - 2, lower.tail = FALSE)
```
Why Always Upper Tail? The F-distribution is right-skewed and bounded below by zero. Large F-values provide evidence against the null hypothesis (that the model is not useful), so we always calculate \(P(F > F_{\text{observed}})\).
Step 4: Decision and Conclusion
Compare the p-value to the significance level \(\alpha\):
If p-value ≤ \(\alpha\): Reject \(H_0\). We have evidence of a linear association.
If p-value > \(\alpha\): Fail to reject \(H_0\). We do not have sufficient evidence of a linear association.
Conclusion Template:
If rejecting \(H_0\): “At the \(\alpha\) significance level, we have sufficient evidence to conclude that there is a linear association between [explanatory variable] and [response variable] in [context].”
If failing to reject \(H_0\): “At the \(\alpha\) significance level, we do not have sufficient evidence to conclude that there is a linear association between [explanatory variable] and [response variable] in [context].”
13.3.5. Inference for Individual Parameters
While the F-test tells us whether our model provides useful information overall, we often want to make specific inferences about the slope and intercept parameters. This allows us to quantify the nature of the relationship and test specific hypotheses about the parameters.
Rewriting Estimates as Linear Combinations

Fig. 13.46 Rewriting slope and intercept estimates as linear combinations of the response values
To develop inference procedures for \(\beta_0\) and \(\beta_1\), we need to understand the statistical properties of our estimates \(b_0\) and \(b_1\). The key insight is rewriting these estimates as linear combinations of the response values \(Y_i\), since the responses are the only random components in our model.
Intercept Rewrite:
Starting from \(b_0 = \bar{y} - b_1\bar{x}\) and substituting the expression for \(b_1\), we can show:
\[b_0 = \sum_{i=1}^n \left[\frac{1}{n} - \frac{\bar{x}(x_i - \bar{x})}{S_{xx}}\right] Y_i\]
Slope Rewrite:
Starting from the least squares formula and using algebraic manipulation:
\[b_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(Y_i - \bar{Y})}{S_{xx}} = \sum_{i=1}^n \frac{x_i - \bar{x}}{S_{xx}} \, Y_i\]
Why This Matters: Both estimates are linear combinations (weighted averages) of the normally distributed response values \(Y_i\). Since linear combinations of independent normal random variables are also normally distributed, both \(b_0\) and \(b_1\) follow normal distributions under our model assumptions.
Expected Values and Variances
Through careful application of expectation and variance properties:
For the Intercept:
\(E[b_0] = \beta_0\) (unbiased estimate)
\(\text{Var}(b_0) = \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)\)
For the Slope:
\(E[b_1] = \beta_1\) (unbiased estimate)
\(\text{Var}(b_1) = \frac{\sigma^2}{S_{xx}}\)
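These properties can be illustrated by simulation. The sketch below uses arbitrary illustrative parameter values, repeatedly generates responses from a known model, and compares the empirical mean and variance of the slope estimates with \(\beta_1\) and \(\sigma^2 / S_{xx}\):
```r
set.seed(1)
x     <- 1:20                           # fixed design points
beta0 <- 2; beta1 <- 0.5; sigma <- 3    # "true" values, chosen for illustration
Sxx   <- sum((x - mean(x))^2)

b1_sim <- replicate(10000, {
  y <- beta0 + beta1 * x + rnorm(length(x), sd = sigma)
  coef(lm(y ~ x))[2]                    # slope estimate b1
})

mean(b1_sim)     # close to beta1 = 0.5 (unbiasedness)
var(b1_sim)      # close to the theoretical variance below
sigma^2 / Sxx    # Var(b1) = sigma^2 / Sxx
```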
Distributions Under Normality:
\[b_1 \sim N\!\left(\beta_1,\; \frac{\sigma^2}{S_{xx}}\right), \qquad b_0 \sim N\!\left(\beta_0,\; \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)\right)\]
Standard Errors and t-Distribution
Since \(\sigma^2\) is unknown, we estimate it using the mean squared error:
\[s^2 = \text{MSE} = \frac{\text{SSE}}{n-2}\]
Standard Errors:
\[SE(b_1) = \sqrt{\frac{s^2}{S_{xx}}}, \qquad SE(b_0) = \sqrt{s^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)}\]
t-Distribution Result: When we replace \(\sigma^2\) with \(s^2\), the normal distribution becomes a t-distribution with \(n-2\) degrees of freedom:
\[\frac{b_1 - \beta_1}{SE(b_1)} \sim t_{n-2}, \qquad \frac{b_0 - \beta_0}{SE(b_0)} \sim t_{n-2}\]
Confidence Intervals for Parameters
General Form:
\[\text{estimate} \;\pm\; t_{\alpha/2,\, n-2} \times SE(\text{estimate})\]
For the Slope:
\[b_1 \;\pm\; t_{\alpha/2,\, n-2} \times SE(b_1)\]
For the Intercept:
\[b_0 \;\pm\; t_{\alpha/2,\, n-2} \times SE(b_0)\]
Interpretation: We are \((1-\alpha) \times 100\%\) confident that the true parameter value lies within the calculated interval.
Critical Value in R:
```r
t_critical <- qt(alpha/2, df = n - 2, lower.tail = FALSE)
```
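For a fitted model object, R can also produce these intervals directly with `confint()`; a brief sketch, assuming `fit <- lm(y ~ x)`:
```r
confint(fit, level = 0.95)   # rows: (Intercept) and the slope; columns: 2.5 % and 97.5 %
```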
Hypothesis Testing for Parameters
The Four-Step Process for Slope Testing:
Step 1: Parameter of Interest
We are interested in \(\beta_1\), the true slope of the population regression line relating [explanatory variable] to [response variable].
Step 2: Hypotheses
Most commonly:
\(H_0: \beta_1 = 0\) (no linear relationship)
\(H_a: \beta_1 \neq 0\) (linear relationship exists)
But we can test other values:
\(H_0: \beta_1 = \beta_{10}\) (for any specified value \(\beta_{10}\))
\(H_a: \beta_1 \neq \beta_{10}\) (or \(>\) or \(<\) for one-sided tests)
Step 3: Test Statistic and P-value
\[t = \frac{b_1 - \beta_{10}}{SE(b_1)} = \frac{b_1 - \beta_{10}}{\sqrt{\text{MSE}/S_{xx}}}\]
Degrees of freedom: \(n-2\)
P-value calculation depends on the alternative hypothesis:
```r
# Two-sided test
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# Upper tail test
p_value <- pt(t_stat, df = n - 2, lower.tail = FALSE)

# Lower tail test
p_value <- pt(t_stat, df = n - 2, lower.tail = TRUE)
```
Step 4: Decision and Conclusion
Compare p-value to \(\alpha\) and draw conclusions about the slope parameter in context.
Connection Between F-test and t-test
For simple linear regression with one explanatory variable, there’s a direct mathematical relationship:
\[F = t^2\]
when testing \(H_0: \beta_1 = 0\).
Why Both Tests Matter: In simple linear regression, the F-test and t-test for \(\beta_1 = 0\) are equivalent. However, in multiple regression:
The F-test assesses overall model utility (are any of the explanatory variables useful?)
Individual t-tests assess each explanatory variable separately (is this specific variable useful?)
Both perspectives provide valuable but different information about model components.
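The equivalence is easy to check from R output. A small sketch, assuming the model was fit as `fit <- lm(y ~ x)` so that the explanatory variable is literally named `x`:
```r
t_slope <- coef(summary(fit))["x", "t value"]   # t statistic for the slope
F_model <- anova(fit)["x", "F value"]           # F statistic for model utility
all.equal(F_model, t_slope^2)                   # TRUE: F = t^2 up to rounding
```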
13.3.6. Implementation in R
Fitting the Model:
```r
# Fit linear model
fit <- lm(response ~ explanatory, data = dataset_name)

# Extract coefficients
coefficients(fit)   # or fit$coefficients

# Extract residuals and fitted values
residuals(fit)      # or fit$residuals
fitted(fit)         # or fit$fitted.values
```
Getting Inference Results:
```r
# Complete summary with tests and R-squared
summary(fit)

# ANOVA table for F-test
anova(fit)   # or summary(aov(fit))
```
Manual Calculations (when given summary statistics rather than raw data):
```r
# Calculate test statistics manually
F_stat <- MSR / MSE
t_stat <- (b1 - beta1_null) / SE_b1

# Calculate p-values
p_value_F <- pf(F_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)
p_value_t <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# Calculate confidence intervals
margin_error <- qt(alpha/2, df = n - 2, lower.tail = FALSE) * SE_b1
CI_lower <- b1 - margin_error
CI_upper <- b1 + margin_error
```
13.3.7. Integrated Example: Blood Pressure Study Complete Analysis
Let’s work through a complete analysis of the blood pressure treatment study, incorporating all diagnostic and inference procedures.
Research Context: Investigating whether patient age affects the change in blood pressure after 24 hours of a new treatment (\(n = 11\) patients).
Step 1: Diagnostic Analysis
```r
# Check assumptions
plot(age, bp_change)           # Scatter plot for linearity and homoscedasticity
abline(lm(bp_change ~ age))

# Fit model and create residual plot
fit <- lm(bp_change ~ age)
plot(age, residuals(fit))      # Residual plot
abline(h = 0)

# Check normality of residuals
hist(residuals(fit))
qqnorm(residuals(fit))
qqline(residuals(fit))
```
Assessment: Linear relationship appears reasonable, constant variance assumption shows some minor violations but nothing severe enough to invalidate analysis with \(n = 11\). Normality appears reasonable.
Step 2: F-test for Model Utility
Using our fitted model \(\hat{y} = 20.11 - 0.526x\) with \(\text{SSR} = 556\), \(\text{SSE} = 383\), \(\text{SST} = 939\):
Hypotheses:
\(H_0\): There is no linear association between patient age and change in blood pressure
\(H_a\): There is a linear association between patient age and change in blood pressure
Test Statistic:
\(\text{MSR} = \text{SSR}/1 = 556\)
\(\text{MSE} = \text{SSE}/9 = 42.56\)
\(F = 556/42.56 = 13.06\)
P-value: \(P(F_{1,9} > 13.06) = 0.0055\)
Conclusion: At \(\alpha = 0.05\), we reject \(H_0\) and conclude there is sufficient evidence of a linear association between patient age and change in blood pressure.
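These calculations can be reproduced in R from the summary statistics alone; a short sketch using the values above:
```r
n   <- 11
SSR <- 556; SSE <- 383
MSR <- SSR / 1
MSE <- SSE / (n - 2)                    # 383 / 9 = 42.56
F_stat <- MSR / MSE                     # about 13.06
pf(F_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)   # p-value about 0.0055
```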
Step 3: Inference for the Slope
Parameter of Interest: \(\beta_1\), the true change in blood pressure reduction per year increase in patient age.
Hypotheses:
\(H_0: \beta_1 = 0\)
\(H_a: \beta_1 \neq 0\)
Test Statistic:
\(SE(b_1) = \sqrt{\text{MSE}/S_{xx}} = \sqrt{42.56/2008} = 0.146\)
\(t = \frac{-0.526 - 0}{0.146} = -3.61\)
P-value: \(P(|t_9| > 3.61) = 2 \times P(t_9 > 3.61) = 0.0055\)
95% Confidence Interval:
\(t_{0.025,9} = 2.262\)
\(-0.526 \pm 2.262 \times 0.146 = -0.526 \pm 0.330 = (-0.856, -0.196)\)
Conclusion: We are 95% confident that each additional year of age is associated with a decrease in blood pressure between 0.196 and 0.856 mm Hg after treatment.
Note: The F-test and t-test give identical p-values (0.0055) since \(F = t^2 = (-3.61)^2 = 13.03 \approx 13.06\) (small rounding differences).
13.3.8. Parameter Distribution Properties

Fig. 13.47 Summary of the statistical properties of slope and intercept estimators showing unbiasedness and variance formulas
The mathematical foundation for our inference procedures rests on the key statistical properties of our parameter estimates.
Slope Properties:
Unbiased: \(E[b_1] = \beta_1\)
Variance: \(\text{Var}(b_1) = \frac{\sigma^2}{S_{xx}}\)
Distribution: \(b_1 \sim N\left(\beta_1, \frac{\sigma^2}{S_{xx}}\right)\)
Intercept Properties:
Unbiased: \(E[b_0] = \beta_0\)
Variance: \(\text{Var}(b_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)\)
Distribution: \(b_0 \sim N\left(\beta_0, \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)\right)\)
Key Insights:
Both estimates are linear combinations of the normally distributed response values
The slope variance depends only on the error variance and the spread of X values
The intercept variance includes additional uncertainty when \(\bar{x} \neq 0\)
Standard Error Formulas and Confidence Intervals

Fig. 13.48 Confidence interval formula for the slope parameter with all components clearly labeled
Since \(\sigma^2\) is unknown, we estimate it using \(s^2 = \text{MSE}\) and use the t-distribution:
Standard Error of the Slope:
\[SE(b_1) = \sqrt{\frac{\text{MSE}}{S_{xx}}} = \frac{s}{\sqrt{S_{xx}}}\]
where \(S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2\).
Confidence Interval for the Slope:
\[b_1 \;\pm\; t_{\alpha/2,\, n-2} \times SE(b_1)\]
This provides a range of plausible values for the true population slope \(\beta_1\) with \((1-\alpha) \times 100\%\) confidence.
Interpretation: We are \((1-\alpha) \times 100\%\) confident that the true change in the response variable for each one-unit increase in the explanatory variable lies within this interval.
Complete Hypothesis Testing Framework

Fig. 13.49 Complete framework for hypothesis testing about the slope parameter
Step 1: Parameter of Interest
We are interested in \(\beta_1\), the true population slope of the mean response line \(\mu_{Y|X=x}\).
Step 2: Hypotheses
The general form allows for various null values \(\beta_{10}\):
\[H_0: \beta_1 = \beta_{10} \qquad \text{versus} \qquad H_a: \beta_1 \neq \beta_{10}\]
Alternative formulations:
One-sided upper: \(H_a: \beta_1 > \beta_{10}\)
One-sided lower: \(H_a: \beta_1 < \beta_{10}\)
Step 3: Test Statistic and P-value
\[t = \frac{b_1 - \beta_{10}}{SE(b_1)}\]
The test statistic follows a t-distribution with \(n-2\) degrees of freedom.
P-value calculation depends on the alternative:
Two-sided: \(\text{p-value} = 2P(t_{n-2} > |t|)\)
Upper tail: \(\text{p-value} = P(t_{n-2} > t)\)
Lower tail: \(\text{p-value} = P(t_{n-2} < t)\)
Step 4: Decision and Conclusion
Compare p-value to \(\alpha\) and state conclusions in context of the problem.
Special Case: Equivalence of F-test and t-test

Fig. 13.50 Mathematical relationship between F-test and t-test when testing slope equals zero
When testing \(H_0: \beta_1 = 0\) versus \(H_a: \beta_1 \neq 0\), there’s a direct mathematical relationship between the two test statistics. Specifically:
\[F = \frac{\text{MSR}}{\text{MSE}} = \left(\frac{b_1}{SE(b_1)}\right)^2 = t^2\]
This equivalence means both tests provide identical p-values and lead to identical conclusions for this specific hypothesis.
Why Both Tests Matter: While equivalent in simple linear regression, they serve different purposes in multiple regression:
F-test: Tests overall model utility (are any predictors useful?)
t-tests: Test individual predictors (is this specific predictor useful?)
The F-test is more general and extends naturally to multiple regression scenarios where we test several slopes simultaneously.
13.3.9. Implementation in R: Complete Workflow

Fig. 13.51 Complete R workflow for fitting models and conducting inference
Model Fitting and Basic Output:
```r
# Fit the linear model
fit <- lm(response_variable ~ explanatory_variable, data = dataFrame)

# Extract key components
coefficients(fit)     # Get b0 and b1
residuals(fit)        # Get residuals for diagnostics
fitted.values(fit)    # Get predicted values

# Complete inference summary
summary(fit)          # Includes R², F-test, t-tests, standard errors

# ANOVA table
anova(fit)            # Or summary(aov(fit))
```
What summary(fit) Provides:
Coefficient estimates (\(b_0\), \(b_1\)) with standard errors
t-statistics and p-values for testing each coefficient equals zero
R-squared and adjusted R-squared
F-statistic and p-value for overall model utility
Residual standard error (estimate of \(\sigma\))
Manual Calculations (useful for understanding or when given summary statistics):
```r
# Calculate standard error of slope manually
SE_b1 <- sqrt(MSE / Sxx)

# Calculate t-statistic
t_stat <- (b1 - beta1_null) / SE_b1

# Calculate confidence interval
t_critical <- qt(alpha/2, df = n - 2, lower.tail = FALSE)
CI_lower <- b1 - t_critical * SE_b1
CI_upper <- b1 + t_critical * SE_b1

# Calculate p-values
p_value_two_sided <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)
p_value_F <- pf(F_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)
```
13.3.10. Comprehensive Example: Blood Pressure Study Final Analysis
Let’s complete our blood pressure analysis with full parameter inference.
Given Information:
Sample size: \(n = 11\)
Fitted model: \(\hat{y} = 20.11 - 0.526x\)
\(\text{SSE} = 383\), so \(\text{MSE} = 383/9 = 42.56\)
\(S_{xx} = 2008\) (from our previous calculations)
Slope Inference:
Standard Error:
\(SE(b_1) = \sqrt{42.56/2008} = \sqrt{0.0212} = 0.146\)
95% Confidence Interval:
\(t_{0.025,9} = 2.262\)
\(-0.526 \pm 2.262 \times 0.146 = -0.526 \pm 0.330\)
Interval: \((-0.856, -0.196)\)
Interpretation: We are 95% confident that each additional year of patient age is associated with an additional decrease in blood pressure between 0.196 and 0.856 mm Hg after treatment.
Hypothesis Test (\(H_0: \beta_1 = 0\) vs \(H_a: \beta_1 \neq 0\)):
\(t = \frac{-0.526 - 0}{0.146} = -3.61\)
p-value = \(2 \times P(t_9 > 3.61) = 0.0055\)
Conclusion: At \(\alpha = 0.05\), we reject \(H_0\) and conclude there is significant evidence that patient age affects the change in blood pressure after treatment.
Verification of F-test and t-test Equivalence:
\(t^2 = (-3.61)^2 = 13.03\)
Our F-statistic was 13.06 (small rounding differences)
Both tests give p-value ≈ 0.0055
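A short R sketch that reproduces these slope calculations from the given summary statistics:
```r
n   <- 11
b1  <- -0.526
MSE <- 383 / (n - 2)          # 42.56
Sxx <- 2008

SE_b1  <- sqrt(MSE / Sxx)                              # about 0.146
t_stat <- (b1 - 0) / SE_b1                             # about -3.61
2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)    # p-value about 0.0055

t_crit <- qt(0.025, df = n - 2, lower.tail = FALSE)    # 2.262
c(b1 - t_crit * SE_b1, b1 + t_crit * SE_b1)            # about (-0.856, -0.196)
```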
13.3.11. Summary of Diagnostic and Inference Workflow
The complete regression analysis workflow involves these essential steps:
1. Exploratory Analysis
Create scatter plot to assess initial relationship
Determine which variable should be explanatory vs. response
Look for obvious outliers or non-linear patterns
2. Model Fitting
Fit least squares regression line
Calculate basic summary statistics and R-squared
3. Diagnostic Checking (Critical - do before inference!)
Create residual plots to check linearity and constant variance
Examine histograms and QQ plots of residuals for normality
Identify any influential points or assumption violations
4. Inference Procedures (only if assumptions are reasonable)
F-test for overall model utility
t-tests and confidence intervals for individual parameters
Interpret all results in context of the original problem
5. Model Use (if diagnostics and tests support the model)
Make predictions with appropriate uncertainty quantification
Draw scientific conclusions about the relationship
13.3.12. When Assumptions Are Violated
If diagnostic procedures reveal serious assumption violations:
Linearity Violations:
Consider transformations of variables (log, square root, etc.)
Fit polynomial or other non-linear models (beyond course scope)
Use piecewise or segmented regression for different regions
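As one illustration of the first remedy, the sketch below refits the model with a log-transformed response and re-examines the residual plot. It assumes an explanatory vector `x` and a strictly positive response vector `y`; which transformation (if any) is appropriate always depends on the data.
```r
fit_log <- lm(log(y) ~ x)                        # model log(Y) instead of Y
plot(x, residuals(fit_log), ylab = "Residual")   # has the curvature been removed?
abline(h = 0, lty = 2)
```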
Constant Variance Violations:
Variable transformations may help stabilize variance
Weighted least squares methods (beyond course scope)
Robust regression techniques
Normality Violations:
Often less critical for large sample sizes (Central Limit Theorem)
Bootstrap methods for inference (beyond course scope)
Non-parametric alternatives
Independence Violations:
Time series methods for correlated observations
Mixed effects models for clustered data
These require specialized techniques beyond this course
Important Principle: When assumptions are seriously violated, our inference procedures may not be reliable. It’s better to address the violations or acknowledge limitations than to proceed with invalid analysis.
13.3.13. Building Toward Prediction and Advanced Topics
The diagnostic and inference tools developed in this chapter provide the foundation for the final components of regression analysis:
What’s Coming Next:
Prediction intervals: Using our fitted model to predict new observations with appropriate uncertainty
Confidence intervals for the mean response: Estimating expected values at specific X values
Model comparison and selection: Comparing different potential models
Introduction to multiple regression: Extending to multiple explanatory variables
The Bigger Picture: The workflow established here—visual exploration, model fitting, diagnostic checking, and formal inference—forms the backbone of all statistical modeling. Whether working with simple linear regression, multiple regression, or advanced modeling techniques, this systematic approach ensures reliable and interpretable results.
The combination of diagnostic tools and inference procedures gives us confidence in our conclusions while maintaining appropriate humility about the limitations of our models. This balance between statistical rigor and practical insight represents the essence of effective statistical analysis.
Key Takeaways 📝
Assumption checking must precede inference: Diagnostic plots are essential for verifying that our model is appropriate before conducting hypothesis tests or constructing confidence intervals.
Residual plots are more sensitive than scatter plots: They amplify patterns and make assumption violations easier to detect, particularly for constant variance and linearity.
Four key diagnostics work together: Scatter plots, residual plots, histograms of residuals, and QQ plots provide complementary information about different aspects of model adequacy.
The F-test assesses overall model utility: It tests whether the linear relationship explains a significant portion of the variability in the response variable.
ANOVA decomposition provides intuitive understanding: SST = SSR + SSE shows how total variation splits into explained and unexplained components.
Parameter inference follows familiar patterns: Confidence intervals and hypothesis tests for slope and intercept use the same principles as previous inference procedures, adapted for regression context.
Standard errors incorporate both error variance and design: SE(b₁) = √(MSE/Sₓₓ) shows that precision depends on both residual variation and spread of X values.
F-test and t-test are equivalent for simple regression: When testing β₁ = 0, both approaches give identical conclusions, but the F-test generalizes to multiple regression.
Degrees of freedom reflect parameters estimated: df = n-2 accounts for estimating both slope and intercept from the data.
R provides comprehensive output: The summary() function includes all essential inference results, while diagnostic plots require additional commands.
Violations have consequences: Serious assumption violations can invalidate inference procedures, requiring alternative approaches or model modifications.
Context drives interpretation: All statistical results must be interpreted in terms of the original research question and practical significance.
Exercises
Diagnostic Interpretation: For each residual plot pattern described, identify the assumption violation and potential consequences:
Residuals form a cone shape, spreading out as X increases
Residuals show a clear curved (U-shaped) pattern
Residuals appear randomly scattered around zero with constant spread
Most residuals are near zero with a few extremely large positive and negative values
Residuals show alternating positive and negative values in sequence
ANOVA Table Completion: Given the following partial ANOVA table for a regression with n = 15, complete all missing values:
Parameter Inference: A study of house prices yields the regression equation Price = 45,000 + 120 × Size, where Price is in dollars and Size is in square feet. With n = 20, MSE = 50,000,000, and Sₓₓ = 2500:
Calculate the standard error of the slope
Construct a 95% confidence interval for the slope
Test H₀: β₁ = 100 vs Hₐ: β₁ ≠ 100 at α = 0.05
Interpret the slope coefficient in context
F-test vs t-test: Using the house price data from Exercise 3:
Conduct the F-test for model utility
Conduct the t-test for H₀: β₁ = 0 vs Hₐ: β₁ ≠ 0
Verify that F = t² and explain why this relationship holds
Discuss when you might prefer one test over the other
Assumption Checking Protocol: Design a systematic approach for checking regression assumptions:
List the specific plots you would create and in what order
Describe what to look for in each plot
Explain how you would decide whether violations are serious enough to invalidate analysis
Suggest potential remedies for each type of violation
Real Data Analysis: Collect data on two quantitative variables of interest (at least 15 observations):
Create appropriate exploratory plots
Fit a simple linear regression model
Conduct complete diagnostic analysis
Perform F-test and parameter inference
Interpret all results in context
Discuss any limitations or concerns
R Implementation: Write R code to perform a complete regression analysis:
Fit the model and extract basic output
Create all necessary diagnostic plots
Conduct F-test and t-tests manually (not using summary output)
Calculate confidence intervals for the slope
Compare your manual calculations to R’s built-in results
Critical Evaluation: A researcher reports: “The regression has R² = 0.95, so the model is excellent and all assumptions are satisfied.”
What’s wrong with this reasoning?
What additional information would you need to evaluate the model?
Describe how a high R² could coexist with serious assumption violations
What would you recommend the researcher do?
Design Considerations: Explain how each factor affects the precision of slope estimation:
Increasing the sample size n
Increasing the range of X values observed
Reducing the error variance σ²
Changing from X values clustered together to X values spread out
Confidence Interval Interpretation: For each confidence interval interpretation, identify whether it’s correct or incorrect and explain:
“There’s a 95% chance that the true slope lies in this interval”
“95% of sample slopes will fall in this interval”
“If we repeated this study many times, 95% of the intervals would contain the true slope”
“We’re 95% confident about the slope value for this specific dataset”
Hypothesis Testing Scenarios: For each research scenario, formulate appropriate hypotheses:
Testing whether there’s any linear relationship between study hours and test scores
Testing whether the slope of salary vs. experience is at least $2000 per year
Testing whether the relationship between temperature and ice cream sales is negative
Testing whether the effect of fertilizer on plant growth is exactly 5 cm per gram
Comprehensive Case Study: A medical researcher studies the relationship between patient age (X) and recovery time in days (Y) for a surgical procedure. With n = 25 patients, the analysis yields:
Fitted model: Ŷ = 8.5 + 0.3X
SSR = 156, SSE = 234, SST = 390
Sₓₓ = 1200
Conduct a complete analysis including:
ANOVA table and R² calculation
F-test for model utility
95% confidence interval for the slope
Test whether the slope exceeds 0.25 days per year
Practical interpretation of all results
Discussion of what diagnostic plots you would need to see