t-Test of Different Types

In this course, we’ve covered three distinct types of t-tests: one-sample, two-sample independent, and two-sample paired tests. Each serves a unique purpose in hypothesis testing, though they all operate under the same guiding principles and use the versatile t.test() function. However, the choice of data format, diagnostic plots, and keywords depends on the specific inference procedure.

We previously discussed the one-sample t-test in Computer Assignment 6. In this tutorial, we focus on the two-sample independent test and the two-sample paired test.

t Procedures for Two Independent Samples

In a two-sample independent t-test, we aim to compare the means of a quantitative variable between two distinct groups. These groups must be independent, implying that the measurements in one group do not affect the measurements in the other group. For conducting such an analysis, you have two main approaches:

Two Vectors of Data: Each vector represents the measurements from one of the groups. This method requires manually separating your data based on group membership before conducting the test.
A Data Frame with a Factor Variable: This approach leverages a factor variable in your data frame that identifies the group membership for each observation, effectively categorizing your continuous variable of interest into two groups based on this factor.

Given its simplicity and direct alignment with R’s formula interface used in various other statistical functions, we will focus on the second approach. This method enhances readability and efficiency, particularly when your data is already organized within a single data frame.
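The two approaches lead to the same test. The sketch below uses a small made-up data frame (the names toy, score, and group are hypothetical and not part of the course data) to show that splitting the data into two vectors and using the formula interface produce the same statistic and p-value.

```r
# Hypothetical toy data: a quantitative response and a two-level group label
toy <- data.frame(
  score = c(5.1, 4.8, 6.0, 5.5, 6.9, 7.2, 6.4, 7.8),
  group = c("A", "A", "A", "A", "B", "B", "B", "B")
)

# Approach 1: two vectors, separated by hand
a <- toy$score[toy$group == "A"]
b <- toy$score[toy$group == "B"]
res_vectors <- t.test(a, b)

# Approach 2: formula interface on the data frame
res_formula <- t.test(score ~ group, data = toy)

# Both calls run the same Welch test, so the results agree
all.equal(unname(res_vectors$statistic), unname(res_formula$statistic))
all.equal(res_vectors$p.value, res_formula$p.value)
```

With the formula interface, the difference is taken as the first factor level minus the second, so the order of the levels plays the role of the order of the two vectors.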

Combining Data Categories with ifelse

When working with more than one group, such as in a two-sample procedure or ANOVA, you may need to combine categories, or drop ones that aren’t relevant, so that the grouping variable matches the comparison you want to make. This is especially helpful for datasets containing numerous categories. R provides several ways to accomplish this task, with ifelse being one of the most straightforward approaches.

The ifelse function in R is a vectorized conditional function that allows you to replace values in a vector based on a condition. The syntax of ifelse is as follows:

ifelse(test_expression, yes, no)

test_expression is the logical condition based on which replacements are performed.
yes is the value to return if test_expression is TRUE.
no is the value to return if test_expression is FALSE.
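Before turning to the movie data, here is a minimal sketch of the syntax on a made-up vector. The vector scores and the labels "Pass" and "Fail" are hypothetical and chosen only for illustration.

```r
# Hypothetical vector of exam scores (not from the tutorial's data)
scores <- c(72, 45, 88, 59, 60)

# test_expression: scores >= 60; yes: "Pass"; no: "Fail"
result <- ifelse(scores >= 60, "Pass", "Fail")
result
# "Pass" "Fail" "Pass" "Fail" "Pass"
```

Because ifelse is vectorized, the condition is checked element by element and the result has the same length as the input.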

We’ll create an illustrative example to show how ifelse can be used in practical scenarios.

Example: Movie Profitability Statistics

(Data Set: movies.csv) In the film industry, understanding the financial performance of movies through different lenses, such as audience ratings, is crucial for stakeholders. This understanding helps in tailoring future productions to meet audience expectations and optimize profitability. The movies.csv dataset provides a snapshot of various movies’ profitability metrics, including LOpening, which represents the log-transformed revenue from the opening weekend per theater. This transformation helps in normalizing revenue data, making it more amenable to statistical analysis.

Our analysis focuses on exploring how movies rated for different audiences—‘R’ for adults and a combined ‘Family’ category including ‘PG’ and ‘PG-13’ ratings—fare in terms of their opening weekend revenue per theater.

To compare these groups effectively, we first redefine our movie ratings into two distinct categories: ‘Adult’ for ‘R’ rated movies and ‘Family’ for movies rated ‘PG’ and ‘PG-13’. This re-categorization is captured in a new variable within our dataset, MergedRating:

library(knitr)  # provides kable() for formatted tables

movies <- read.csv("Data/movies.csv")
kable(movies, caption = "Movie Profitability Data")

Movie Profitability Data
Title	Rating	Genre	Budget	USRevenue	Opening	LOpening	Theaters	Opinion	Profit
Madagascar: Escape 2 Africa	PG	Animation	150.0	180.0	63.1	4.145	4056	6.9	1
Sex and the City	R	Comedy	65.0	152.6	56.8	4.040	3285	5.4	1
The Ruins	R	Horror	8.0	17.4	8.0	2.079	2812	6.0	1
Stop-Loss	R	Drama	25.0	10.9	4.6	1.526	1291	6.5	0
The Curious Case of Benjamin Button	PG-13	Drama	150.0	127.5	26.9	3.292	2988	8.0	0
Redbelt	R	Action	7.0	2.3	1.1	0.095	1379	6.9	0
The Secret Life of Bees	PG-13	Drama	11.0	37.8	10.5	2.351	1591	7.0	1
Kung Fu Panda	PG	Animation	130.0	215.4	60.2	4.098	4114	7.7	1
The Happening	R	Drama	60.0	64.5	30.5	3.418	2986	5.2	1
Zach and Miri Make a Porno	R	Comedy	24.0	31.5	10.1	2.313	2735	7.1	1
The Strangers	R	Horror	10.0	52.5	21.0	3.045	2466	6.0	1
Prom Night	PG-13	Horror	20.0	43.8	20.8	3.035	2700	3.6	1
The Dark Knight	PG-13	Action	185.0	533.3	158.4	5.065	4366	8.9	1
Baby Mama	PG-13	Comedy	30.0	60.3	17.4	2.856	2543	6.1	1
Wanted	R	Action	75.0	134.3	50.9	3.930	3175	6.8	1
Changeling	R	Drama	55.0	35.7	10.0	2.303	1850	8.0	0
Yes Man	PG-13	Comedy	70.0	97.7	18.3	2.907	3434	7.0	1
The Express	PG	Drama	40.0	9.6	4.6	1.526	2808	7.1	0
W.	PG-13	Drama	25.1	25.5	10.5	2.351	2030	6.6	1
The Mummy: Tomb of the Dragon Emporer	PG-13	Action	145.0	102.2	40.5	3.701	3760	5.1	0
Eagle Eye	PG-13	Action	80.0	101.1	29.2	3.374	3510	6.6	1
Burn After Reading	R	Comedy	37.0	60.3	19.1	2.950	2651	7.2	1
Saw V	R	Horror	10.8	56.7	30.1	3.405	3060	5.8	1
Miracle and St Anna	R	Action	45.0	7.9	3.5	1.253	1185	5.9	0
The Day the Earth Stood Still	PG-13	Drama	80.0	79.4	30.5	3.418	3560	5.5	0
Be Kind Rewind	PG-13	Comedy	20.0	11.2	4.1	1.411	808	6.6	0
Jumper	PG-13	Action	85.0	80.2	32.1	3.469	3428	5.9	0
Hancock	PG-13	Action	150.0	227.9	62.6	4.137	3965	6.5	1
Speed Racer	PG	Action	120.0	43.9	18.6	2.923	3606	6.3	0
The Eye	R	Drama	12.0	31.4	12.4	2.518	2436	5.3	1
Death Race	R	Action	45.0	36.1	12.6	2.534	2532	6.6	0
College	R	Comedy	6.5	4.7	2.6	0.956	2123	4.3	0
Blindness	R	Drama	25.0	3.1	2.0	0.693	1690	6.7	0
Iron Man	PG-13	Action	140.0	318.3	102.1	4.626	4105	8.0	1
Lakeview Terrace	PG-13	Drama	22.0	39.3	15.0	2.708	2464	6.3	1

movies$MergedRating <- ifelse(movies$Rating == "PG" | movies$Rating == "PG-13", "Family", "Adult") 
kable(head(movies), caption = "Movie Profitability Data")

Table 1: Movie Profitability Statistics
Title	Rating	Genre	Budget	USRevenue	Opening	LOpening	Theaters	Opinion	Profit	MergedRating
Madagascar: Escape 2 Africa	PG	Animation	150	180.0	63.1	4.145	4056	6.9	1	Family
Sex and the City	R	Comedy	65	152.6	56.8	4.040	3285	5.4	1	Adult
The Ruins	R	Horror	8	17.4	8.0	2.079	2812	6.0	1	Adult
Stop-Loss	R	Drama	25	10.9	4.6	1.526	1291	6.5	0	Adult
The Curious Case of Benjamin Button	PG-13	Drama	150	127.5	26.9	3.292	2988	8.0	0	Family
Redbelt	R	Action	7	2.3	1.1	0.095	1379	6.9	0	Adult

Refer back to Computer Assignment #6 Tutorial for information regarding logical operators.
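As an aside, the same recoding can be written with the %in% operator instead of joining two comparisons with |. This is an alternative sketch, not the form used in the tutorial; the vector rating below is a hypothetical stand-in for movies$Rating.

```r
# Hypothetical ratings vector, standing in for movies$Rating
rating <- c("PG", "R", "PG-13", "R", "PG")

# Form used in the tutorial: two comparisons joined with | (logical OR)
merged_or <- ifelse(rating == "PG" | rating == "PG-13", "Family", "Adult")

# Alternative form: %in% tests membership in a set of values
merged_in <- ifelse(rating %in% c("PG", "PG-13"), "Family", "Adult")

identical(merged_or, merged_in)
table(merged_in)
```

The two forms differ only for missing values: a comparison with == returns NA for a missing rating, while %in% returns FALSE, which would place that movie in the "Adult" group.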

Two-sample Independent procedure

Hypothesis Testing Framework

Test Selection: For our purpose, a two-sample independent t-test is appropriate as it compares means between two distinct groups that are not related or paired. This test suits our scenario since each movie is unique and falls into one of two independent categories, ‘Adult’ or ‘Family’.
Alternative Hypothesis: We aim to determine if there’s a significant difference in profitability (as measured by log opening revenue, LOpening) between ‘Adult’ and ‘Family’ movies. Hence, our alternative hypothesis could be that the mean LOpening for ‘Adult’ movies is different from ‘Family’ movies.
Data Visualization: To understand the distribution of LOpening for each category, we generate histograms and boxplots.
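In symbols, the hypotheses described above can be written as follows. The notation is introduced here for illustration: the two symbols denote the population mean LOpening for ‘Adult’ and ‘Family’ movies, and the order (Adult minus Family) follows the order in which R reports the two groups.

```latex
H_0: \mu_{\text{Adult}} - \mu_{\text{Family}} = 0
\qquad \text{versus} \qquad
H_a: \mu_{\text{Adult}} - \mu_{\text{Family}} \neq 0
```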

First, calculate the group-level statistics and the estimated normal density for each group.

# Calculate the sample mean and standard deviation for each group
xbar <- tapply(movies$LOpening, movies$MergedRating, mean)
s <- tapply(movies$LOpening, movies$MergedRating, sd)

# Create estimated normal density curves for each group
movies$normal.density <- ifelse(movies$MergedRating == "Family", 
                                 dnorm(movies$LOpening, xbar["Family"], s["Family"]), 
                                 dnorm(movies$LOpening, xbar["Adult"], s["Adult"]))

To ensure an accurate comparison between the two groups in the histogram, we use the facet_grid() function from the ggplot2 package, which creates a grid of plots based on the levels of our factor. It allows for the simultaneous visualization of subsets of data across different categories, facilitating comparisons and highlighting differences or patterns within the data.

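library(ggplot2)  # provides ggplot() and the geom_*/facet_* functions

# Base the number of bins on the size of the larger group
binLen <- as.numeric(max(tapply(movies$LOpening, movies$MergedRating, length)))
n_bins <- round(max(sqrt(binLen) + 2, 5))

ggplot(movies, aes(x = LOpening)) + 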
  geom_histogram(aes(y = after_stat(density)), bins = n_bins, fill = "grey", col = "black") + 
  facet_grid(. ~ MergedRating) +
  geom_density(col = "red", lwd = 1) + 
  geom_line(aes(y = normal.density), col = "blue", lwd = 1) + 
  labs(title = "Distribution of Log Opening Revenue by Rating Category")

Create boxplots for both ‘Family’ and ‘Adult’ rating categories. Boxplots are instrumental in visualizing the central tendency and variability of data. By designating a categorical variable for the x-axis, we can generate side-by-side boxplots, facilitating an effortless comparison between the two groups. This visual comparison can help highlight differences in the distribution of log opening weekend revenue per theater across rating categories, providing insights into how movie ratings may influence financial performance.

ggplot(movies, aes(x = MergedRating, y = LOpening)) +
  geom_boxplot() +
  stat_boxplot(geom = "errorbar") +
  stat_summary(fun = mean, colour = "black", geom = "point", size = 3) +
  ggtitle("Boxplots of Log Opening Revenue by Rating Category")

Diagnostics: Determine if the assumptions are valid to perform inference in this situation. You do not need to repeat any graphs that were presented in part c). Additional plots may be needed. Be sure to list all of the assumptions, whether or not they can be checked from the graphs.

Calculating Slope and Intercept for Reference Lines

For each rating category, we calculate the slope and intercept of the reference line that would represent a perfectly normal distribution. These calculations allow ggplot2 to draw the reference lines accurately for each category in the Q-Q plots:

movies$intercept <- ifelse(movies$MergedRating == "Family", xbar["Family"], xbar["Adult"])
movies$slope <- ifelse(movies$MergedRating == "Family", s["Family"], s["Adult"])
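The reason the group mean serves as the intercept and the group standard deviation as the slope can be checked numerically. In the sketch below, m and s are hypothetical values standing in for one group's xbar and s; the quantiles of a normal distribution with that mean and standard deviation fall exactly on the line m + s * z, where z holds the standard normal quantiles plotted on the horizontal axis.

```r
# Hypothetical mean and standard deviation, standing in for one group's xbar and s
m <- 3.2
s <- 1.1

# Standard normal quantiles: the horizontal axis of a normal Q-Q plot
p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
z <- qnorm(p)

# Quantiles of a Normal(m, s) distribution at the same probabilities
q <- qnorm(p, mean = m, sd = s)

# They lie exactly on the line with intercept m and slope s
all.equal(q, m + s * z)
```

Sample points that follow this line closely are therefore consistent with a normal distribution having the group's mean and standard deviation.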

With the intercept and slope prepared, we proceed to construct Q-Q plots for LOpening within the ‘Family’ and ‘Adult’ groups, facilitating a comparison of their distributions to a normal distribution:

ggplot(movies, aes(sample = LOpening)) +
  stat_qq() +
  facet_grid(MergedRating ~ .) +
  geom_abline(aes(intercept = intercept, slope = slope), color = "blue", linetype = "dashed") +
  ggtitle("Q-Q Plots of Log Opening Revenue by Rating Category")

Conducting the T-Test

Carry out Hypothesis Test: Since the assumptions are valid, we carry out the hypothesis test at a 0.01 significance level to determine whether movies rated ‘Family’ and movies rated ‘Adult’ differ with respect to the log-transformed opening weekend revenue per theater (LOpening).

For this analysis, we use the formula interface of the t.test() function, which allows for a concise specification of the groups being compared:

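# paired is left at its default (FALSE): R 4.4.0 and later stop with an error
# if paired is passed to the formula interface
t.test(LOpening ~ MergedRating, data = movies,
       mu = 0, conf.level = 0.99,
       alternative = "two.sided",
       var.equal = FALSE)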

## 
##  Welch Two Sample t-test
## 
## data:  LOpening by MergedRating
## t = -2.5144, df = 29.12, p-value = 0.0177
## alternative hypothesis: true difference in means between group Adult and group Family is not equal to 0
## 99 percent confidence interval:
##  -1.917939  0.087768
## sample estimates:
##  mean in group Adult mean in group Family 
##             2.316125             3.231211
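To turn this output into a decision, compare the p-value with the 0.01 significance level, or equivalently check whether the 99% confidence interval contains 0. The sketch below is one way to do this programmatically. It rebuilds the two groups from the LOpening values in the table above (so it runs without movies.csv) and uses the two-vector form of t.test(); the object names adult, family, alpha, res, and reject are introduced here for illustration.

```r
# LOpening values transcribed from the table above, split by merged rating
adult <- c(4.040, 2.079, 1.526, 0.095, 3.418, 2.313, 3.045, 3.930,
           2.303, 2.950, 3.405, 1.253, 2.518, 2.534, 0.956, 0.693)
family <- c(4.145, 3.292, 2.351, 4.098, 3.035, 5.065, 2.856, 2.907, 1.526, 2.351,
            3.701, 3.374, 3.418, 1.411, 3.469, 4.137, 2.923, 4.626, 2.708)

alpha <- 0.01
res <- t.test(adult, family, conf.level = 1 - alpha)  # Welch test by default

res$p.value    # compare with alpha
res$conf.int   # 99% interval for mean(Adult) - mean(Family)

# Decision rule: reject H0 only if the p-value is below alpha
reject <- res$p.value < alpha
reject
# FALSE
```

Since the p-value (0.0177) is larger than 0.01 and the interval contains 0, the data do not provide sufficient evidence at the 0.01 level that the mean LOpening differs between ‘Adult’ and ‘Family’ movies.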

t Procedures for Two-Sample Matched Pairs

In situations where you’re comparing two related groups—such as before-and-after measurements in a controlled experiment or matched pairs in observational studies—a two-sample paired t-test provides a powerful tool for analysis. This test focuses on the differences between paired observations, which means you’ll need to create a new variable representing these differences.

Creating a Difference Variable

To conduct a paired t-test, the first step involves calculating the differences between each pair of matched observations. This new variable, let’s call it diff, captures the essence of the paired design by isolating and emphasizing the change or effect of interest.

The direction in which you calculate these differences (i.e., variable1 - variable2 vs. variable2 - variable1) is a matter of context or convention and does not influence the statistical validity of the test. However, it’s essential to be consistent with the hypothesized direction of the effect. For example, if you’re expected to estimate the mean difference of a - b, then your difference calculation should reflect this order.

Code for Creating the Difference Variable: While we won’t repeat the specifics here, remember that the difference variable is created with simple subtraction. It typically looks something like this:

# Assuming 'data' is your dataframe, and 'before' and 'after' are the paired observations
data$diff <- data$after - data$before
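The effect of the subtraction order can be seen on a small example. The before and after values below are made up for illustration; reversing the order flips the sign of the test statistic and the confidence interval but leaves the two-sided p-value unchanged.

```r
# Hypothetical before/after measurements on the same five subjects
before <- c(12.1, 14.3, 11.8, 13.5, 12.9)
after  <- c(13.0, 15.1, 11.5, 14.8, 13.6)

diff_ab <- after - before   # "after minus before"
diff_ba <- before - after   # the opposite order

t_ab <- t.test(diff_ab)
t_ba <- t.test(diff_ba)

# The statistic and interval change sign; the two-sided p-value is the same
c(t_ab$statistic, t_ba$statistic)
rbind(t_ab$conf.int, t_ba$conf.int)
all.equal(t_ab$p.value, t_ba$p.value)
```

For a one-sided alternative the order matters more: if the differences are reversed, the alternative must be switched between "greater" and "less" to test the same claim.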

Example: Fuel efficiency comparison

(Data Set: ex07-39mpgdiff.csv) A researcher records the mpg (miles per gallon, a measure of fuel economy) of his car each time he fills the tank. He computes it by dividing the miles driven since the last fill-up by the number of gallons pumped at fill-up. He wants to determine whether these calculations differ from what his car’s computer estimates.

For the paired t-test, we focus on the Diff variable, which represents the difference between computer estimates and driver measurements (Computer - Driver). This variable highlights the discrepancy of interest, serving as the basis for our analysis. If this variable were not already in the dataset, we would need to compute it by subtraction as described above.

mpg <- read.csv("Data/ex07-39mpgdiff.csv")
kable(mpg, caption = "Miles Per Gallon Data")

Miles Per Gallon Data
Fill.up	Computer	Driver	Diff
1	41.5	36.5	5.0
2	50.7	44.2	6.5
3	36.6	37.2	-0.6
4	37.3	35.6	1.7
5	34.2	30.5	3.7
6	45.0	40.5	4.5
7	48.0	40.0	8.0
8	43.2	41.0	2.2
9	47.7	42.8	4.9
10	42.2	39.2	3.0
11	43.2	38.8	4.4
12	44.6	44.5	0.1
13	48.4	45.4	3.0
14	46.4	45.3	1.1
15	46.8	45.7	1.1
16	39.2	34.2	5.0
17	37.3	35.2	2.1
18	43.5	39.8	3.7
19	44.3	44.9	-0.6
20	43.3	47.5	-4.2

Test Selection: For our analysis, a two-sample paired t-test is ideal since it compares the means of related observations. Here, each pair of observations consists of the MPG as calculated by the car’s computer and as measured by the driver for the same fill-up, making them inherently paired. This test allows us to assess if there’s a statistically significant difference between the computer’s estimates and the driver’s measurements.
Alternative Hypothesis: We aim to determine whether there’s a significant discrepancy between the car’s computer MPG estimates and the driver’s MPG measurements. Thus, our alternative hypothesis posits that the mean difference between the computer’s estimates and the driver’s measurements is not equal to zero, indicating a systematic bias in either the computer’s or the driver’s favor.
Data Visualization: To visualize the distribution of MPG differences (Computer MPG - Driver MPG), histograms and boxplots can be informative. These plots will help us understand the spread and central tendency of the MPG differences, alongside any potential outliers or skewness in the data. The code is similar to one-sample procedures and will not be repeated.
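In symbols, the hypotheses for the paired procedure can be written as follows. The symbol for the population mean of the differences (Computer - Driver) is notation introduced here for illustration.

```latex
H_0: \mu_D = 0
\qquad \text{versus} \qquad
H_a: \mu_D \neq 0,
\qquad \text{where } D = \text{Computer} - \text{Driver}
```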

Diagnostics: Determine if the assumptions are valid to perform inference in this situation. You do not need to repeat any graphs that were presented in part c). Additional plots may be needed. Be sure to list all of the assumptions, whether or not they can be checked from the graphs. The code is similar to that for one-sample procedures and will not be repeated.

Conducting the T-Test

Carry out Hypothesis Test: The outlier is suspect, but it does not seem too large relative to the scale of the data. Since the assumptions are valid, we carry out the hypothesis test at a 0.05 significance level to determine whether there is a significant difference between the computer’s estimates and the driver’s measurements.
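The tutorial does not say how the outlier was identified. Assuming it is the point a boxplot would draw individually, the sketch below applies the usual 1.5 × IQR rule to the Diff values from the table above. The names d, q, lower_fence, upper_fence, and flagged are introduced here for illustration.

```r
# Diff values transcribed from the table above (Computer - Driver)
d <- c(5.0, 6.5, -0.6, 1.7, 3.7, 4.5, 8.0, 2.2, 4.9, 3.0,
       4.4, 0.1, 3.0, 1.1, 1.1, 5.0, 2.1, 3.7, -0.6, -4.2)

q <- quantile(d, c(0.25, 0.75))   # first and third quartiles
iqr <- unname(q[2] - q[1])

lower_fence <- unname(q[1]) - 1.5 * iqr
upper_fence <- unname(q[2]) + 1.5 * iqr

# Observations outside the fences are the ones flagged as potential outliers
flagged <- d[d < lower_fence | d > upper_fence]
flagged
# -4.2
```

The flagged value lies only slightly beyond the lower fence, which is consistent with the judgment that it is not extreme relative to the scale of the data.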

For this analysis, we pass the data to t.test() as vectors rather than through the formula interface. We can either treat the ‘Diff’ variable as a one-sample problem or supply the two variables ‘Computer’ and ‘Driver’ with paired = TRUE; both give the same results:

One-Sample Approach Using the ‘Diff’ Variable: If we choose to focus on the already calculated differences between the car’s computer estimates and the driver’s measurements (Diff), we can apply a one-sample t-test. This approach treats the set of differences as a single sample being tested against a hypothesized mean difference of zero.

t.test.results <- t.test(mpg$Diff, mu = 0, conf.level = 0.95, alternative = "two.sided")
t.test.results

## 
##  One Sample t-test
## 
## data:  mpg$Diff
## t = 4.358, df = 19, p-value = 0.0003386
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  1.418847 4.041153
## sample estimates:
## mean of x 
##      2.73

Paired Two-Sample Approach Using ‘Computer’ and ‘Driver’ Variables: Alternatively, we can directly compare the Computer and Driver variables using a paired two-sample t-test. This method implicitly calculates the differences between each pair of corresponding observations, aligning closely with the nature of our data as paired measurements from the same fill-up events.

t.test(mpg$Computer, mpg$Driver, mu = 0, conf.level = 0.95, alternative = "two.sided", paired = TRUE)

## 
##  Paired t-test
## 
## data:  mpg$Computer and mpg$Driver
## t = 4.358, df = 19, p-value = 0.0003386
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  1.418847 4.041153
## sample estimates:
## mean difference 
##            2.73
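Both calls agree because a paired t-test is a one-sample t-test on the differences. As a check, the sketch below reproduces the statistic, p-value, and confidence interval by hand from the Diff values in the table above; the names d, n, se, t_stat, p_value, and ci are introduced here for illustration.

```r
# Diff values transcribed from the table above (Computer - Driver)
d <- c(5.0, 6.5, -0.6, 1.7, 3.7, 4.5, 8.0, 2.2, 4.9, 3.0,
       4.4, 0.1, 3.0, 1.1, 1.1, 5.0, 2.1, 3.7, -0.6, -4.2)

n <- length(d)
se <- sd(d) / sqrt(n)                          # standard error of the mean difference
t_stat <- mean(d) / se                         # test statistic for H0: mean difference = 0
p_value <- 2 * pt(-abs(t_stat), df = n - 1)    # two-sided p-value
ci <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se   # 95% confidence interval

# Compare with t.test() on the same differences
res <- t.test(d, mu = 0, conf.level = 0.95)
all.equal(t_stat, unname(res$statistic))
all.equal(p_value, res$p.value)
all.equal(ci, as.numeric(res$conf.int))
```

Since the p-value (0.0003386) is below 0.05 and the interval lies entirely above 0, the computer's estimates are, on average, higher than the driver's measurements.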

R Tutorial for CA 4: Two-sample Procedures

Authors:
Leonore Findsen, Timothy Reese,
Sarah H. Sellke, Halin Shin, Chunyan Sun, Jeremy Troisi
STAT 350

