Course Datasets

Primary Course Dataset - AppRating

The AppRating dataset is the central dataset for all six computer assignments. It contains app ratings and related metrics that students analyze with different statistical techniques as the course progresses.

Fall 2025 Session

Winter 2025 Session

Important

The AppRating dataset is used in all six computer assignments. Download the appropriate version for your session at the beginning of the course and use it throughout.

Tutorial Support Datasets

These datasets are available in the Computer Assignment Tutorials Data folder and are used for demonstrations and additional practice:

CSV Format Datasets

Text Format Datasets

Loading Datasets in R

Loading CSV files:

# From local file (after downloading)
d <- read.csv("data/helicon_m.csv")

# Directly from URL
d <- read.csv("https://treese41528.github.io/STAT350/Computer_Assignment_Tutorials/Data/helicon_m.csv")

Loading text files:

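# Whitespace-separated text file (the default sep = "" splits on spaces or tabs)
d <- read.table("data/linebackers.txt", header = TRUE)

# Tab-separated file: set sep = "\t" so that spaces inside values are kept
d <- read.table("data/ANOVA paxil.txt", header = TRUE, sep = "\t")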

# From URL (note the %20 for space in filename)
d <- read.table("https://treese41528.github.io/STAT350/Computer_Assignment_Tutorials/Data/ANOVA%20paxil.txt",
                header = TRUE)
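
To see why the sep argument matters, the sketch below writes a small tab-separated file to a temporary location and reads it back. The file contents and the column names (drug, score) are made up for illustration and are not taken from any course dataset.

```r
# Hypothetical data: a tab-separated file whose first column contains a space
tmp <- tempfile(fileext = ".txt")
writeLines(c("drug\tscore",
             "low dose\t12",
             "high dose\t15"), tmp)

# With sep = "\t", only tabs split fields, so "low dose" stays in one column
d_tab <- read.table(tmp, header = TRUE, sep = "\t")
str(d_tab)

# With the default sep = "", the space inside "low dose" would also split the
# line, so the data rows would no longer line up with the two header names
```

If a text file loads with unexpected columns, checking the separator is a reasonable first step.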

Built-in R Datasets Used in Course

The course also utilizes several built-in R datasets for examples and demonstrations:

Primary Built-in Datasets

  • iris - Fisher’s iris flower measurements (150 obs, 5 variables)

  • mtcars - Motor Trend car statistics (32 cars, 11 variables)

  • sleep - Student’s sleep data (effect of two sleep-inducing drugs), used for paired t-tests (20 obs, 3 variables)

  • CO2 - Carbon dioxide uptake in grass plants (84 obs, 5 variables)

  • AirPassengers - Monthly airline passenger numbers (time series)

Additional Built-in Datasets for Practice

  • chickwts - Chicken weights by feed type (ANOVA examples)

  • PlantGrowth - Plant growth under different treatments

  • InsectSprays - Effectiveness of insect sprays

  • ToothGrowth - Tooth growth in guinea pigs

  • faithful - Old Faithful geyser eruption data

Loading Built-in Datasets

# Load a specific dataset (optional: built-in datasets are also available directly by name)
data(iris)

# View available datasets
data()

# Get help on a dataset
?iris

# View structure
str(iris)
head(iris)
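
As a quick check, the sizes quoted in the list of primary built-in datasets can be confirmed directly in R. This is a minimal sketch; the helper names datasets and dims are chosen here for illustration.

```r
# Built-in datasets are available without any download
datasets <- list(iris = iris, mtcars = mtcars, sleep = sleep, CO2 = CO2)

# Rows and columns for each dataset, one dataset per column of the result
dims <- sapply(datasets, dim)
rownames(dims) <- c("rows", "cols")
print(dims)

# AirPassengers is a time series (class "ts"), not a data frame
class(AirPassengers)
frequency(AirPassengers)   # number of observations per year
```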

Data Download and Organization

Recommended Folder Structure:

STAT350_Project/
├── data/
│   ├── AppRating.csv        # Your main dataset
│   ├── helicon_m.csv        # Tutorial datasets
│   └── ...other datasets
├── scripts/
│   ├── CA1.R
│   ├── CA2.R
│   └── ...
└── output/
    ├── figures/
    └── tables/

Download Instructions:

  1. Create project structure: Set up folders as shown above

  2. Download AppRating: Save your session’s version to the data/ folder

  3. Download tutorial data: Save tutorial datasets as needed for each assignment

  4. Set working directory: Open the folder as an RStudio Project, or call setwd() with the path to your project folder
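
The folder setup in step 1 can also be scripted. The sketch below builds the recommended structure under a temporary directory so that it does not touch your real files; the variable names root and folders are illustrative, and you would replace root with your own project path.

```r
# Hypothetical project root inside a temporary directory (replace with your own path)
root <- file.path(tempdir(), "STAT350_Project")

# Sub-folders from the recommended structure
folders <- c("data", "scripts",
             file.path("output", "figures"),
             file.path("output", "tables"))

# recursive = TRUE also creates the parent folders (root and output/)
for (f in folders) {
  dir.create(file.path(root, f), recursive = TRUE, showWarnings = FALSE)
}

# List what was created
list.dirs(root, full.names = FALSE, recursive = TRUE)
```

Once the folders exist, a tutorial file can be saved into data/ with download.file(), passing the file's URL and a destfile such as file.path(root, "data", "helicon_m.csv").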

Verification After Loading:

Check the loaded object (here d) right after reading it, to confirm that it was read as expected:

# Check structure
str(d)

# Check dimensions
dim(d)

# Look for missing values
sum(is.na(d))

# Summary statistics
summary(d)

# First/last few rows
head(d)
tail(d)
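
The total from sum(is.na(d)) does not show where the missing values are. A minimal sketch of a per-column check follows, using a toy data frame with made-up values (not a course dataset) so that it runs on its own.

```r
# Toy data frame with made-up values, for illustration only
d <- data.frame(rating    = c(4.5, NA, 3.8, 4.1),
                downloads = c(1000, 2500, NA, NA),
                category  = c("game", "tool", "game", NA))

# Total number of missing cells
sum(is.na(d))

# Missing values per column
colSums(is.na(d))

# TRUE for rows that have no missing values
complete.cases(d)
```

Here the total is 4, split as 1, 2 and 1 across the three columns, and only the first row is complete.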