Worksheet 5: Expected Value and Variance

Learning Objectives 🎯

  • Calculate expected values for discrete random variables

  • Apply the Law of the Unconscious Statistician (LOTUS)

  • Use linearity of expectation for transformed random variables

  • Compute variance and understand its properties

  • Analyze sums of random variables (expected value and variance)

  • Work with joint probability mass functions and covariance

Introduction

The expected value is a fundamental concept in probability and statistics that represents the long-term average or mean value of a random variable. It tells you what you can expect to happen on average if you repeat a random process many times.

For a discrete random variable \(X\) that takes values in a countable set \(\text{Supp} = \{x \in \mathbb{R} \mid P(X = x) > 0\}\), known as the support of \(X\), the expected value of \(X\), denoted \(E[X]\) or sometimes \(\mu\), is defined as:

\[E[X] = \sum_{x \in \text{Supp}} x \cdot P(X = x)\]

provided the sum converges absolutely, that is:

\[\sum_{x \in \text{Supp}} |x| \cdot P(X = x) < \infty\]

From this definition it is clear that \(E[X]\) is simply a weighted average of the outcomes of the random variable, where each outcome is weighted by its probability of occurrence.
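For example, for a fair six-sided die, \(E[X] = \sum_{x=1}^{6} x \cdot \tfrac{1}{6} = 3.5\). A minimal R sketch of this weighted average (the die is an illustrative assumption, not part of the worksheet's examples):

# Support and PMF of a fair six-sided die
x <- 1:6
p <- rep(1/6, 6)

# Expected value as a probability-weighted average
E_die <- sum(x * p)
cat("E[X] =", E_die, "\n")  # 3.5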

Part 1: Expected Value and LOTUS

Often, we are not only interested in the random variable \(X\) itself but also in functions of \(X\). If \(g: \mathbb{R} \to \mathbb{R}\) is any function, then \(g(X)\) is a new random variable. Its expected value is given by the Law of the Unconscious Statistician (LOTUS):

\[E[g(X)] = \sum_{x \in \text{Supp}} g(x) \cdot P(X = x)\]

Specific Functions of Interest:

  • Identity Function (\(g(x) = x\)): This recovers the definition of the expected value of \(X\)

  • Linear Function (\(g(x) = ax + b\), where \(a \neq 0\)): This function is particularly important because of the linearity property of expectation

    • Linearity of Expectation: For a linear transformation of \(X\), the expected value is given by \(E[aX + b] = aE[X] + b\). In general, however, the expected value does not commute with a nonlinear function, i.e., \(E[g(X)] \neq g(E[X])\)

  • Square Function (\(g(x) = x^2\)): This is used to compute what is known as the second moment of \(X\), and is important for determining the variance of \(X\)

  • Squared Deviation Function (\(g(x) = (x - E[X])^2\)): This function directly leads to the definition of variance, measuring the dispersion/spread of \(X\) around its mean

  • Indicator Function (\(g(x) = \mathbf{1}_A(x)\), where \(\mathbf{1}_A(x) = 1\) if \(x \in A\) and \(0\) if \(x \notin A\)): Its expected value gives a probability, i.e., \(E[\mathbf{1}_A(X)] = P(X \in A)\)

The indicator function is typically written using the notation:

\[\begin{split}\mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}\end{split}\]
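Before turning to the questions, here is a short R sketch (reusing the illustrative fair-die PMF, an assumption for demonstration only) that applies LOTUS with \(g(x) = x^2\), shows that \(E[g(X)] \neq g(E[X])\) for this nonlinear \(g\), and verifies the indicator identity \(E[\mathbf{1}_A(X)] = P(X \in A)\):

# LOTUS demo on a fair six-sided die (illustrative example)
x <- 1:6
p <- rep(1/6, 6)

# E[X^2] via LOTUS: weight g(x) = x^2 by the PMF of X
E_X2 <- sum(x^2 * p)
cat("E[X^2] =", E_X2, "\n")            # 15.1667
cat("(E[X])^2 =", sum(x * p)^2, "\n")  # 12.25, not equal to E[X^2]

# Indicator of the event A = {X >= 5}
ind_A <- as.numeric(x >= 5)
cat("E[1_A(X)] =", sum(ind_A * p), "\n")  # 1/3 = P(X >= 5)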

Question 1: In a distant galaxy, treasure hunters search for magical chests containing three coins. Each coin is selected independently from a set of two denominations with the following values and probabilities:

  • Gold Coin: 100 units (probability 0.2)

  • Silver Coin: 20 units (probability 0.8)

Let \(X\) be the random variable representing the total monetary value of the three coins drawn from the chest.

  1. Complete the table for the probability mass function of this random variable \(X\).

Table 8: PMF of Total Value

x      |  60 |  140 |  220 |  300
pₓ(x)  |     |      |      |

  2. Determine the expected value of \(X\). The expected value \(E[X]\) represents the average total value of the coins collected from a chest, based on repeated sampling of many chests over time.

  3. During the Festival of Stars, treasure hunters receive a special time-limited bonus when opening enchanted chests. During this period, the total monetary value of the coins is increased according to the following rule:

    The new total value, represented by the random variable \(Y\), is given by the equation:

    \[Y = 2X + 30\]

    Using linearity of expectation, determine the expected value of the new random variable \(Y\).

  4. Treasure hunters sometimes pay a magical tax when they open a chest. The tax is represented by the random variable \(T(X)\), where the tax function \(T\) is defined as:

    \[\begin{split}T(x) = \begin{cases} 0.1x & \text{if } x > 200 \\ 5 & \text{if } x \leq 200 \end{cases}\end{split}\]

    Use the Law of the Unconscious Statistician and the idea of indicator functions to determine the expected value of the magical tax. Start by writing out \(T(X)\) using the indicator function \(\mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x > 200 \\ 0 & \text{if } x \leq 200 \end{cases}\) and then compute the expected value of \(T(X)\).

R Code for Calculations:

# Define coin probabilities
p_gold <- 0.2
p_silver <- 0.8

# Calculate PMF for 3 coins
# Possible outcomes: 0, 1, 2, or 3 gold coins
n_gold <- 0:3
n_silver <- 3 - n_gold

# Values for each outcome
values <- n_gold * 100 + n_silver * 20

# Probabilities using binomial
probs <- dbinom(n_gold, size = 3, prob = p_gold)

# Create PMF table
pmf_data <- data.frame(
  x = values,
  p_X = probs
)

print("PMF of X:")
print(pmf_data)

# Calculate E[X]
E_X <- sum(pmf_data$x * pmf_data$p_X)
cat("\nE[X] =", E_X, "\n")

# Festival bonus: Y = 2X + 30
E_Y <- 2 * E_X + 30
cat("E[Y] = E[2X + 30] =", E_Y, "\n")

# Tax calculation using indicator function
tax_function <- function(x) {
  ifelse(x > 200, 0.1 * x, 5)
}

E_tax <- sum(tax_function(pmf_data$x) * pmf_data$p_X)
cat("E[T(X)] =", E_tax, "\n")

Part 2: Variance and Its Properties

While the expected value tells us the long-term average outcome, variance measures the dispersion or spread of a random variable around its average.

For a random variable \(X\) with finite expected value \(E[X]\), the variance is defined as:

\[\text{Var}(X) = E[(X - E[X])^2] = \sum_{x \in \text{Supp}} (x - E[X])^2 \cdot P(X = x)\]

This definition represents the average of the squared deviations from the mean. It quantifies how much the outcomes of \(X\) differ from the expected value. An equivalent (and often more convenient) formula for variance is:

\[\text{Var}(X) = E[X^2] - (E[X])^2\]
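The equivalence follows by expanding the square and applying linearity of expectation, writing \(\mu = E[X]\) for the (constant) mean:

\[\text{Var}(X) = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - (E[X])^2\]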

Properties of Variance:

  • Non-Negativity: Variance is always non-negative: \(\text{Var}(X) \geq 0\). It equals zero if and only if \(X\) is a constant

  • Scaling: For any constant \(a \neq 0\), \(\text{Var}(aX) = a^2\text{Var}(X)\). If you scale a random variable, its variance is scaled by the square of the constant

  • Translation: For any constant \(b\), \(\text{Var}(X + b) = \text{Var}(X)\). Shifting a random variable by a constant does not change its variance (see the numerical check below)
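These properties can be checked numerically. The sketch below reuses the illustrative fair-die PMF from earlier (an assumption, not the chest example) with \(a = 2\) and \(b = 30\):

# Numerical check of Var(aX + b) = a^2 * Var(X) on the fair-die PMF
x <- 1:6
p <- rep(1/6, 6)

# Variance of a discrete random variable from its support and PMF
var_pmf <- function(vals, probs) {
  sum(vals^2 * probs) - sum(vals * probs)^2
}

a <- 2
b <- 30
cat("Var(X)       =", var_pmf(x, p), "\n")          # 35/12
cat("Var(aX + b)  =", var_pmf(a * x + b, p), "\n")  # 4 * 35/12
cat("a^2 * Var(X) =", a^2 * var_pmf(x, p), "\n")    # matches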

Question 2: Continuing with the treasure hunter example above, where \(X\) is the random variable representing the total monetary value of the three coins drawn from the chest.

  1. Determine the variance of \(X\).

  2. Determine the variance of the total monetary value of a random chest during the bonus round period during the Festival of Stars event, i.e., \(Y = 2X + 30\).

  3. Using the equivalent formula for variance \(\text{Var}(X) = E[X^2] - (E[X])^2\), determine the variance of the magical treasure chest tax \(T(X)\) as defined in Question 1. Start by figuring out what \([T(X)]^2\) looks like as a function, and think about what the square of an indicator function would be.

R Code for Variance Calculations:

# Calculate E[X^2]
E_X2 <- sum(pmf_data$x^2 * pmf_data$p_X)

# Variance of X
Var_X <- E_X2 - E_X^2
cat("Var(X) =", Var_X, "\n")
cat("SD(X) =", sqrt(Var_X), "\n")

# Variance of Y = 2X + 30
Var_Y <- 4 * Var_X  # Using Var(aX + b) = a^2 * Var(X)
cat("\nVar(Y) = Var(2X + 30) =", Var_Y, "\n")

# For tax variance, first find E[T(X)^2]
tax_squared <- function(x) {
  tax_function(x)^2
}

E_tax2 <- sum(tax_squared(pmf_data$x) * pmf_data$p_X)
Var_tax <- E_tax2 - E_tax^2
cat("\nVar(T(X)) =", Var_tax, "\n")

Part 3: Sums of Random Variables

When working with multiple random variables, we often need to determine the expected value of their sum or difference. The additivity property of expected value allows us to handle this efficiently.

Additivity of Expected Values:

For \(n\) random variables \(X_1, X_2, \ldots, X_n\), the expected value of their sum or difference is given by:

\[E[X_1 \pm X_2 \pm \cdots \pm X_n] = E[X_1] \pm E[X_2] \pm \cdots \pm E[X_n]\]

This property holds as long as the expected values of all random variables exist and converge absolutely. It does not require any special relationships (e.g., independence) between the random variables.

While expected value provides the long-term average, variance measures how much a random variable tends to deviate from its mean. For multiple random variables, we can determine how their variances combine under different conditions.

For \(n\) random variables \(X_1, X_2, \ldots, X_n\), the variance of their sum or difference depends on whether the random variables are independent. In general (independent or not), the variance of their sum is given by:

\[\text{Var}(X_1 + X_2 + \cdots + X_n) = \sum_{i=1}^n \text{Var}(X_i) + 2\sum_{i=1}^n \sum_{j>i} \text{Cov}(X_i, X_j)\]

where

\[\text{Cov}(X_i, X_j) = E[(X_i - E[X_i])(X_j - E[X_j])] = E[X_i X_j] - E[X_i]E[X_j]\]

If the random variables are independent, the covariance terms are zero, simplifying the formula to:

\[\text{Var}(X_1 \pm X_2 \pm \cdots \pm X_n) = \text{Var}(X_1) + \text{Var}(X_2) + \cdots + \text{Var}(X_n)\]

This simplification highlights that, under independence, the variance of the sum (or difference) of random variables is simply the sum of their individual variances.
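A quick Monte Carlo check of this simplification, using the chest distribution from Question 1 (the seed and number of simulations are illustrative assumptions):

# Simulate two independent chests; compare Var(X1 + X2) to Var(X1) + Var(X2)
set.seed(42)   # illustrative seed
n_sim <- 1e5
draw_chest <- function(n) {
  # total value of 3 coins: gold (100 units, p = 0.2), silver (20 units, p = 0.8)
  n_gold <- rbinom(n, size = 3, prob = 0.2)
  n_gold * 100 + (3 - n_gold) * 20
}
X1 <- draw_chest(n_sim)
X2 <- draw_chest(n_sim)

cat("Var(X1) + Var(X2) approx.", var(X1) + var(X2), "\n")
cat("Var(X1 + X2)      approx.", var(X1 + X2), "\n")  # close, up to simulation noise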

Question 3: Now suppose the treasure hunters collect coins from two independent chests, represented by \(X_1\) and \(X_2\), where each \(X_i\) is the total monetary value of coins from one chest. We can use the properties of variance and expected value to analyze this scenario.

  1. Determine the expected value of \(Y = X_1 + X_2\), which represents the expected total value of the coins collected from two chests, based on repeated sampling over time.

  2. Determine the standard deviation of \(Y = X_1 + X_2\).
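R Code for Two Independent Chests (a sketch following the worksheet's pattern; it assumes E_X and Var_X from the Part 1 and Part 2 code blocks are still defined in the session):

# E[Y] and SD(Y) for Y = X1 + X2, with X1, X2 independent chests
E_total <- E_X + E_X        # additivity of expectation
Var_total <- Var_X + Var_X  # independence: variances add
SD_total <- sqrt(Var_total)

cat("E[X1 + X2] =", E_total, "\n")
cat("Var(X1 + X2) =", Var_total, "\n")
cat("SD(X1 + X2) =", SD_total, "\n")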

Part 4: Joint Probability Mass Functions

For two discrete random variables \(X\) and \(Y\), probabilities are specified jointly by what is called the joint probability mass function.

Symbolic Representation: \(p_{X,Y}(x, y) = P(\{X = x\} \cap \{Y = y\})\)
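The marginal PMFs of \(X\) and \(Y\), which are needed for the expected values below, are obtained by summing the joint PMF over the other variable:

\[p_X(x) = \sum_y p_{X,Y}(x, y), \qquad p_Y(y) = \sum_x p_{X,Y}(x, y)\]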

Question 4: The following is a joint probability mass function for two random variables \(X\) and \(Y\).

Table 9: Joint PMF Table (each cell gives pₓ,ᵧ(x, y))

x \ y |     1 |      2 |     3 |      4
  1   | 1/144 |   1/72 | 1/288 | 41/288
  2   | 1/144 | 93/144 | 1/144 |  1/144
  3   | 1/144 |   1/72 | 1/288 | 41/288

Worksheet #4 may help with answering the following questions.

  1. Determine the expected value of \(X\).

  2. Determine the expected value for \(Y\).

  3. Determine the expected value of \(Z = X + Y\).

  4. Determine the expected value of the product of \(X\) and \(Y\), i.e., \(V = XY\).

  5. Determine the covariance of \(X\) and \(Y\).

  6. Finally, determine the variance of \(Z = X + Y\).

R Code for Joint PMF Analysis:

# Create joint PMF matrix
joint_pmf <- matrix(c(
  1/144, 1/72, 1/288, 41/288,
  1/144, 93/144, 1/144, 1/144,
  1/144, 1/72, 1/288, 41/288
), nrow = 3, byrow = TRUE)

rownames(joint_pmf) <- 1:3
colnames(joint_pmf) <- 1:4

# Calculate marginal distributions
marginal_X <- rowSums(joint_pmf)
marginal_Y <- colSums(joint_pmf)

# Expected values
E_X <- sum((1:3) * marginal_X)
E_Y <- sum((1:4) * marginal_Y)
cat("E[X] =", E_X, "\n")
cat("E[Y] =", E_Y, "\n")
cat("E[X + Y] =", E_X + E_Y, "\n")

# E[XY]
E_XY <- 0
for (i in 1:3) {
  for (j in 1:4) {
    E_XY <- E_XY + i * j * joint_pmf[i, j]
  }
}
cat("E[XY] =", E_XY, "\n")

# Covariance
Cov_XY <- E_XY - E_X * E_Y
cat("Cov(X,Y) =", Cov_XY, "\n")

# Variances
E_X2 <- sum((1:3)^2 * marginal_X)
E_Y2 <- sum((1:4)^2 * marginal_Y)
Var_X <- E_X2 - E_X^2
Var_Y <- E_Y2 - E_Y^2

# Variance of sum
Var_sum <- Var_X + Var_Y + 2 * Cov_XY
cat("Var(X + Y) =", Var_sum, "\n")

Key Takeaways

Summary 📝

  • Expected value \(E[X] = \sum x \cdot P(X = x)\) represents the long-term average

  • LOTUS allows us to find \(E[g(X)]\) without finding the PMF of \(g(X)\)

  • Linearity of expectation: \(E[aX + b] = aE[X] + b\)

  • Variance measures spread: \(\text{Var}(X) = E[X^2] - (E[X])^2\)

  • Properties: \(\text{Var}(aX + b) = a^2\text{Var}(X)\)

  • Additivity of expectation always holds; variance additivity requires independence

  • Covariance measures linear relationship: \(\text{Cov}(X,Y) = E[XY] - E[X]E[Y]\)