Expectation Variance Covariance

Learn expectation, variance, and covariance for free with explanations, exercises, and a quick test (for Data Scientists).

Published: January 1, 2026 | Updated: January 1, 2026

Why this matters

Data Scientists use expectation (average outcome), variance (uncertainty), and covariance (how two quantities move together) to make decisions and quantify risk. You will apply these when:

  • Estimating expected revenue or click-through from a campaign.
  • Summarizing model predictions and their uncertainty.
  • Combining features in linear models and understanding error propagation.
  • Diagnosing relationships between variables (are they moving together or in opposite directions?).

Concept explained simply

  • Expectation E[X]: the long-run average of X if you repeated the process many times.
  • Variance Var(X): how spread out X is around its average. Larger variance = more uncertainty.
  • Covariance Cov(X, Y): whether X and Y move together. Positive means they increase together; negative means when one increases, the other tends to decrease. Zero means no linear relationship.
  • Correlation ρ(X, Y): a standardized covariance in [-1, 1].

Mental model

  • Expectation: the balance point of the distribution.
  • Variance: average squared distance from the balance point (units squared).
  • Covariance: a signed measure of co-movement; think of two dancers moving in sync (positive), opposite (negative), or independently (near zero).
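
To make the "long-run average" picture concrete, here is a minimal simulation sketch. The fair six-sided die, the function name sample_mean_of_die, and the fixed seed are illustrative assumptions, not part of the lesson.

```python
import random

def sample_mean_of_die(num_rolls, seed=0):
    """Return the average of num_rolls simulated fair-die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(num_rolls):
        total += rng.randint(1, 6)  # each face 1..6 is equally likely
    return total / num_rolls

# The sample mean should drift toward E[X] = 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_die(n))
```

With few rolls the average can sit well away from 3.5; with many rolls it settles near the balance point of the distribution.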

Key formulas and properties

  • Linearity of expectation: E[aX + b] = a E[X] + b; and E[aX + bY + c] = aE[X] + bE[Y] + c.
  • Variance: Var(X) = E[X^2] − (E[X])^2.
  • Scaling: Var(aX + b) = a^2 Var(X).
  • Sum of variables: Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
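  • Independence: if X and Y are independent, then Cov(X, Y) = 0, so Var(X + Y) = Var(X) + Var(Y). The converse does not hold: zero covariance rules out a linear relationship only, so dependent variables can still have Cov(X, Y) = 0.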
  • Covariance scaling: Cov(aX + b, cY + d) = ac Cov(X, Y).
  • Correlation: ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y), where σ = standard deviation.
  • Law of total expectation: E[X] = E[ E[X | Y] ].
  • Law of total variance: Var(X) = E[ Var(X | Y) ] + Var( E[X | Y] ).
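
Several of the identities above can be checked exactly on a small example. The sketch below uses a made-up joint distribution for two dependent 0/1 variables and arbitrary constants a, b, c, d; all names (JOINT, expect, var, cov, x_of, y_of) are hypothetical helpers chosen for illustration.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(X = x, Y = y); X and Y are dependent here.
JOINT = {(0, 0): F(3, 10), (0, 1): F(1, 10), (1, 0): F(2, 10), (1, 1): F(4, 10)}

def expect(g):
    """E[g(X, Y)] under JOINT."""
    return sum(p * g(x, y) for (x, y), p in JOINT.items())

def var(g):
    """Var(g(X, Y)) via E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

def cov(g, h):
    """Cov(g, h) via E[gh] - E[g]E[h]."""
    return expect(lambda x, y: g(x, y) * h(x, y)) - expect(g) * expect(h)

def x_of(x, y):
    return x

def y_of(x, y):
    return y

a, b, c, d = 3, 2, -1, 5  # arbitrary constants for the scaling rules

checks = {
    "linearity": expect(lambda x, y: a * x + b * y + c)
    == a * expect(x_of) + b * expect(y_of) + c,
    "variance scaling": var(lambda x, y: a * x + b) == a ** 2 * var(x_of),
    "variance of sum": var(lambda x, y: x + y)
    == var(x_of) + var(y_of) + 2 * cov(x_of, y_of),
    "covariance scaling": cov(lambda x, y: a * x + b, lambda x, y: c * y + d)
    == a * c * cov(x_of, y_of),
}
for name, ok in checks.items():
    print(name, ok)
```

Exact fractions are used instead of floats so that each identity can be tested with ==, without rounding tolerance.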

Worked examples

Example 1: Discrete — coin flips

Let X be the number of heads in 2 fair coin flips (Binomial n=2, p=0.5).

  • E[X] = np = 2 × 0.5 = 1.
  • Var(X) = np(1 − p) = 2 × 0.5 × 0.5 = 0.5.

Interpretation: on average 1 head; modest uncertainty.
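
The same numbers can be recovered by brute force, without the Binomial formulas. This is a small sketch that enumerates the four equally likely outcomes; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair flips (1 = heads).
outcomes = list(product([0, 1], repeat=2))
prob = Fraction(1, len(outcomes))

heads = [sum(outcome) for outcome in outcomes]  # number of heads per outcome
mean = sum(prob * h for h in heads)
second_moment = sum(prob * h * h for h in heads)
variance = second_moment - mean ** 2  # Var(X) = E[X^2] - (E[X])^2

print(mean, variance)  # 1 1/2
```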

Example 2: Continuous — Uniform(0,1)
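
Let X be uniformly distributed on [0, 1], written X ~ Uniform(0, 1). For Uniform(a, b), the mean is (a + b)/2 and the variance is (b − a)^2/12.

  • E[X] = (0 + 1)/2 = 0.5.
  • Var(X) = (1 − 0)^2/12 = 1/12 ≈ 0.0833.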

Interpretation: outcomes are evenly spread from 0 to 1 with low variance.
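
For a continuous variable the sums become integrals. As a rough numerical check, the sketch below approximates E[X] and E[X^2] with a midpoint Riemann sum; the function name and the number of bins are assumptions made for illustration.

```python
def uniform01_moments(num_bins=100_000):
    """Approximate E[X] and Var(X) for X ~ Uniform(0, 1) with a midpoint Riemann sum."""
    width = 1.0 / num_bins
    first = 0.0   # running approximation of E[X]
    second = 0.0  # running approximation of E[X^2]
    for i in range(num_bins):
        x = (i + 0.5) * width    # midpoint of bin i
        first += x * width       # the density is 1 on [0, 1]
        second += x * x * width
    return first, second - first ** 2

mean, variance = uniform01_moments()
print(round(mean, 4), round(variance, 4))
```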

Example 3: Negative covariance — dice that sum to 7

Let X be a fair die (1–6), and Y = 7 − X. Then:

  • E[X] = 3.5, E[Y] = 3.5.
  • E[XY] = E[X(7 − X)] = 7E[X] − E[X^2] = 49/2 − 91/6 = 28/3.
  • Cov(X, Y) = E[XY] − E[X]E[Y] = 28/3 − (7/2)(7/2) = −35/12 ≈ −2.9167.

Interpretation: when X is high, Y must be low. Because Y is an exact linear function of X, the relationship is perfectly negative: Cov(X, Y) = −Var(X) = −35/12, so ρ(X, Y) = −1.
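
The arithmetic in this example can be reproduced exactly by summing over the six faces. A minimal sketch with illustrative variable names:

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # each face of the fair die is equally likely

e_x = sum(p * x for x in faces)
e_y = sum(p * (7 - x) for x in faces)
e_xy = sum(p * x * (7 - x) for x in faces)
var_x = sum(p * x * x for x in faces) - e_x ** 2

cov_xy = e_xy - e_x * e_y
# Var(Y) = Var(7 - X) = Var(X), so sigma_X * sigma_Y equals Var(X).
rho = cov_xy / var_x

print(e_xy, cov_xy, rho)  # 28/3 -35/12 -1
```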

Example 4: Variance of a linear combination (with covariance)

Suppose S = 2F1 + 0.5F2 where Var(F1)=4, Var(F2)=1, Cov(F1,F2)=0.6.

  • Var(S) = 2^2 Var(F1) + 0.5^2 Var(F2) + 2·2·0.5·Cov(F1,F2) = 4×4 + 0.25×1 + 2×2×0.5×0.6 = 16 + 0.25 + 1.2 = 17.45.

Interpretation: covariance contributes to the overall uncertainty of the score.
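
The calculation generalizes to any two-term linear score. The helper below is a sketch (the name var_linear_combo is hypothetical) that applies Var(w1 F1 + w2 F2) = w1^2 Var(F1) + w2^2 Var(F2) + 2 w1 w2 Cov(F1, F2) and compares the result with and without the covariance term.

```python
def var_linear_combo(w1, w2, var1, var2, cov12):
    """Var(w1*F1 + w2*F2) for weights w1, w2 and the given variances and covariance."""
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2 * w1 * w2 * cov12

# Values from Example 4, then the same score with the covariance set to zero.
with_cov = var_linear_combo(2, 0.5, 4, 1, 0.6)
without_cov = var_linear_combo(2, 0.5, 4, 1, 0.0)
print(with_cov, without_cov)
```

The gap between the two results is exactly the 2 w1 w2 Cov(F1, F2) term, which is what gets lost when independence is assumed incorrectly.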

Practice: do it now

Use this checklist when solving EV/Var/Cov problems:

  • Identify the random variables and what each represents.
  • Write down known parameters (means, variances, probabilities).
  • Choose the formula: linearity, variance identity, or covariance.
  • Compute step-by-step; keep units consistent (variance is in squared units).
  • Interpret results in plain language.

Exercise 1: A/B revenue variance

Each visitor converts with probability p = 0.04. Each conversion yields 120 revenue units. Let X ~ Bernoulli(0.04), R = 120X.

  1. Compute E[R] and Var(R).
  2. For 1000 independent visitors, compute expected total revenue and its standard deviation.

Try it before viewing the solution.

Exercise 2: Linear score with covariance

Let F1 and F2 be features with E[F1]=4, E[F2]=3, Var(F1)=1.5, Var(F2)=2.0, Cov(F1,F2)=−0.8. Define S = 2F1 + F2.

  1. Compute E[S].
  2. Compute Var(S).

Interpret what negative covariance does to the uncertainty of S.

Common mistakes and how to self-check

  • Forgetting that expectation is linear even when variables are dependent. Fix: always apply linearity first.
  • Dropping the 2 Cov(X, Y) term in Var(X + Y), or the 2ab Cov(X, Y) term in Var(aX + bY), when variables are not independent. Fix: check independence before simplifying.
  • Confusing standard deviation with variance. Fix: SD = sqrt(Var).
  • Using E[XY] = E[X]E[Y] without independence. Fix: verify independence or compute E[XY] directly.
  • Ignoring units: variance has squared units. Fix: interpret SD for intuitive scale.

Self-check routine

  • State assumptions (independence?) explicitly.
  • Re-derive the result two ways, once with Var(X) = E[X^2] − (E[X])^2 and once with the linear-combination formulas; the answers must match.
  • Sanity check: does variance become zero if the variable is constant?
  • Sign check: is covariance sign consistent with the story?
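
Parts of this routine can be automated. The sketch below is one possible version, assuming the data are treated as the whole population (dividing by n rather than n − 1); the helper names and the spend/sign-up numbers are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def variance_definition(values):
    """Population variance as the average squared distance from the mean."""
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])

def variance_identity(values):
    """Population variance via E[X^2] - (E[X])^2."""
    return mean([v * v for v in values]) - mean(values) ** 2

def covariance(xs, ys):
    """Population covariance of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

# Hypothetical data: ad spend and sign-ups that rise together.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
signups = [2.0, 2.5, 4.0, 4.5, 6.0]

# Two derivations of the variance must agree.
assert abs(variance_definition(spend) - variance_identity(spend)) < 1e-12
# A constant has zero variance.
assert variance_definition([7.0, 7.0, 7.0]) == 0.0
# The story says the two series move together, so the covariance should be positive.
assert covariance(spend, signups) > 0
```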

Practical projects

  • Campaign planning: Build a simple spreadsheet that takes conversion rate p and average order value A and outputs expected revenue and its SD for N visitors.
  • Feature combination risk: Given feature means, variances, and covariance, compute the mean and variance of a linear score S = w1F1 + w2F2. Explore how changing covariance changes SD(S).
  • Scenario analysis: Use Law of Total Variance by splitting users into segments (e.g., new vs returning) with different p; compute overall Var using E[Var|segment] + Var(E|segment).
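
As a starting point for the scenario-analysis project, here is a hedged sketch of the Law of Total Variance for a single visitor's conversion indicator. The segment shares and conversion probabilities are invented for illustration, and the variable names are assumptions.

```python
from fractions import Fraction as F

# Hypothetical segments: (share of traffic, conversion probability).
segments = {
    "new": (F(7, 10), F(2, 100)),
    "returning": (F(3, 10), F(8, 100)),
}

# E[X] = E[E[X | segment]]
overall_p = sum(w * p for w, p in segments.values())
# E[Var(X | segment)]: average Bernoulli variance within segments
within = sum(w * p * (1 - p) for w, p in segments.values())
# Var(E[X | segment]): spread of the segment conversion rates
between = sum(w * p ** 2 for w, p in segments.values()) - overall_p ** 2
total_var = within + between

# A single visitor's conversion is Bernoulli(overall_p), so both routes must agree.
assert total_var == overall_p * (1 - overall_p)
print(float(overall_p), float(within), float(between), float(total_var))
```

Most of the variance here comes from the within-segment term; the between-segment term grows as the segment conversion rates move further apart.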

Who this is for

  • Aspiring and practicing Data Scientists needing strong statistical fundamentals.
  • Analysts and ML engineers interpreting experiments and model outputs.

Prerequisites

  • Basic probability (events, distributions, independence).
  • Algebra with sums and squares; comfortable with averages.

Learning path

  1. Master expectation, variance, covariance basics (this lesson).
  2. Apply to common distributions (Bernoulli, Binomial, Normal).
  3. Use conditional expectation/variance in segmentation and Bayesian updates.
  4. Connect to correlation, regression, and error propagation.

Next steps

  • Complete the exercises above.
  • Take the Quick Test below to check understanding. The test is available to everyone; only logged-in users get saved progress.
  • Build one Practical Project from the list and write a short interpretation of results.

Mini challenge

A product’s weekly revenue R is 5 times the number of conversions C. Conversions C ~ Binomial(n=200, p=0.03). Estimate E[R] and SD(R). Hint: use Var(aX)=a^2 Var(X) and Var(Binomial)=np(1−p).

Quick solution sketch

  • E[C]=200×0.03=6; Var(C)=200×0.03×0.97=5.82.
  • R=5C ⇒ E[R]=5×6=30; Var(R)=25×5.82=145.5 ⇒ SD(R)=√145.5≈12.06.
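
The solution sketch can be double-checked in a few lines. The function name scaled_binomial_mean_sd is a hypothetical helper; it applies only the two facts from the hint, Var(aX) = a^2 Var(X) and Var(Binomial) = np(1 − p).

```python
import math

def scaled_binomial_mean_sd(n, p, scale):
    """Mean and SD of scale * C, where C ~ Binomial(n, p)."""
    mean_c = n * p
    var_c = n * p * (1 - p)
    return scale * mean_c, math.sqrt(scale ** 2 * var_c)

mean_r, sd_r = scaled_binomial_mean_sd(200, 0.03, 5)
print(round(mean_r, 2), round(sd_r, 2))
```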

Exercise 1: expected output

Per visitor: E[R] = 4.8; Var(R) = 552.96; SD ≈ 23.52. For 1000 visitors: E[sum] = 4800; Var(sum) = 552,960; SD ≈ 743.6.

Expectation Variance Covariance — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.
