
Statistical Assumptions And Diagnostics

Learn Statistical Assumptions And Diagnostics for free with explanations, exercises, and a quick test (for Data Scientists).

Published: January 1, 2026 | Updated: January 1, 2026

Why this matters

As a Data Scientist, your models inform product decisions, experiments, and forecasts. Statistical assumptions are the guardrails that keep your inferences valid. Diagnostics are how you verify those guardrails are holding up. Skipping them can lead to wrong conclusions, wasted budget, and faulty product changes.

  • Real task: Validate an A/B test where group variances differ and traffic is time-dependent.
  • Real task: Ship a regression model despite multicollinearity and outliers, without overstating feature importance.
  • Real task: Check calibration and discrimination of a churn classifier before rollout.

Who this is for and prerequisites

Who this is for
  • Early-career Data Scientists and Analysts who build and evaluate models.
  • Engineers and Researchers who run experiments or predictive models.
Prerequisites
  • Comfort with basic probability and distributions.
  • Know linear and logistic regression at a basic level.
  • Know hypothesis testing (t-test/ANOVA) basics.

Concept explained simply

Assumptions are the conditions under which a method’s math holds. Diagnostics are tests, plots, and checks that tell you if those conditions are approximately true for your data.

Mental model: Treat your analysis like a vehicle. Assumptions are the safety rules (seatbelt, speed limit). Diagnostics are the dashboard sensors (fuel gauge, engine light). You don’t need perfection, but you must stay within safe ranges and know when to slow down or change course.

Assumption checklist by common methods

Linear regression (OLS)
  • Linearity: Relationship between predictors and outcome is approximately linear.
  • Independence: Errors are independent (no autocorrelation).
  • Homoscedasticity: Constant error variance across predictions.
  • Normality of errors: For valid t-tests/intervals with small samples.
  • No high multicollinearity: Predictors not nearly linear combinations of each other.
  • No high-influence anomalies: Outliers/leverage points not dominating fit.
Logistic regression
  • Correct link and specification (logit for binary outcome).
  • Independent observations (unless modeled otherwise).
  • No extreme separation (or use remedies like regularization or Firth’s correction).
  • Reasonable multicollinearity levels.
  • Adequate calibration and discrimination.
t-tests and ANOVA
  • Independence between observations.
  • Normality of group residuals (especially at small n).
  • Equal variances across groups (for classic tests; Welch handles inequality).
Time series models
  • Stationarity (or modeled trends/seasonality).
  • Residuals uncorrelated and roughly homoscedastic (basic residual-plot checks are sketched below).
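
Many of the checks above start as two plots. A minimal sketch with statsmodels on synthetic data (all variable names and numbers are illustrative, not from a real dataset):

```python
# Visual triage for regression residuals on synthetic data.
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                      # two illustrative predictors
y = 1.0 + X @ [2.0, -0.5] + rng.normal(size=200)   # linear signal plus noise

fit = sm.OLS(y, sm.add_constant(X)).fit()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
# Residuals vs fitted: curvature hints at non-linearity; a funnel hints at heteroscedasticity.
ax1.scatter(fit.fittedvalues, fit.resid, alpha=0.5)
ax1.axhline(0, color="red", linestyle="--")
ax1.set(xlabel="Fitted values", ylabel="Residuals", title="Residuals vs fitted")
# QQ plot: points far off the line indicate heavy tails or skew in the errors.
sm.qqplot(fit.resid, line="45", fit=True, ax=ax2)
ax2.set_title("Normal QQ plot of residuals")
plt.tight_layout()
plt.show()
```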

Diagnostics toolbox

  • Plots: Residuals vs fitted, Scale–Location, QQ plot, leverage/Cook's distance, calibration curve, ROC, ACF/PACF.
  • Tests: Breusch–Pagan/White (heteroscedasticity), Durbin–Watson (autocorrelation), Shapiro–Wilk (normality), Levene/Brown–Forsythe (variance), Hosmer–Lemeshow (calibration).
  • Stats: VIF for multicollinearity; Brier score for calibration; AUC/PR for discrimination.
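
Several of these tests are one-liners in statsmodels and scipy. A sketch on a synthetic OLS fit (the thresholds in the comments are common rules of thumb, not hard limits):

```python
# Numeric diagnostics for an OLS fit with statsmodels and scipy.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.outliers_influence import variance_inflation_factor
from scipy import stats

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(200, 2)))
y = X @ [1.0, 2.0, -0.5] + rng.normal(size=200)
fit = sm.OLS(y, X).fit()

# Breusch-Pagan: a small p-value suggests heteroscedasticity.
_, bp_pvalue, _, _ = het_breuschpagan(fit.resid, X)
print(f"Breusch-Pagan p = {bp_pvalue:.3f}")

# Durbin-Watson: about 2 means little lag-1 autocorrelation; well below ~1.5 is a warning.
print(f"Durbin-Watson = {durbin_watson(fit.resid):.2f}")

# Shapiro-Wilk on residuals: a small p-value flags non-normal errors.
print(f"Shapiro-Wilk p = {stats.shapiro(fit.resid).pvalue:.3f}")

# VIF per predictor (skip the constant); values above roughly 5-10 flag multicollinearity.
for i in range(1, X.shape[1]):
    print(f"VIF x{i} = {variance_inflation_factor(X, i):.2f}")
```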

How to run diagnostics (practical steps)

  1. Fit baseline: Start with a simple, interpretable model. Save residuals/predicted values.
  2. Visual triage: Residuals vs fitted, QQ plot. Look for patterns/funnels/heavy tails.
  3. Targeted tests: Based on visuals, run heteroscedasticity tests, Durbin–Watson, Shapiro–Wilk, Levene, etc.
  4. Influence checks: Leverage, Cook’s distance. Investigate data quality on flagged points (see the sketch after these steps).
  5. Collinearity: Compute VIF. Address with feature engineering or regularization.
  6. Model suitability: For classification, examine calibration and AUC/PR; for time series, check ACF/PACF of residuals.
  7. Remedies: Transform variables, add interactions, use robust/clustered SEs, regularize, or switch models. Re-run diagnostics.
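
A sketch of the influence checks in step 4, using statsmodels’ OLSInfluence (the cutoffs are conventional heuristics, not strict rules):

```python
# Influence diagnostics: flag points with outsized pull on an OLS fit.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import OLSInfluence

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(200, 2)))
y = X @ [1.0, 2.0, -0.5] + rng.normal(size=200)
y[0] += 15  # plant one gross outlier so something gets flagged
fit = sm.OLS(y, X).fit()

infl = OLSInfluence(fit)
cooks_d, _ = infl.cooks_distance   # Cook's distance per observation
leverage = infl.hat_matrix_diag    # hat-matrix diagonal (leverage)

# Conventional cutoffs: Cook's D > 4/n, leverage > 2 * (number of params) / n.
n, p = X.shape
flagged = np.where((cooks_d > 4 / n) | (leverage > 2 * p / n))[0]
print("Rows to investigate for data quality:", flagged)
```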

Worked examples

Example 1: Linear regression with issues

Scenario: Predicting revenue from ad spend and season. Diagnostics show: funnel-shaped residuals, Durbin–Watson = 1.1, VIF for two spend channels = 9.5, two points with Cook’s D > 0.5.

  • Interpretation: Heteroscedasticity, positive autocorrelation, multicollinearity, influential points.
  • Remedies: log-transform revenue or use robust SEs; model autocorrelation (e.g., include lagged residuals or move to time-series regression); combine correlated channels or regularize; investigate and possibly winsorize or correct data issues for influential points. Two of these remedies are sketched below.
  • Re-check: After fixes, residuals random around zero, DW ≈ 2, VIF < 5.
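
A minimal sketch of the robust-SE and transformation remedies, assuming a single illustrative spend variable (the data are simulated so the error spread grows with spend):

```python
# Two remedies for heteroscedastic revenue data: robust SEs and a log transform.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
spend = rng.uniform(1, 10, size=120)
# Error spread grows with spend, producing funnel-shaped residuals under plain OLS.
revenue = 50 + 30 * spend + rng.normal(scale=5 * spend)
X = sm.add_constant(spend)

# Remedy 1: keep the model, fix the inference with robust (HC3) standard errors.
robust_fit = sm.OLS(revenue, X).fit(cov_type="HC3")
print(robust_fit.bse)  # heteroscedasticity-robust standard errors

# Remedy 2: log-transform the outcome to stabilize the variance, then refit.
log_fit = sm.OLS(np.log(revenue), X).fit()
# Re-check: rerun the residual plot and Breusch-Pagan on log_fit.resid.
```
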
Example 2: Two-sample test under unequal variances

Scenario: Compare conversion rates (as a continuous proxy) between variants with differing variances. Levene’s test p = 0.01.

  • Interpretation: Variances unequal; classic pooled t-test invalid.
  • Remedies: Use Welch’s t-test. If heavy non-normality and small n, use Mann–Whitney as a robustness check.
  • Decision: Report Welch’s estimate and CI; confirm with a bootstrap CI (sketched below).
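
A sketch of that decision path with scipy, on simulated group data (in practice a and b would be your per-user metric values):

```python
# Welch's t-test plus robustness checks for two groups with unequal variances.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
a = rng.normal(loc=0.10, scale=0.05, size=400)   # variant A (simulated)
b = rng.normal(loc=0.12, scale=0.15, size=400)   # variant B, larger variance

# Welch's t-test: equal_var=False drops the pooled-variance assumption.
welch = stats.ttest_ind(a, b, equal_var=False)
print(f"Welch t = {welch.statistic:.2f}, p = {welch.pvalue:.4f}")

# Nonparametric robustness check for heavy non-normality or small n.
mwu = stats.mannwhitneyu(a, b, alternative="two-sided")
print(f"Mann-Whitney p = {mwu.pvalue:.4f}")

# Bootstrap CI for the difference in means, as a final confirmation.
diffs = [rng.choice(b, b.size).mean() - rng.choice(a, a.size).mean()
         for _ in range(2000)]
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"95% bootstrap CI for mean(b) - mean(a): [{lo:.4f}, {hi:.4f}]")
```
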
Example 3: Logistic regression diagnostics

Scenario: Churn model. AUC = 0.83, Brier score = 0.17, calibration curve underpredicts high-risk customers. Some complete separation on a rare feature.

  • Interpretation: Good discrimination, calibration drift at high risk, potential separation.
  • Remedies: Apply calibration (Platt scaling or isotonic), consider Firth or L2 regularization for separation, review the rare feature’s encoding. A calibration sketch follows this example.
  • Re-check: Improved Brier, calibration curve close to diagonal; coefficients stable.
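
A sketch of the calibration workflow with scikit-learn, using a synthetic stand-in for the churn data:

```python
# Measure discrimination and calibration, then recalibrate a classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.metrics import roc_auc_score, brier_score_loss

X, y = make_classification(n_samples=5000, weights=[0.8], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# L2-regularized logistic regression also guards against separation.
clf = LogisticRegression(C=1.0, max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]
print(f"AUC = {roc_auc_score(y_te, proba):.3f}, Brier = {brier_score_loss(y_te, proba):.3f}")

# Calibration curve: plot mean_pred vs frac_pos against the y = x diagonal.
frac_pos, mean_pred = calibration_curve(y_te, proba, n_bins=10)

# Remedy: isotonic recalibration fitted with cross-validation on the training data.
calibrated = CalibratedClassifierCV(clf, method="isotonic", cv=5).fit(X_tr, y_tr)
proba_cal = calibrated.predict_proba(X_te)[:, 1]
print(f"Brier after isotonic calibration = {brier_score_loss(y_te, proba_cal):.3f}")
```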

Hands-on exercises

Try the exercise below. Then compare with the provided solution.

  • Checklist before you answer:
    • State which assumptions are violated.
    • List at least three concrete remedies.
    • Mention how you would re-check after fixes.

Common mistakes and self-check

  • Mistake: Treating normality of residuals as required for unbiased coefficients in OLS. Self-check: It’s needed mainly for small-sample inference; exogeneity is key for unbiasedness.
  • Mistake: Ignoring autocorrelation in time-ordered data. Self-check: Always examine residual ACF/Durbin–Watson when data are sequential.
  • Mistake: Dropping variables due to high p-values without checking multicollinearity. Self-check: Inspect VIF first; consider regularization.
  • Mistake: Optimizing AUC only, ignoring calibration. Self-check: Inspect calibration curves/Brier score.
  • Mistake: Deleting outliers blindly. Self-check: Investigate data quality; prefer robust methods or justified winsorization.

Practical projects

  • Retail demand regression: Diagnose and fix heteroscedasticity and multicollinearity; compare OLS vs. OLS with robust SE vs. Ridge.
  • Churn classifier: Evaluate discrimination and calibration; apply calibration method and measure improvement.
  • A/B analysis: Simulate non-constant variance and autocorrelation; compare classic t-test vs. Welch vs. block/cluster-robust SEs.

Learning path

  • Review regression assumptions and residual plots.
  • Learn heteroscedasticity and autocorrelation tests.
  • Practice VIF and influence diagnostics; try regularization.
  • Expand to classification calibration and time-series residual checks.
  • Consolidate with a mini project and quick test.

Next steps

  • Run diagnostics on one of your past analyses; document issues and fixes.
  • Adopt a standard diagnostic checklist for every model.
  • Compare conclusions with and without appropriate fixes.

Quick test

Take the quick test to check understanding. Available to everyone; only logged-in users get saved progress.

Mini challenge

You inherit a model predicting weekly sales. Residual vs fitted shows a clear wave pattern; ACF has significant spikes at lags 1 and 52; VIFs are all below 3. In one paragraph, propose your next three actions and how you will verify improvements.

Practice Exercises

1 exercise to complete

Instructions

You fit an OLS model predicting monthly revenue from five marketing channels and seasonality dummies on 48 months of data. Diagnostics:

  • Residuals vs fitted: funnel shape, larger spread at high predictions.
  • Scale–Location plot: upward trend.
  • QQ plot: mild S-shape, heavier tails.
  • Durbin–Watson = 1.10; Residual ACF at lag 1 = 0.35 (significant).
  • Breusch–Pagan p = 0.002.
  • VIFs: X1 = 2.4, X2 = 9.8, X3 = 1.9, X4 = 10.3, X5 = 2.1.
  • Two points: high leverage; Cook’s D = 0.62 and 0.55.
  • Shapiro–Wilk p = 0.06.

Tasks:

  • Identify which assumptions are violated or at risk.
  • List at least three concrete, defensible remedies.
  • Explain how you will re-check after changes.
Expected Output
Clear identification of heteroscedasticity, autocorrelation, multicollinearity, and influential points; minor concern about normality only for inference. Remedies include robust/clustered SEs or transformation, time-series aware model (e.g., AR errors), addressing collinearity (feature combination or regularization), and investigating/remediating influential points. Re-run diagnostics and compare metrics/plots.

Statistical Assumptions And Diagnostics — Quick Test

Test your knowledge with 10 questions. Pass with 70% or higher.

