
Metrics for Segmentation: IoU and Dice

Learn segmentation metrics, IoU and Dice, for free with explanations, exercises, and a quick test (for Computer Vision Engineers).

Published: January 5, 2026 | Updated: January 5, 2026

Who this is for

Computer Vision Engineers and ML practitioners who build image/instance/semantic segmentation systems and need dependable, comparable metrics to evaluate models.

Prerequisites

  • Basic confusion-matrix terms: true positive (TP), false positive (FP), false negative (FN)
  • Understanding of binary vs multi-class segmentation masks
  • Ability to threshold model probabilities into binary masks

Why this matters

Real tasks you will face:

  • Choosing a consistent metric to compare segmentation models across datasets and classes
  • Deciding thresholding and averaging strategies for class-imbalanced data
  • Handling edge cases (empty masks) without breaking dashboards or CI checks
  • Diagnosing failure modes (over-segmentation vs under-segmentation) using TP/FP/FN patterns

Concept explained simply

Intersection over Union (IoU) and the Dice coefficient measure the overlap between predicted and ground-truth masks.

  • IoU (Jaccard index): IoU = TP / (TP + FP + FN)
  • Dice (equivalent to F1): Dice = 2TP / (2TP + FP + FN)
  • Relation: Dice = 2 × IoU / (1 + IoU). Both range from 0 (no overlap) to 1 (perfect overlap); a minimal code sketch of both formulas follows this list.
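
Both formulas drop straight into code. Here is a minimal Python sketch (the function name iou_dice is an illustrative choice; the all-zero case follows the empty-mask convention discussed under "Practical details" below):

```python
def iou_dice(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """IoU (Jaccard) and Dice (F1) computed from pixel counts."""
    if tp + fp + fn == 0:
        # Both masks empty: score as perfect agreement (convention).
        return 1.0, 1.0
    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return iou, dice


iou, dice = iou_dice(tp=1200, fp=300, fn=500)
print(iou, dice)            # 0.6 0.75
print(2 * iou / (1 + iou))  # ~0.75, matching Dice = 2 × IoU / (1 + IoU)
```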

Mental model

Think of two shapes on a canvas. IoU is the overlap area divided by the total area covered by either shape (the union). Dice is the overlap divided by the average size of the two shapes, so for the same prediction Dice is never lower than IoU, and it tends to be smoother and more forgiving of small boundary shifts.

When to favor each
  • IoU: Common benchmark metric (usually reported as mIoU); penalizes any mismatch more harshly than Dice
  • Dice: Often used as a loss or validation metric; smoother with small objects or fuzzy boundaries

Practical details you must get right

  • Thresholding: Convert probabilities to binary masks (e.g., p >= 0.5). For comparison fairness, report the chosen threshold or sweep over thresholds.
  • Averaging across classes:
    • Macro: average the metric per class equally
    • Weighted macro: weight by class frequency
    • Micro: compute global TP/FP/FN across all classes first, then compute the metric
  • Empty-mask cases:
    • If ground truth and prediction are both empty: define IoU=1, Dice=1 (perfect agreement)
    • If ground truth empty but prediction not: IoU=0, Dice=0 (false positive)
  • Soft vs hard metrics: For training or calibration, you may use soft Dice with probabilities; for reporting, prefer thresholded hard masks for clarity.
  • Smoothing epsilon: When denominators can be 0, add a tiny value (e.g., 1e-7) to avoid division by zero. (A sketch combining these conventions appears after this list.)
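
Taken together, these conventions fit in a few lines of NumPy. The following is a sketch under the assumptions above (threshold 0.5, epsilon 1e-7, both-empty means perfect agreement), not a reference implementation:

```python
import numpy as np

def binary_iou_dice(prob: np.ndarray, gt: np.ndarray,
                    threshold: float = 0.5, eps: float = 1e-7):
    """Hard IoU/Dice for one binary mask pair.

    prob: predicted probabilities in [0, 1]; gt: binary ground-truth mask.
    """
    pred = prob >= threshold                 # thresholding convention
    gt = gt.astype(bool)

    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()

    if tp + fp + fn == 0:                    # both masks empty
        return 1.0, 1.0

    # eps is redundant after the empty check, but kept for parity with
    # soft variants where the counts are floats.
    iou = tp / (tp + fp + fn + eps)
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    return float(iou), float(dice)
```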

Worked examples

Example 1 — Binary segmentation

Given TP=1200, FP=300, FN=500:

  • IoU = 1200 / (1200 + 300 + 500) = 1200 / 2000 = 0.60
  • Dice = 2×1200 / (2×1200 + 300 + 500) = 2400 / 3200 = 0.75
  • Relation check: 2×0.60 / (1 + 0.60) = 1.20 / 1.60 = 0.75

Example 2 — Empty ground truth and prediction
  • GT empty, Pred empty → IoU=1, Dice=1 (perfect agreement)
  • GT empty, Pred not empty → IoU=0, Dice=0 (all predicted pixels are FP)

Example 3 — Multi-class (macro and micro)

Three classes (ignore background). Per-class TP, FP, FN:

  • Class A: TP=50, FP=10, FN=20 → IoU=50/80=0.625; Dice=100/(100+30)=0.769
  • Class B: TP=30, FP=15, FN=15 → IoU=30/60=0.500; Dice=60/(60+30)=0.667
  • Class C: TP=40, FP=20, FN=40 → IoU=40/100=0.400; Dice=80/(80+60)=0.571

Macro mIoU = (0.625 + 0.500 + 0.400)/3 = 0.508

Macro mDice = (0.769 + 0.667 + 0.571)/3 = 0.669

Micro totals: TP=120, FP=45, FN=75 → IoU=120/240=0.500; Dice=240/(240+120)=0.667
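
The macro/micro distinction maps directly to code. A short sketch that reproduces the numbers above from the per-class counts:

```python
# Per-class (TP, FP, FN) counts from Example 3.
counts = {"A": (50, 10, 20), "B": (30, 15, 15), "C": (40, 20, 40)}

def iou(tp, fp, fn):
    return tp / (tp + fp + fn)

def dice(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)

# Macro: compute the metric per class, then average the values.
macro_iou = sum(iou(*c) for c in counts.values()) / len(counts)
macro_dice = sum(dice(*c) for c in counts.values()) / len(counts)

# Micro: pool TP/FP/FN across classes first, then compute once.
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))

print(round(macro_iou, 3), round(macro_dice, 3))              # 0.508 0.669
print(round(iou(tp, fp, fn), 3), round(dice(tp, fp, fn), 3))  # 0.5 0.667
```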

How to compute IoU and Dice (step-by-step)

  1. Prepare binary masks (per class if multi-class): threshold probabilities consistently.
  2. Count TP, FP, FN pixel-wise for each class.
  3. Compute IoU and Dice from counts. Add a small epsilon in denominators if needed.
  4. Choose averaging: macro, weighted, or micro. State the choice in reports.
  5. Handle empty-mask cases with clear conventions.
  6. Optionally sweep thresholds to see stability and choose an operating point. (A sketch of steps 2–3 follows this list.)
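
Here is a minimal sketch of steps 2–3 for integer label maps (one class id per pixel). Treating class 0 as ignorable background is an illustrative assumption; adapt it to your own label mapping:

```python
import numpy as np

def per_class_iou_dice(pred, gt, num_classes, ignore_index=0, eps=1e-7):
    """Count TP/FP/FN per class, then compute IoU and Dice from the counts."""
    results = {}
    for c in range(num_classes):
        if c == ignore_index:
            continue
        p, g = pred == c, gt == c
        tp = np.logical_and(p, g).sum()
        fp = np.logical_and(p, ~g).sum()
        fn = np.logical_and(~p, g).sum()
        if tp + fp + fn == 0:
            results[c] = (1.0, 1.0)          # class absent in both masks
        else:
            results[c] = (float(tp / (tp + fp + fn + eps)),
                          float(2 * tp / (2 * tp + fp + fn + eps)))
    return results

# Illustrative usage on tiny 4x4 label maps (values are class ids).
gt = np.array([[0, 1, 1, 2], [0, 1, 2, 2], [0, 0, 2, 2], [0, 0, 0, 3]])
pred = np.array([[0, 1, 2, 2], [0, 1, 2, 2], [0, 0, 2, 2], [0, 0, 0, 0]])
for c, (i, d) in per_class_iou_dice(pred, gt, num_classes=4).items():
    print(f"class {c}: IoU={i:.3f}  Dice={d:.3f}")
```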

Common mistakes and how to self-check

  • Mixing background with foreground classes: Decide if background is a class; be consistent across training and evaluation.
  • Reporting only a single number for a heavily imbalanced dataset: Also share per-class metrics or macro averages.
  • Ignoring threshold sensitivity: Validate metrics at multiple thresholds or use PR/ROC analysis.
  • Silent divide-by-zero: Always use epsilon; log cases where both masks are empty.
  • Comparing soft Dice to hard IoU: Keep apples-to-apples; use the same mask type.

Self-check
  • Can you explain the difference between macro, weighted macro, and micro?
  • Do you have a written rule for empty-mask handling?
  • Are thresholds and class mappings documented?

Exercises (hands-on)

Do these now, then compare with the solutions below.

Exercise 1 — Binary IoU and Dice

Given a binary segmentation task with TP=1200, FP=300, FN=500, compute IoU and Dice to two decimals.

  • Show both the formula and your intermediate denominator values.

Exercise 2 — Multi-class macro and micro

For three classes (ignore background) with per-class counts:

  • Class A: TP=50, FP=10, FN=20
  • Class B: TP=30, FP=15, FN=15
  • Class C: TP=40, FP=20, FN=40

Compute macro mIoU and mDice. Then compute micro IoU and Dice using totals across classes. Round to three decimals.

Checklist before you check solutions
  • Wrote the exact formulas used
  • Showed denominators for IoU and Dice
  • Stated rounding rules
  • Explained whether background is included

Practical projects

  • Build a segmentation evaluation script: Given GT and predicted masks, output per-class IoU/Dice, macro/micro averages, and threshold sweep results.
  • Error analysis dashboard: Visualize FP hot spots by overlaying masks and sorting images by lowest IoU.
  • Calibration study: Compare metrics at thresholds from 0.3 to 0.7 and pick an operating point that balances FP and FN for your use case (a minimal sweep sketch follows this list).
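
As a starting point for the calibration study, here is a minimal threshold sweep on stand-in random data (replace prob and gt with your model's probability map and ground-truth mask):

```python
import numpy as np

def hard_dice(prob, gt, t):
    """Dice at threshold t; both-empty case scores 1.0 by convention."""
    pred = prob >= t
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    return 1.0 if tp + fp + fn == 0 else 2 * tp / (2 * tp + fp + fn)

rng = np.random.default_rng(0)
prob = rng.random((64, 64))        # stand-in probability map
gt = rng.random((64, 64)) > 0.5    # stand-in binary ground truth

# Sweep 0.3-0.7 in steps of 0.05, as in the calibration study above.
for t in np.arange(0.3, 0.71, 0.05):
    print(f"t={t:.2f}  Dice={hard_dice(prob, gt, t):.3f}")
```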

Learning path

  • Before this: Confusion matrix fundamentals; segmentation basics
  • Now: IoU and Dice metrics, averaging choices, and edge cases
  • Next: Calibration, precision/recall curves for segmentation, and panoptic metrics if needed

Next steps

  • Compute both IoU and Dice for your current project and compare macro vs micro trends
  • Decide and document your empty-mask policy
  • Run a small threshold sweep and plot metric vs threshold

Mini challenge

You are evaluating a medical segmentation model with many tiny lesions and severe class imbalance. What metric and averaging would you report as the main number, and what two supporting plots would you share with stakeholders? Justify briefly.

When you are ready, take the Quick Test below.

Solutions

  • Exercise 1: IoU = 1200 / 2000 = 0.60; Dice = 2400 / 3200 = 0.75
  • Exercise 2: macro mIoU = 0.508, macro mDice = 0.669; micro IoU = 0.500, micro Dice = 0.667 (totals: TP=120, FP=45, FN=75)

Metrics for Segmentation: IoU and Dice — Quick Test

Test your knowledge with 6 questions. Pass with 70% or higher.
