Why this matters
In computer vision, the model can only learn what you ask it to learn. A clear task and label schema prevents wasted annotation, enables consistent quality checks, and directly impacts model accuracy and cost. Real tasks include: identifying defects on a production line, counting people in retail footage, localizing vehicles for ADAS, segmenting tumors in medical scans, and classifying product images for e-commerce.
Typical professional tasks this unlocks
- Translating business goals into a precise CV task (classification, detection, segmentation, keypoints, OCR).
- Designing unambiguous label definitions and hierarchy.
- Writing annotation guidelines with edge cases and examples.
- Defining data splits, quality metrics, and inter-annotator agreement checks.
- Piloting annotation and iterating schema to reduce noise.
Who this is for
- Computer Vision Engineers and Data Scientists planning datasets.
- Annotation leads and QA reviewers.
- Product managers scoping ML features and acceptance criteria.
Prerequisites
- Basic understanding of image data formats and datasets.
- Familiarity with CV task types: classification, detection, segmentation, keypoints, OCR.
- Basic model metrics: precision/recall, IoU, F1.
Concept explained simply
Defining the task and label schema means deciding exactly what the model must output and how humans will mark it in data. It converts a business question (e.g., "find dents on cars") into concrete labels (e.g., "bounding boxes for dents with severity: minor/major").
Mental model
Think of it like drafting a contract between the business, annotators, and the model:
- Inputs: what images/videos are in scope and what is out of scope.
- Outputs: exact label types and formats.
- Rules: how to handle tricky cases and what to do when uncertain.
- Quality: how success is measured and reviewed.
Picking the right task type
Quick guide
- Classification: one label per image (single-label) or several labels per image (multi-label). Use when presence/absence is enough.
- Object Detection: bounding boxes for each instance. Use for counting and localization.
- Instance Segmentation: precise pixel mask per object. Use when shape matters.
- Semantic Segmentation: pixel mask per class (no instance IDs). Use when object identity is less important.
- Keypoints/Pose: specific landmark coordinates. Use for pose, alignment, or measurements.
- OCR: text localization and transcription. Use for documents, signs, plates.
- Tracking: associate objects across frames. Use for video analytics.
Start with the minimum output needed to meet the business KPI. Simpler tasks cost less and annotate faster.
Designing the label schema
- Define classes: exhaustive list, mutually exclusive where applicable. Provide definitions with positive and negative examples.
- Attributes: optional or required properties (e.g., occluded: yes/no, severity: minor/major).
- Hierarchy: parent/child relationships (Vehicle → Car/Truck/Bus).
- Instance rules: when a new instance starts/ends, and how to merge or split overlapping items.
- Spatial representation: box, polygon, mask, keypoints, line, or text region with transcription.
- Uncertainty handling: allow "uncertain" or "ignore" regions to avoid forcing wrong labels.
- Metadata: scene conditions (lighting, weather), annotator flags, and versioning.
Example label definition template
{
  "class": "Car",
  "definition": "A road vehicle designed primarily for passenger transport with four wheels.",
  "include": ["sedans", "hatchbacks", "SUVs"],
  "exclude": ["pickup trucks", "vans", "golf carts"],
  "representation": "bounding_box",
  "attributes": {
    "occluded": {"type": "boolean", "required": true},
    "truncated": {"type": "boolean", "required": true}
  },
  "edge_cases": [
    "Car behind fence → label as Car with occluded=true",
    "Half outside frame → truncated=true"
  ]
}
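A schema like this can be enforced mechanically before annotations enter the dataset. Below is a minimal Python sketch of such a check; the field names mirror the template above, and validate_annotation is a hypothetical helper, not any specific tool's API.

REQUIRED_ATTRS = {"occluded", "truncated"}

def validate_annotation(ann):
    """Return a list of human-readable problems; empty means valid."""
    problems = []
    if ann.get("class") != "Car":
        problems.append(f"unknown class: {ann.get('class')!r}")
    if ann.get("representation") != "bounding_box":
        problems.append("expected a bounding_box representation")
    attrs = ann.get("attributes", {})
    for name in REQUIRED_ATTRS:
        if not isinstance(attrs.get(name), bool):
            problems.append(f"attribute {name!r} missing or not boolean")
    return problems

# Example: a box missing the required 'truncated' attribute.
print(validate_annotation({
    "class": "Car",
    "representation": "bounding_box",
    "attributes": {"occluded": True},
}))  # -> ["attribute 'truncated' missing or not boolean"]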
Label instructions and edge cases
Write concise instructions people can follow consistently.
- Use must/should language and avoid ambiguity.
- Include visual examples of correct and incorrect labels.
- Call out common edge cases explicitly (small objects, occlusions, motion blur, reflections).
- Define minimum size thresholds (e.g., ignore objects smaller than 12x12 px); a filtering sketch follows this list.
- Define time-based rules for video (e.g., track identity persists across occlusion up to 10 frames).
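Size thresholds are one of the few rules you can enforce fully automatically. A minimal sketch, assuming boxes are stored as (x, y, width, height) tuples in pixels:

MIN_W, MIN_H = 12, 12  # matches the 12x12 px example rule above

def filter_small_boxes(boxes):
    """Drop boxes below the minimum annotatable size."""
    return [b for b in boxes if b[2] >= MIN_W and b[3] >= MIN_H]

boxes = [(10, 10, 50, 30), (200, 40, 8, 8)]  # second box is below threshold
print(filter_small_boxes(boxes))             # -> [(10, 10, 50, 30)]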
Quality metrics and agreement
- Detection: IoU thresholds (e.g., 0.5), precision/recall by class and size bucket; an IoU sketch follows this list.
- Segmentation: mean IoU or Dice; boundary quality if relevant.
- Classification: accuracy, F1, per-class confusion.
- Agreement: inter-annotator agreement (IAA) via overlap metrics or Cohen’s kappa for classification.
- Gold tasks: seed known examples for ongoing QA.
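IoU underpins both detection metrics and geometric agreement checks. A minimal sketch for axis-aligned boxes, assuming an (x1, y1, x2, y2) corner format (masks need a pixel-overlap version instead):

def iou(a, b):
    """Intersection-over-union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

# Two annotators' boxes for the same object; IoU >= 0.5 counts as a match.
print(iou((10, 10, 60, 40), (12, 12, 58, 42)))  # ~0.81, well above 0.5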
Lightweight QA flow
- Pilot: annotate 50–200 samples with two annotators.
- Measure IAA and error types (missing, wrong class, bad geometry); a kappa sketch follows this list.
- Revise schema/instructions; repeat until stable.
- Scale with spot-checks and gold tasks.
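To make the IAA step concrete for classification labels, here is a minimal Cohen's kappa sketch in plain Python (in practice you might reach for scikit-learn's cohen_kappa_score instead):

from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' labels over the same samples."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    expected = sum(ca[c] * cb[c] for c in ca) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["car", "car", "truck", "bus", "car"]
b = ["car", "truck", "truck", "bus", "car"]
print(round(cohen_kappa(a, b), 2))  # 0.69; low values signal schema revision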
Worked examples
Example 1: Retail shelf compliance
- Goal: Detect if each product is present and front-facing.
- Task: Object detection with attributes.
- Classes: {Product_A, Product_B, Product_C}.
- Attributes: facing={front, angled, back}, occluded={yes/no}.
- Ignore: objects under 1% of image area.
- Metric: mAP@0.5, per-class.
Example 2: Road lane segmentation
- Goal: Identify drivable area and lane markings.
- Task: Semantic segmentation.
- Classes: {drivable, lane_marking, curb, background}.
- Rules: lane_marking only on paint; exclude shadows/reflections.
- Metric: mIoU (a per-pixel sketch follows this example); boundary IoU for lane_marking.
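For intuition, mIoU can be computed per class from flattened label maps. A minimal sketch, assuming predictions and ground truth are flat lists of per-pixel class labels:

def mean_iou(pred, gt, classes):
    """Mean IoU over classes for two flat lists of per-pixel labels."""
    ious = []
    for c in classes:
        inter = sum(p == c and g == c for p, g in zip(pred, gt))
        union = sum(p == c or g == c for p, g in zip(pred, gt))
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

pred = ["drivable", "drivable", "lane_marking", "background"]
gt   = ["drivable", "lane_marking", "lane_marking", "background"]
print(mean_iou(pred, gt, ["drivable", "lane_marking", "background"]))  # ~0.67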
Example 3: Face landmarks
- Goal: Align faces for AR filters.
- Task: Keypoints (68 landmarks) + visibility attribute.
- Rules: if landmark occluded, set visibility=false and estimate location if reasonable; otherwise mark as missing.
- Metric: NME (normalized mean error) over visible points; see the sketch below.
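NME divides the mean landmark error by a normalizer (for faces, inter-ocular distance is a common, though dataset-specific, choice). A minimal sketch over visible points:

import math

def nme(pred, gt, visible, norm):
    """Normalized mean error over visible landmarks.
    pred/gt: (x, y) pairs; visible: bools; norm: e.g. inter-ocular distance."""
    errs = [math.dist(p, g) for p, g, v in zip(pred, gt, visible) if v]
    return sum(errs) / (len(errs) * norm)

pred = [(10.0, 10.0), (30.0, 12.0), (20.0, 25.0)]
gt   = [(11.0, 10.0), (30.0, 13.0), (21.0, 26.0)]
print(nme(pred, gt, [True, True, False], norm=20.0))  # -> 0.05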
Step-by-step: from problem to schema
- State the decision you want to support (e.g., "alert staff when shelf gap exists").
- Pick the simplest task that enables that decision.
- List classes and attributes. Make them mutually exclusive or clearly multi-label.
- Choose representation (box/polygon/mask/keypoints) and size thresholds.
- Write edge-case rules and uncertainty policy.
- Define metrics and acceptance thresholds.
- Run a pilot, measure agreement, and iterate.
Exercises
Complete these in your own notes or a doc; the quick test at the end checks the core concepts.
Exercise 1: Supermarket shelf monitoring
Design a label schema to detect and count three products on shelves and flag if any product is not front-facing.
- Deliver: task type, classes, attributes, ignore rules, and metrics.
- Consider: tiny products, reflections, occlusions, and similar packaging.
Exercise 2: Medical polyp detection
Endoscopy video: detect and localize polyps, and mark uncertainty when visibility is poor.
- Deliver: task type, representation, attributes, uncertainty handling, and QA plan.
- Consider: motion blur, specular highlights, and tiny lesions.
Self-check checklist
- Did you pick the simplest task that meets the goal?
- Are classes exhaustive and non-overlapping, or clearly multi-label?
- Do edge cases have explicit rules?
- Is there a defined ignore/uncertain policy?
- Do you have clear metrics and acceptance thresholds?
- Did you plan a small pilot and IAA measurement?
Common mistakes and fixes
- Too many classes: merge where decisions do not require the split.
- Ambiguous definitions: add include/exclude examples and edge-case rules.
- No uncertainty option: forces wrong labels; add uncertain/ignore.
- Skipping pilot: disagreements stay hidden; always pilot and measure IAA.
- Over-precise geometry: use boxes instead of polygons if shape is not needed.
- Poor size thresholds: define minimum size to avoid noise.
Practical projects
- Project 1: Create a 4-class traffic object detection schema with attributes (occluded, truncated). Pilot on 100 images and report IAA and error types.
- Project 2: Build a semantic segmentation schema for indoor rooms (wall, floor, ceiling, window, door). Define ignore regions (mirrors, reflections) and measure mIoU on a validation split.
- Project 3: OCR: define text region detection + transcription conventions (case, punctuation, illegible tag). Create 50 gold examples.
Learning path
- Review task types and when to use each.
- Draft label schema and instructions for a simple dataset.
- Run a 50–200 sample pilot with dual annotation.
- Measure IAA, revise schema, and document changes.
- Scale annotation with periodic QA and gold tasks.
Next steps
- Turn your exercise outputs into a one-page guideline doc.
- Set acceptance metrics and thresholds for go/no-go.
- Prepare a small gold set to monitor drift during scale-up.
Mini challenge
You are asked to detect road signs and also read the speed limit. Propose a two-stage labeling approach that balances cost and accuracy in 5 bullet points.
Take the Quick Test
Ready to check your understanding? Take the quick test below. Everyone can take it; only logged-in users will have results saved.