What Applied Scientists do
Applied Scientists turn research ideas into real product impact. You frame ambiguous problems, design metrics and experiments, build ML models, and work with engineers to ship and iterate. You own the scientific rigor and the business outcome.
Day-to-day
- Clarify the objective and constraints with product and engineering.
- Audit data, define labels, and design an evaluation plan.
- Train baseline and advanced models; run ablations and error analysis.
- Propose, run, and analyze A/B tests; recommend rollout decisions.
- Collaborate on feature pipelines, inference efficiency, and monitoring.
- Write decision docs, model cards, and experiment reports.
Typical deliverables
- Problem framing doc with hypotheses and success metrics.
- Reproducible notebooks and scripts; baselines and comparisons.
- Offline evaluation reports and online experiment analyses.
- Data/labeling strategy and quality checks.
- Model card (intended use, limitations, fairness, monitoring).
Peek into a week
- Mon: Align on goal and constraints; finalize metrics and guardrails.
- Tue: Data audit, labeling plan, leakage checks.
- Wed: Train baselines; run ablations; error analysis.
- Thu: Prepare A/B test; write decision doc; review with stakeholders.
- Fri: Optimize latency; add monitoring; plan next iteration.
Hiring expectations by level
Junior / Entry
- Can frame a well-scoped problem with guidance.
- Implements baselines; understands evaluation metrics.
- Comfortable with notebooks, versioning, and data hygiene.
- Communicates results clearly to the team.
Mid-level
- Owns an end-to-end project from framing to experiment and rollout.
- Selects appropriate models; designs ablations and guardrails.
- Partners effectively with engineers on data and inference.
- Balances offline metrics with business impact.
Senior / Staff
- Leads ambiguous, cross-team initiatives and sets scientific standards.
- Shapes metrics, experimentation culture, and roadmap.
- Improves efficiency, reliability, and responsible AI practices.
- Mentors scientists; influences product strategy with evidence.
Salary ranges
- Junior: ~$90k–$140k
- Mid-level: ~$130k–$200k
- Senior/Staff: ~$180k–$300k+
Figures vary by country and company; treat them as rough ranges.
Where you can work
- Industries: Search, ads, e-commerce, marketplaces, fintech, health, media, logistics, robotics, autonomous systems, cloud platforms.
- Teams: Ranking/Search, Recommendations, NLP, Vision, Personalization, Risk/Fraud, Ops optimization, ML Platform.
Skill map for Applied Scientists
Research Problem Framing
Translate ambiguous goals into a measurable ML problem with clear hypotheses and constraints.
- Define target, metric, and success criteria.
- Identify levers, risks, and guardrails.
Applied ML Modeling
Build, compare, and refine models from strong baselines to advanced architectures.
- Choose features/representations; regularize; calibrate.
- Run ablations and error buckets.
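A minimal sketch of a baseline-vs-model comparison with a calibration check, using scikit-learn on synthetic data (the models and metrics here are illustrative choices, not a prescription):

```python
# Minimal sketch: baseline vs. stronger model with a calibration check.
# Synthetic data and model choices are illustrative, not prescriptive.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, brier_score_loss

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, model in [("baseline-logreg", LogisticRegression(max_iter=1000)),
                    ("gbdt", GradientBoostingClassifier(random_state=0))]:
    model.fit(X_tr, y_tr)
    p = model.predict_proba(X_te)[:, 1]
    # AUC measures ranking quality; Brier score also penalizes miscalibration.
    print(f"{name}: AUC={roc_auc_score(y_te, p):.3f} "
          f"Brier={brier_score_loss(y_te, p):.3f}")
```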
Data And Label Strategy
Design labeling, stratified sampling, and quality controls to ensure a reliable training signal.
- Active learning; inter-annotator agreement; drift checks.
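As one example, inter-annotator agreement is easy to track with Cohen's kappa; the labels below are toy values standing in for two annotators' judgments on the same items:

```python
# Minimal sketch: inter-annotator agreement with Cohen's kappa.
# In practice, sample overlapping items across annotators and track
# kappa over time as a labeling-quality control.
from sklearn.metrics import cohen_kappa_score

annotator_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
annotator_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 0.58 here; >0.8 is often read as strong
```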
Experimentation And Evaluation
Measure what matters with offline metrics and online tests.
- A/B tests, CUPED, sequential testing, power analysis.
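A minimal sketch of the CUPED adjustment, assuming you have each user's pre-experiment value of the same metric (arrays here are synthetic):

```python
# Minimal sketch: CUPED variance reduction using a pre-experiment covariate.
# In practice y is the in-experiment metric and x is the same metric
# measured for the same users before the experiment started.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10, 3, 10_000)            # pre-experiment metric
y = x + rng.normal(0, 1, 10_000)         # in-experiment metric (correlated)

theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
y_cuped = y - theta * (x - x.mean())     # adjusted metric, same mean as y

print(f"var(y)       = {np.var(y):.2f}")
print(f"var(y_cuped) = {np.var(y_cuped):.2f}")  # lower variance, higher power
```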
Optimization And Efficiency
Ship models that are fast, reliable, and cost-aware.
- Profiling, quantization, distillation, caching, batching.
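As one cheap win, identical inputs can be memoized; `embed` below is a hypothetical expensive call standing in for real inference:

```python
# Minimal sketch: memoize repeated inference calls for identical inputs.
# `embed` is a hypothetical expensive call; real systems usually use an
# external cache (e.g., Redis) keyed on a stable input hash.
from functools import lru_cache

@lru_cache(maxsize=100_000)
def embed(text: str) -> tuple:
    # Placeholder for an expensive model call; tuples are hashable/cacheable.
    return tuple(float(ord(c)) for c in text[:8])

embed("cheap running shoes")   # computed
embed("cheap running shoes")   # served from cache
print(embed.cache_info())      # hits=1, misses=1
```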
Production Collaboration
Work with engineers to deploy safely and monitor effectively.
- Feature pipelines, schemas, contracts, alerts, rollback plans.
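One common drift alarm shared between scientists and engineers is the population stability index (PSI); a minimal sketch on synthetic data (the 0.2 alert threshold is a rule of thumb, not a standard):

```python
# Minimal sketch: population stability index (PSI) as a simple drift alarm
# on one feature, comparing training-time and live distributions.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    e, _ = np.histogram(expected, bins=edges)
    a, _ = np.histogram(actual, bins=edges)
    e_pct = np.clip(e / e.sum(), 1e-6, None)   # avoid log(0)
    a_pct = np.clip(a / a.sum(), 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train = rng.normal(0, 1, 50_000)
live = rng.normal(0.8, 1, 50_000)          # noticeably shifted distribution
score = psi(train, live)
if score > 0.2:                            # alert threshold (rule of thumb)
    print(f"ALERT: feature drift, PSI={score:.3f}")
```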
Scientific Communication
Write crisp decision docs and tell the story with evidence.
- Assumptions, risks, trade-offs, and recommendations.
Responsible AI Practices
Reduce harm and bias; document limitations and mitigation.
- Fairness metrics, privacy, transparency, and model cards.
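A minimal sketch of one such check, a per-group true-positive-rate gap on toy data (`group` stands in for a protected attribute you are permitted to use for auditing):

```python
# Minimal sketch: per-group true-positive-rate gap as one fairness check.
# Arrays are toy data; real audits need enough samples per group for
# the rates to be meaningful.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0])
group  = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])

def tpr(y_t, y_p):
    mask = y_t == 1
    return (y_p[mask] == 1).mean() if mask.any() else float("nan")

rates = {g: tpr(y_true[group == g], y_pred[group == g])
         for g in np.unique(group)}
print(rates, "gap:", max(rates.values()) - min(rates.values()))
```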
Who this is for and prerequisites
Who this is for
- Analytical builders who enjoy turning ideas into shipped ML features.
- People comfortable mixing research depth with product pragmatism.
- Collaborators who like working across product, data, and engineering.
Prerequisites
- Comfort with Python, data manipulation, and version control.
- Statistics basics: hypothesis testing, confidence intervals, regression.
- ML basics: supervised learning, overfitting, cross-validation.
- Willingness to write and present clear technical documents.
Learning path
- Frame problems well — Practice turning product goals into measurable ML tasks with success metrics and guardrails.
- Data and labels — Design sampling/labeling strategies; check for leakage and drift (see the leakage sketch after this list).
- Strong baselines — Start simple; compare systematically; run ablations and calibration.
- Evaluate rigorously — Connect offline metrics to online impact; plan A/B tests with power.
- Optimize for production — Meet latency and cost; add monitoring and rollback plans.
- Communicate decisions — Write concise docs that move decisions forward.
- Ship responsibly — Incorporate harm analysis, fairness checks, and model cards.
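A minimal sketch of the leakage guard from the "Data and labels" step: split by time rather than at random, and assert that no test rows predate training rows (the DataFrame below is hypothetical):

```python
# Minimal sketch: a temporal split as a basic leakage guard. If a model
# scores suspiciously higher on a random split than on a time-based split,
# investigate features that peek into the future.
import pandas as pd

# Hypothetical frame with an event timestamp, a feature, and a label.
df = pd.DataFrame({
    "ts": pd.date_range("2024-01-01", periods=1000, freq="h"),
    "feature": range(1000),
    "label": [i % 2 for i in range(1000)],
})

df = df.sort_values("ts")
split = int(len(df) * 0.8)
train, test = df.iloc[:split], df.iloc[split:]
assert train["ts"].max() <= test["ts"].min()  # no temporal overlap
```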
Mini task: From goal to metric
Pick a goal (e.g., increase relevant clicks). Write:
- Primary metric and guardrails.
- Hypothesis and expected effect size.
- Offline proxy and leakage risks.
Practical projects
1) Search ranking baseline → production-ready
Build a BM25 + learning-to-rank baseline, define NDCG@K, run error buckets, and propose an A/B plan (see the NDCG@K sketch below).
- Outcome: Decision doc with metrics, ablations, and rollout plan.
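If you want a starting point, here is a minimal NDCG@K on toy relevance grades (scikit-learn's `ndcg_score` is a ready-made alternative):

```python
# Minimal sketch: NDCG@K for one query, given graded relevance of results
# in ranked order. Relevance grades below are toy values.
import numpy as np

def dcg_at_k(rel, k: int) -> float:
    rel = np.asarray(rel, dtype=float)[:k]
    return float(np.sum((2**rel - 1) / np.log2(np.arange(2, rel.size + 2))))

def ndcg_at_k(rel, k: int) -> float:
    ideal = dcg_at_k(sorted(rel, reverse=True), k)
    return dcg_at_k(rel, k) / ideal if ideal > 0 else 0.0

print(ndcg_at_k([3, 2, 0, 1], k=4))  # ~0.99 for this near-ideal ranking
```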
2) Active learning label strategy
Create a small labeled set, train a classifier, use uncertainty sampling to select the next batch, and measure label efficiency (see the sampling sketch below).
- Outcome: Label budget vs accuracy curve and label QA checklist.
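A minimal sketch of one least-confidence sampling round on synthetic data (batch size and model choice are arbitrary):

```python
# Minimal sketch: least-confidence uncertainty sampling for the next
# labeling batch. The loop would repeat once new labels arrive.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, random_state=0)
labeled = np.arange(100)                     # indices we have labels for
pool = np.setdiff1d(np.arange(len(X)), labeled)

clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
proba = clf.predict_proba(X[pool])
uncertainty = 1 - proba.max(axis=1)          # least-confidence score
next_batch = pool[np.argsort(uncertainty)[-50:]]  # 50 most uncertain items
# Send `next_batch` to annotators, add the labels, retrain, repeat.
```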
3) Online metric design
Design a North Star metric with guardrails; simulate an A/B test with power analysis (see the power sketch below).
- Outcome: Experiment plan with hypothesis, power, MDE, and stop rules.
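A minimal power calculation with statsmodels, assuming a two-sample t-test and a standardized (Cohen's d) MDE; swap in your own effect size:

```python
# Minimal sketch: sample size per arm for a two-sample t-test at a
# target minimum detectable effect, expressed as Cohen's d.
from statsmodels.stats.power import tt_ind_solve_power

n_per_arm = tt_ind_solve_power(effect_size=0.05,  # small standardized MDE
                               alpha=0.05, power=0.8,
                               alternative="two-sided")
print(f"~{n_per_arm:,.0f} users per arm")  # ~6,280 for d=0.05
```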
4) Latency and cost optimization
Profile inference, apply quantization or distillation, and add caching (see the quantization sketch below).
- Outcome: Before/after latency, QPS, cost, and accuracy delta report.
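A minimal sketch with PyTorch dynamic quantization on linear layers (the model and shapes are placeholders; always re-measure accuracy after quantizing):

```python
# Minimal sketch: dynamic int8 quantization of linear layers in PyTorch,
# with a crude latency comparison. Model and shapes are placeholders.
import time
import torch

model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.ReLU(),
                            torch.nn.Linear(512, 10)).eval()
q_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)

x = torch.randn(64, 512)
for name, m in [("fp32", model), ("int8", q_model)]:
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(100):
            m(x)
    # 100 batches; *10 converts total seconds to ms per batch.
    print(f"{name}: {(time.perf_counter() - start) * 10:.2f} ms/batch")
```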
5) Responsible AI fairness audit
Choose a model and compute group metrics; propose mitigations and document in a model card.
- Outcome: Model card with fairness metrics and limitations.
6) Cross-functional decision doc
Write a concise doc proposing an ML enhancement, with trade-offs and risks.
- Outcome: 1–2 page decision doc ready for review.
Interview preparation checklist
- Problem framing: turn ambiguous goals into measurable ML tasks.
- Modeling: baselines, regularization, calibration, error analysis.
- Experimentation: metrics, power analysis, guardrails, CUPED/sequential.
- System thinking: feature pipelines, inference paths, monitoring.
- Optimization: profiling, quantization, caching, batching.
- Responsible AI: fairness metrics and mitigations; model cards.
- Communication: crisp, structured answers with trade-offs and decisions.
- Portfolio: 2–3 concise projects with metrics and business outcomes.
Mock interview prompts
- Design an experiment to evaluate a new recommendation feature with tight guardrails.
- Reduce latency by 30% without losing more than 1% accuracy.
- Mitigate label noise in a high-variance dataset.
Common mistakes and how to avoid them
- Skipping problem framing → Always define target, metric, and guardrails first.
- Optimizing only offline metrics → Tie to online outcomes; plan A/B early.
- Data leakage → Audit features, time, and label sources rigorously.
- Weak baselines → Start simple; beat a strong, reproducible baseline.
- No reproducibility → Version data, code, and configs; fix random seeds.
- Ignoring latency/cost → Profile and set budgets; optimize before launch.
- No monitoring → Add alerts for drift, performance, and errors; plan rollback.
- Poor communication → Use clear, structured decision docs with trade-offs.
Next steps
Pick a skill to start in the Skills section below. Complete one practical project, then take the exam to check your readiness.