What an Analytics Engineer does
An Analytics Engineer builds reliable, documented, and tested data models that power metrics, dashboards, and decision-making. You sit between data engineering and analytics: you transform raw data into clean, business-ready tables and semantic layers, usually in a cloud data warehouse using SQL and dbt.
Typical day-to-day
- Translate business questions into data models and metrics definitions.
- Write SQL/dbt models that follow sound data modeling patterns (e.g., star schemas).
- Add tests, documentation, and version control to data code.
- Schedule and monitor pipelines with an orchestrator.
- Review PRs, fix data quality issues, and communicate changes to stakeholders.
- Partner with BI developers and analysts to ensure dashboards use governed metrics.
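Much of this day-to-day work lands in dbt staging models. A minimal sketch of one might look like the following (the source, table, and column names here are hypothetical, not from any real project):

```sql
-- models/staging/stg_orders.sql (hypothetical source and columns)
-- Standardizes raw orders: renames columns, casts types, filters test rows.
with source as (
    select * from {{ source('shop', 'raw_orders') }}
)

select
    id                          as order_id,
    customer                    as customer_id,
    cast(total as numeric)      as amount,
    lower(status)               as status,
    cast(created_at as timestamp) as ordered_at
from source
where is_test = false
```

Downstream marts then `ref()` this model instead of touching the raw table directly, which keeps renames and type fixes in one place.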
Typical deliverables
- Clean, documented fact and dimension tables for analytics.
- dbt projects with tests (unique, not_null, accepted_values) and snapshots.
- Clear metric definitions (grain, filters, windowing, attribution).
- Runbooks, data SLAs, and alerting for critical datasets.
- Performance-optimized warehouse queries and models.
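The unique/not_null/accepted_values tests mentioned above are usually declared in dbt's YAML schema files, but the same logic can be expressed as a singular SQL test: a query that returns failing rows. A sketch, assuming a model named `fct_orders` with these columns:

```sql
-- tests/assert_fct_orders_valid.sql (hypothetical model and columns)
-- dbt singular test: any rows returned are treated as failures.
select order_id
from {{ ref('fct_orders') }}
where order_id is null
   or amount < 0
   or status not in ('placed', 'shipped', 'delivered', 'returned')
```

Zero rows returned means the test passes; `dbt test` runs every `.sql` file in the tests directory this way.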
Who this is for
- Aspiring data professionals who enjoy both engineering discipline and business impact.
- Analysts who want to level up into robust, code-driven data modeling.
- Software/ETL engineers who want to own analytics-ready data and metrics.
Prerequisites
- Comfortable with basic SQL (SELECT, JOIN, GROUP BY, WHERE).
- Basic command line and Git familiarity.
- High-level understanding of how data moves from sources to a warehouse.
Nice-to-have but not required
- Experience with a cloud warehouse (BigQuery, Snowflake, Redshift).
- Some Python knowledge for orchestration or utilities.
- Business context in marketing, product, finance, or operations.
Hiring expectations by level
Junior
- Can write solid SQL and follow a modeling pattern with guidance.
- Uses dbt basics: models, refs, documentation, and simple tests.
- Needs support designing schemas and writing robust tests.
- Delivers tasks with clear acceptance criteria and code reviews.
Mid
- Designs star schemas; defines model contracts and data SLAs.
- Implements incremental models and performance tuning strategies.
- Builds orchestrated pipelines and adds alerting and observability.
- Leads stakeholder discovery and translates requirements into stable metric definitions.
Senior
- Owns the modeling layer and standards across domains.
- Architects quality frameworks, lineage, and governance.
- Coaches others; reviews for design, performance, and test coverage.
- Partners with leadership on roadmap, trade-offs, and data adoption.
Salary ranges
- Junior: $55k–85k
- Mid: $80k–120k
- Senior: $110k–160k+
- Lead/Staff: $140k–200k+
- Freelance/Contract: $50–120 per hour
These figures vary by country and company; treat them as rough ranges.
Where you can work
- Industries: SaaS, e-commerce, fintech, health, logistics, marketplaces, gaming, public sector.
- Teams: Data/Analytics, BI/Reporting, Product Analytics, RevOps, Finance Analytics.
- Company sizes: Startups (own the stack), mid-size (domain ownership), enterprise (governance and scale).
Learning path
- SQL fundamentals → joins, window functions, aggregates.
- Data modeling → star schemas, grains, slowly changing dimensions.
- dbt concepts → models, refs, tests, docs, snapshots, incremental.
- ETL/ELT patterns → staging, marts, idempotency.
- Version control → Git branching, PRs, code reviews.
- Testing & data quality → assertions, freshness, contracts.
- Orchestration → task dependencies, retries, SLAs.
- Performance tuning → partitions, clustering, indexes, costs.
- Documentation → model docs, runbooks, change logs.
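As one concrete example from the SQL fundamentals step, window functions are the standard tool for deduplication: keep only the latest row per key. The table and column names below are illustrative:

```sql
-- Keep only the most recent row per order_id using row_number().
with ranked as (
    select
        *,
        row_number() over (
            partition by order_id
            order by updated_at desc
        ) as rn
    from raw_orders
)
select *
from ranked
where rn = 1
```

The same pattern, with `order by` flipped, answers "first touch" questions in attribution work.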
Mini task: your first modeling backlog
- Pick one business question (e.g., weekly active users).
- Define the grain, source tables, and metrics logic.
- List 3 tests to ensure trust.
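For the weekly active users example, the metric logic might be sketched like this (assuming an events table `fct_events` with `user_id` and `event_at` columns):

```sql
-- Weekly active users: distinct users per calendar week.
-- Grain: one row per week_start.
select
    date_trunc('week', event_at) as week_start,
    count(distinct user_id)      as weekly_active_users
from fct_events
group by 1
order by 1
```

Writing the grain as a comment in the model itself makes the definition hard to misread later.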
Practical portfolio projects
- Event analytics mart
  Outcome: A dbt project that transforms raw event logs into a star schema (fact_events + dim_users + dim_sessions) with documentation and tests.
  Include: incremental model for events, freshness tests, and a metrics table.
- Sales and marketing attribution
  Outcome: A modeled funnel from leads to revenue with a clear attribution rule and acceptance tests.
  Include: one snapshot for SCD Type 2 on lead status.
- Data quality dashboard
  Outcome: A dashboard showing completeness, freshness, and failure rates of key models.
  Include: alerts when SLAs break and a runbook for incident response.
- Warehouse performance makeover
  Outcome: Before/after benchmarks for heavy queries, with partitioning/clustering or indexes and cost reduction notes.
  Include: a short write-up of trade-offs.
- Metrics layer prototype
  Outcome: Canonical definitions of 5 core KPIs with grain, filters, and calculation logic, linked to underlying models.
  Include: tests for metric validity (e.g., sum of components equals total).
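The incremental event model in the first project could be sketched roughly as follows (the config and column names are assumptions for illustration):

```sql
-- models/marts/fct_events.sql (sketch; names are hypothetical)
{{ config(
    materialized='incremental',
    unique_key='event_id'
) }}

select
    event_id,
    user_id,
    session_id,
    event_type,
    event_at
from {{ ref('stg_events') }}

{% if is_incremental() %}
  -- Only process events newer than what is already in the table.
  where event_at > (select max(event_at) from {{ this }})
{% endif %}
```

On the first run dbt builds the full table; subsequent runs only scan and merge new events, which is what keeps large event marts affordable.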
Tips to make projects stand out
- Show diffs and PRs with review comments.
- Explain assumptions and edge cases in docs.
- Add tests and monitoring, not just transformations.
Skill map
- SQL — Your primary language for modeling and analysis; enables joins, window logic, and metric aggregations.
- Data Modeling (Dimensional/Star) — Produces intuitive, performant schemas for BI and self-serve analytics.
- dbt Concepts — The standard toolkit for modular, testable, documented transformations.
- ETL/ELT Patterns — Structures pipelines from raw staging to curated marts.
- Version Control (Git) — Ensures collaboration, code reviews, and safe releases.
- Testing Data Pipelines — Builds trust through assertions, contracts, and CI checks.
- Documentation — Speeds onboarding and reduces accidental metric drift.
- Orchestration Basics — Delivers reliable, observable runs with SLAs and retries.
- Data Quality Frameworks — Systematically prevents and detects issues.
- Warehouse Performance Tuning — Reduces costs and latency for large datasets.
Mini task: draft a model contract
Write a short spec for a fact table:
- Grain: one row per order_id per day
- Primary key: order_id, order_date
- Not null: order_id, order_date, customer_id, amount
- Accepted values: status in {placed, shipped, delivered, returned}
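The contract above can be enforced with a singular dbt test that returns any violating rows (a sketch; `fct_orders` is the assumed model name):

```sql
-- tests/assert_fct_orders_contract.sql
-- Returns rows violating the contract; zero rows means the test passes.
select order_id, order_date
from {{ ref('fct_orders') }}
where order_id is null
   or order_date is null
   or customer_id is null
   or amount is null
   or status not in ('placed', 'shipped', 'delivered', 'returned')

union all

-- Grain check: order_id + order_date must be unique.
select order_id, order_date
from {{ ref('fct_orders') }}
group by order_id, order_date
having count(*) > 1
```

A contract without an executable check tends to drift; pairing the spec with a test keeps them honest.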
Interview preparation checklist
- Explain the difference between staging models and marts.
- Design a star schema for a real scenario (orders, products, customers).
- Write a SQL query using window functions (running totals, deduplication).
- Choose a dbt materialization for a given workload and justify the choice.
- Define a metric’s grain, filters, and edge cases clearly.
- Describe your data quality approach: tests, monitoring, SLAs.
- Walk through a production incident and how you prevented recurrence.
- Talk about versioning, branching, and code review standards.
- Show performance tuning tactics with before/after results.
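For the window-function item, a running total is a common warm-up. A sketch with illustrative table and column names:

```sql
-- Running revenue total per customer, ordered by order date.
select
    customer_id,
    order_date,
    amount,
    sum(amount) over (
        partition by customer_id
        order by order_date
        rows between unbounded preceding and current row
    ) as running_revenue
from fct_orders
```

Being able to explain the frame clause (`rows between ...`) and why it matters with ties is what separates a memorized answer from a understood one.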
Practice prompts
- How would you ensure Looker/Tableau uses governed metrics?
- What alerts would you set for a daily revenue model?
- How do you handle late-arriving data in incremental models?
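One common answer to the late-arriving-data prompt is to reprocess a trailing lookback window rather than only strictly new rows. A sketch (the three-day window and all names are assumptions; interval syntax varies by warehouse):

```sql
-- Incremental model with a 3-day lookback to catch late-arriving events.
{{ config(materialized='incremental', unique_key='event_id') }}

select event_id, user_id, event_type, event_at
from {{ ref('stg_events') }}

{% if is_incremental() %}
  -- Reprocess a trailing window; unique_key deduplicates the overlap.
  where event_at > (
      select max(event_at) - interval '3 days' from {{ this }}
  )
{% endif %}
```

The trade-off is extra scanning on every run in exchange for correctness, which is usually the right call for revenue-adjacent models.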
Common mistakes and how to avoid them
- Overfitting models to one dashboard → Model by domain and grain; keep transformations reusable.
- Skipping tests and docs → Add tests and short docs alongside every model change.
- Ignoring performance → Partition/cluster heavy tables; monitor costs and cache usage.
- Metric drift → Centralize metric definitions and review changes via PRs.
- Weak stakeholder alignment → Run a metrics kickoff: definitions, edge cases, and acceptance criteria.
- One big daily job → Break into dependencies with retries and alerting.
Mini task: quality guardrails
- List 3 critical models and add not_null/unique tests.
- Set freshness thresholds and alerts.
- Create a simple incident runbook.
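A freshness threshold can be checked with a simple query like this (the model name, timestamp column, and 24-hour threshold are assumptions; dbt also supports declarative source freshness checks):

```sql
-- Fails (returns a row) if the newest record is older than 24 hours.
select max(loaded_at) as latest_load
from {{ ref('fct_orders') }}
having max(loaded_at) < current_timestamp - interval '24 hours'
```

Wired into a scheduled test run, a returned row becomes the alert that triggers the runbook.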
Next steps
Pick one skill below, set a one-week goal, and ship a small project end-to-end. Then take the exam to benchmark your readiness. You can do this for free; if you log in, your progress will be saved.
Ready? Pick a skill to start in the Skills section.