How to learn Naming Conventions for Physical Data Modeling And Storage Design in Data Architect for free

Why this matters

Naming conventions make data discoverable, code reviewable, and systems maintainable. As a Data Architect, you define names for schemas, tables, columns, files, partitions, topics, indexes, and constraints. Good names prevent production bugs, simplify onboarding, and accelerate analytics.

Real task: Standardize names across a new warehouse (dim_ and fct_ tables, clear column units).
Real task: Define S3/ADLS paths and partition folders that tools can infer automatically.
Real task: Enforce constraint/index naming so incident triage is fast and unambiguous.

Concept explained simply

Names are contracts between people and systems. They should be predictable, stable, and self-explanatory. Pick rules once, then apply everywhere.

Mental model

Think of every name as an API endpoint: short, consistent, and purpose-revealing. If someone can guess a name without asking you, your convention is working.

Core rules and patterns

Case and separators: Use lowercase snake_case for most databases, files, and streams. Avoid spaces and mixedCase.
Singular vs. plural: Tables often plural (dim_customer, fct_orders). Columns singular (customer_id, order_amount).
Prefixes for object role: dim_, fct_ (or fct_), stg_ (staging), rpt_ (reporting marts). Pick one pattern and use it everywhere.
Column clarity: Include units, grain, and role. Examples: amount_usd, temperature_c, created_at_utc, is_active, customer_id.
Time grain suffixes: _h, _d, _wk, _m, _q, _y for hourly/daily/weekly/monthly/quarterly/yearly facts.
SCD type suffix: dim_customer_scd2 for Type 2. Avoid vague names like dim_customer_history.
Constraints/indexes: pk_, fk__, uq_
_, ck_
_, idx_
_
.
Environment tagging: Put env in higher-level containers (schemas/dbs/paths), not in business table names. Example schemas: prd_dw, stg_dw. Tables stay dim_customer, fct_orders.
Reserved words: Avoid keywords (order, group). If unavoidable, qualify: order_id instead of order.
Length and ASCII: Stay short but clear; stick to ASCII for portability.
Abbreviations: Only common, documented ones (qty, amt, id, usd). Avoid cryptic custom abbreviations.
Data lake layout: zone/domain/dataset/grain/partition=val. Use key=value folders for partitions (dt=YYYY-MM-DD, year=YYYY/month=MM/day=DD).
Streaming topics: env.team.domain.entity.event.vN or env.domain.entity.vN. Keep consistent ordering.
Worked examples
Example 1: Warehouse tables and columns
```
-- Schema (environment at schema level, not table name)
prd_dw

-- Tables
prd_dw.dim_customer_scd2
prd_dw.fct_orders_d

-- Columns
-- dim_customer_scd2
customer_id (business key or surrogate if named customer_sk)
customer_sk (surrogate key)
first_name
last_name
email_hash_sha256 (PII handled)
effective_from_utc
effective_to_utc
is_current

-- fct_orders_d
order_id
customer_sk
order_amount_usd
order_created_at_utc
order_date (date)

-- Constraints & indexes
pk_dim_customer_scd2 (customer_sk)
fk_fct_orders_d_customer_sk (customer_sk -> dim_customer_scd2.customer_sk)
idx_fct_orders_d_order_date (order_date)
uq_dim_customer_scd2_bk (customer_id, effective_to_utc)
```
Example 2: Data lake/ADLS paths
```
-- Pattern: <zone>/<domain>/<dataset>/<grain>/dt=YYYY-MM-DD/part=<value>/
prd/finance/orders/d/dt=2024-06-30/region=us/parquet/

-- File names
part-0001-0b.parquet (do not encode business meaning in file name; meaning is in the path)

-- Alternative partitioning
prd/marketing/campaign_events/h/year=2024/month=06/day=30/hour=14/
```
Example 3: Streaming topics
```
-- Pattern: env.team.domain.entity.event.vN
prd.core.sales.order.created.v1
stg.core.sales.order.updated.v1

-- Schema registry subjects align with topic and version
order_created_value.v1
```
Example 4: Staging and reporting
```
stg_dw.stg_orders_raw
stg_dw.stg_customers_clean
rpt_dw.rpt_sales_summary_m
```
Step-by-step: define your standard
1. Choose casing and character set. Default: lowercase snake_case, ASCII only.
2. Decide scope placement. Environments (prd, stg) live in schemas/paths, not table names.
3. Table classes. Pick prefixes (dim_, fct_, stg_, rpt_) and write 1–2 examples for each.
4. Column rules. Keys (_id, _sk), booleans (is_/has_), timestamps (_at_utc), units (_usd, _cm), grain columns (order_date).
5. Constraints and indexes. pk_, fk_, uq_, ck_, idx_ with table and column hints.
6. Lake layout. Define zone/domain/dataset/grain and partition key=value folders.
7. Streaming topics. Fix the token order and versioning strategy.
8. Abbreviations list. Approve common ones; ban unclear ones.
9. Examples + linter checklist. Document 10 canonical examples and a PR checklist.
Copy-ready checklist
- Lowercase snake_case everywhere
- Tables: dim_, fct_, stg_, rpt_ prefixes
- Columns include units/grain where relevant
- Timestamps end with _utc
- Booleans start with is_/has_
- Constraints: pk_/fk_/uq_/ck_ + table + columns
- Indexes: idx_ + table + columns
- Partitions: key=value folders (dt=YYYY-MM-DD)
- Env in schema/path, not table names
- Reserved words avoided
Exercises (do this now)
Exercise 1: Warehouse naming refactor
Rename the following to match the rules above. Include tables, columns, and constraints.
```
Schema: production_dw
Tables:
CustomerHistory, OrdersDaily
Columns:
CustomerHistory: ID, FirstName, LastName, Email, From, To, Current
OrdersDaily: OrderID, CustKey, Amount, CreatedAt, Date
Constraints:
PK on CustomerHistory(ID)
FK on OrdersDaily(CustKey) references CustomerHistory(ID)
```
Need a hint?
- Use dim_customer_scd2 and fct_orders_d.
- Add units and time zone to columns where needed.
Exercise 2: Lake and streaming layout
Design a path and topic names:
```
Goal: store daily orders by region and expose created events.
Current path idea: /prod/orders/20240630/us/
Current topic idea: ordersCreated
```
Need a hint?
- Use zone/domain/dataset/grain and key=value partitions.
- Include env and version in topics.
Self-check checklist
- Lowercase snake_case used consistently
- Table prefixes clarify role (dim_, fct_, stg_, rpt_)
- Columns carry units, UTC timestamps, and clear roles
- Constraints/indexes named with pk_/fk_/idx_ patterns
- Partitions use key=value (dt=YYYY-MM-DD)
- Env appears in schemas/paths or topic tokens, not business table names
Common mistakes and self-check
- Mixing cases and separators. Fix: enforce lowercase snake_case linters in PRs.
- Putting env in table names. Fix: keep env at schema/path level.
- Ambiguous columns (amount, created_at). Fix: add units and timezone: amount_usd, created_at_utc.
- Forgetting grain. Fix: suffix fact names with _h/_d/_m.
- Vague history tables. Fix: scd suffix: dim_customer_scd2.
- Cryptic abbreviations. Fix: maintain an approved abbreviation list.
- Ignoring constraints/index names. Fix: standardize pk_/fk_/idx_ patterns for alerts and tooling.
- Using reserved words. Fix: append role (order_id) or rename entirely.
Quick self-audit
Pick three objects at random. Can a teammate unfamiliar with the system guess their purpose and relationships? If not, rename.
Practical projects
- Create a one-page naming standard with 10 before/after examples and apply it to a toy star schema (2 dims, 1 fact).
- Migrate a small dataset in a data lake to key=value partitions and validate tool compatibility.
- Define streaming topic tokens for 5 core events and simulate a version bump (v1 to v2).
Who this is for
- Data Architects defining enterprise data standards
- Data Engineers implementing warehouses/lakes/streams
- Analytics Engineers building marts and reports
Prerequisites
- Basic SQL and DDL (schemas, tables, constraints)
- Familiarity with data lake folders and partitions
- Awareness of event streaming concepts
Learning path
- Start: Naming conventions (this page)
- Next: Data types and units, partitioning strategy, indexing strategy
- Then: Governance and documentation practices
Next steps
- Adopt the checklist in your next PR
- Refactor a small domain to the standard
- Share the one-page standard with your team
Mini challenge
In 20 minutes, write names for: a Type 2 customer dimension, a daily orders fact, an hourly clickstream fact, an orders_created topic, and a lake path for finance invoices by month. Keep to the rules above.
Take the Quick Test
Ready to check your understanding? Take the Quick Test below. Available to everyone; if you are logged in, your progress will be saved.
Practice Exercises
2 exercises to complete
Instructions
Rename the following objects to a clean standard. Include tables, columns (with units/timezone where relevant), and constraints/indexes.
Schema: production_dw Tables: CustomerHistory, OrdersDaily Columns: CustomerHistory: ID, FirstName, LastName, Email, From, To, Current OrdersDaily: OrderID, CustKey, Amount, CreatedAt, Date Constraints: PK on CustomerHistory(ID) FK on OrdersDaily(CustKey) references CustomerHistory(ID)
Expected Output
A consistent schema (e.g., prd_dw), tables dim_customer_scd2 and fct_orders_d, columns in snake_case with units/timezone, and pk_/fk_/idx_/uq_ names.
Naming Conventions — Quick Test
Test your knowledge with 8 questions. Pass with 70% or higher.
8 questions70% to pass
Multi Tenant Data Isolation Complete Skill
Have questions about Naming Conventions?
AI Assistant
Ask questions about this tool
Our Team Careers About Us Privacy Policy Terms of Use
YouTube Pinterest Telegram X GitLab Instagram
© 2026 LuvvHelp. All rights reserved.

Menu

Naming Conventions

Table of Contents

Why this matters

Concept explained simply

Mental model

Core rules and patterns

Worked examples

Example 1: Warehouse tables and columns

Example 2: Data lake/ADLS paths

Example 3: Streaming topics

Example 4: Staging and reporting

Step-by-step: define your standard

Exercises (do this now)

Exercise 1: Warehouse naming refactor

Exercise 2: Lake and streaming layout

Self-check checklist

Common mistakes and self-check

Practical projects

Who this is for

Prerequisites

Learning path

Next steps

Mini challenge

Take the Quick Test

Practice Exercises

Warehouse naming refactor

Instructions

Expected Output

Lake and streaming layout

Naming Conventions — Quick Test

Have questions about Naming Conventions?

AI Assistant