Why this matters
Frequency tables are the fastest way to see how often values appear. As a Data Analyst, you will use them to:
- Summarize categorical survey responses (support channel, plan type, device).
- Profile datasets (top errors, most frequent product categories).
- Create grouped summaries for numeric variables (delivery times, order values) to prepare histograms.
- Spot outliers and rare categories before modeling or visualization.
Concept explained simply
A frequency table tallies how many times each value (or bin) occurs. You can add relative frequency (share of total) and cumulative frequency (running total).
Mental model: Imagine pouring data into labeled buckets. Each record drops into exactly one bucket. The count in each bucket is the frequency. Divide each bucket count by total rows to get relative frequency. Add them up as you go to get cumulative frequency.
How binning works for numbers
For numeric data, group values into intervals (bins). Use non-overlapping bins with clear boundaries. A common convention is lower bound inclusive, upper bound exclusive, for example:
- [0, 5), [5, 10), [10, 15), [15, 20)
This prevents double-counting at boundaries.
Choosing good bin sizes
- Keep bins equal width for a first pass (e.g., 5-minute wide).
- Use 5–10 bins for small-to-medium datasets; adjust as needed.
- For skewed data, consider narrower bins around dense regions and wider bins in tails.
- Ensure all data falls into exactly one bin (cover full range).
How to build a frequency table (step-by-step)
- Define categories or numeric bins. Write them down before counting.
- Tally counts. Go value by value; add 1 to the matching category/bin.
- Compute totals. Sum all counts to get N.
- Add relative frequency. For each row: count / N. Optionally show as a percentage.
- (Optional) Add cumulative frequency. For ordered categories or numeric bins, add running totals.
- Sort if helpful. For categorical, consider sorting by count descending.
Quickly doing this in a spreadsheet
- Put raw data in a single column.
- Use a pivot table: Rows = category/bin label, Values = COUNT of the data column.
- Add a total cell N = SUM of counts.
- Create a new column for Relative = count / N.
- If binned numeric: pre-create a helper column that maps value to a bin label (using IF or VLOOKUP-like logic) and pivot on that helper.
Worked examples
Example 1: Categorical (support channel)
Data (20 rows): Email, Chat, Phone, Chat, Email, Email, Social, Phone, Chat, Email, Chat, Email, Phone, Social, Chat, Email, Email, Phone, Chat, Email
- Email: 8
- Chat: 6
- Phone: 4
- Social: 2
- Total N: 20
Relative frequency (%): Email 40%, Chat 30%, Phone 20%, Social 10%.
Example 2: Numeric with bins (delivery time in minutes)
Data (18 rows): 3, 7, 12, 5, 9, 14, 16, 2, 8, 11, 13, 4, 18, 6, 15, 1, 10, 19
- [0, 5): 4
- [5, 10): 5
- [10, 15): 5
- [15, 20): 4
- Total N: 18
Relative (%): 22.2%, 27.8%, 27.8%, 22.2% (rounded). Cumulative counts: 4, 9, 14, 18.
Example 3: Cumulative for ordered categories
Suppose product ratings: 1-star (6), 2-star (9), 3-star (15), 4-star (20), 5-star (10). Total N=60.
- Relative (%): 10%, 15%, 25%, 33.3%, 16.7% (approx)
- Cumulative counts: 6, 15, 30, 50, 60
- Cumulative %: 10%, 25%, 50%, 83.3%, 100%
Interpretation tip
Cumulative % answers: What fraction is less than or equal to this level? For ratings, it shows what share rates the product at most a given star level.
Exercises
Complete these before the quick test. You can check your answers inside each task.
Exercise 1: Build a simple frequency table (categorical)
Dataset (20 rows): Email, Chat, Phone, Chat, Email, Email, Social, Phone, Chat, Email, Chat, Email, Phone, Social, Chat, Email, Email, Phone, Chat, Email
- Task: Create a frequency table with counts, relative %, and a total row. Sort by count descending.
- Tip: Relative % should sum to ~100% (allow 1% rounding error).
Exercise 2: Grouped frequency with cumulative
Dataset (18 rows): 3, 7, 12, 5, 9, 14, 16, 2, 8, 11, 13, 4, 18, 6, 15, 1, 10, 19
- Task: Use bins [0, 5), [5, 10), [10, 15), [15, 20). Produce count, relative %, and cumulative count.
- Tip: Confirm that counts sum to 18 and cumulative ends at 18.
Self-check checklist
- Every value fits exactly one category/bin.
- Counts sum to N.
- Relative frequencies sum to ~100%.
- Boundaries are consistent (e.g., lower-inclusive, upper-exclusive).
- Numbers are rounded consistently (e.g., 1 decimal place for %).
Common mistakes and how to avoid them
- Overlapping bins: Fix by using a clear rule like [lower, upper).
- Missing categories: Include an Other bucket or review unique values before counting.
- Percentages dont sum to 100%: Recompute using count / N and standardize rounding.
- Sorting by label when decision needs top categories: Sort by count descending for prioritization.
- Using too many bins: Start with 520 bins (context dependent) and adjust as needed.
Quick self-audit
- Scan totals: Do counts add to N? Does cumulative end at N?
- Spot-check two random rows of raw data against the table.
- If rounding %, check that total is 99%101% at most.
Who this is for
- Beginner and junior Data Analysts.
- Anyone preparing data summaries or dashboards.
- Students learning descriptive statistics.
Prerequisites
- Basic arithmetic (fractions, percentages).
- Comfort with spreadsheets or a scripting language (optional but helpful).
Learning path
- Start: Frequency tables (this page).
- Next: Histograms and bar charts (visualizing frequencies).
- Then: Measures of central tendency and dispersion.
- Later: Probability distributions and sampling.
Practical projects
- Support inbox analysis: Build a frequency table of incoming ticket categories from a CSV.
- Shipping times: Group delivery durations into bins and add cumulative % to see SLA coverage.
- Product usage: Summarize most-used features and highlight top 5 by share.
Mini challenge
You have 50 orders and the delivery_type values: Standard (28), Express (17), Same-day (5). Make a table with counts, %, and cumulative % sorted by % descending. What share is Standard + Express combined?
Show answer
Standard 56%, Express 34%, Same-day 10%. Cumulative % (sorted by %): 56%, 90%, 100%. Combined Standard + Express = 90%.
Quick Test
Take the quick test below. Everyone can take it for free; logged-in users will have their progress saved.
Next steps
- Reuse your frequency tables to build bar charts or histograms.
- Add Pareto analysis (sort by share and compute cumulative % to find the vital few).
- Practice on a new dataset each day for one week; vary categorical vs numeric.