How NULLs Affect Aggregations

The Discrepancy Report

The finance team noticed a discrepancy. The `SUM(price)` from my report didn't match the spreadsheet they had. After hours of debugging, we found the culprit: 150 orders had a `NULL` price.

The Quest: NULL Behavior in Aggregations

SQL aggregation functions do not include NULLs in their calculations.

`SUM()`: Ignores NULLs (they contribute 0).

`AVG()`: Ignores NULLs (they don't affect the average).

`COUNT(column)`: Ignores NULLs.

`COUNT(*)`: Counts all rows, regardless of NULLs.

The Implementation: Safe Aggregations

Instead of just `SUM(price)`, I always pair it with a count to see if any data is missing.

SELECT

SUM(price) AS total_revenue,

COUNT(price) AS rows_with_price,

COUNT(*) AS total_rows,

(COUNT(*) - COUNT(price)) AS rows_missing_price

FROM orders;

GROUP BY and NULLs

If شما `GROUP BY country` and some rows have a NULL country, they will all be grouped into a single "NULL" bucket.

The "Oops" Moment

I once handed over an average salary report, not realizing that part-time contractors had NULL salary fields. The average looked much higher than reality because the low (or zero) earners were invisible.

**Pro Tip**: When aggregating, always check `COUNT(column)` vs `COUNT(*)`. The difference is your "Ghost Data."

The Victory

By making NULLs visible in my reports, I helped the data engineering team track down bad integrations. The "Missing Data" itself became valuable information.

Your Task for Today

Run a `SUM()` on a column that has some NULLs. Then, compare `COUNT(column)` to `COUNT(*)`.

*Day 30: Phase 2 Project—The Executive Summary Report.*