Duplicate Rows After JOINs: Real Debugging
It's **Day 39**, and we're fixing the "Exploding Data" problem.
The Nightmare
You have 10 orders. You join them with `order_items`. Suddenly, your query returns 50 rows. You panic—did you just quintuple your sales? No.
The Cause: Granularity Mismatch
When you join a "high-level" table (Orders) with a "low-level" table (Order Items), the database repeats the high-level data for every matching row on the low-level.
If Order #1 has 5 coffees, joining them will show "Order #1" five times.
How to Debug
1. **Always count your primary table first**: `SELECT COUNT(*) FROM orders` (Result: 10).
2. **Count the join result**: `SELECT COUNT(*) FROM orders JOIN ...` (Result: 50).
3. **Check your ON clause**: Are you joining on a unique ID, or something broad like `order_date`? Joining on dates will cause thousands of duplicates!
The Fix
If you only want unique orders but need to check item data, use `DISTINCT` or a `WHERE EXISTS` subquery instead of a JOIN.
Your Task for Today
Run a join that you know has duplicates. Use `COUNT(DISTINCT id)` to see the true number of entities vs the total number of rows.
*Day 40: How Bad JOINs Destroy Performance.*