Duplicate Rows After JOINs: Real Debugging

It's **Day 39**, and we're fixing the "Exploding Data" problem.

The Nightmare

You have 10 orders. You join them with `order_items`. Suddenly, your query returns 50 rows. You panic—did you just quintuple your sales? No.

The Cause: Granularity Mismatch

When you join a "high-level" table (Orders) with a "low-level" table (Order Items), the database repeats the high-level data for every matching row on the low-level.

If Order #1 has 5 coffees, joining them will show "Order #1" five times.

How to Debug

1. **Always count your primary table first**: `SELECT COUNT(*) FROM orders` (Result: 10).

2. **Count the join result**: `SELECT COUNT(*) FROM orders JOIN ...` (Result: 50).

3. **Check your ON clause**: Are you joining on a unique ID, or something broad like `order_date`? Joining on dates will cause thousands of duplicates!

The Fix

If you only want unique orders but need to check item data, use `DISTINCT` or a `WHERE EXISTS` subquery instead of a JOIN.

Your Task for Today

Run a join that you know has duplicates. Use `COUNT(DISTINCT id)` to see the true number of entities vs the total number of rows.

*Day 40: How Bad JOINs Destroy Performance.*

The Nightmare

The Cause: Granularity Mismatch

How to Debug

The Fix

Your Task for Today

Ready to put your knowledge into practice?