Join Performance Tuning
The Slow Query
My multi-table join used to take 1 second. After a few months of data growth, it took 45 seconds. The query logic hadn't changed, but the performance had tanked.
The Quest: Understanding the Optimizer
When SQL encounters a JOIN, the query optimizer decides **how** to execute it:
The optimizer makes these decisions based on table statistics (row counts, unique values).
The Implementation: The Index Solution
The most common fix is to add indexes on the columns used in `ON` clauses.
-- Ensure the join keys are indexed
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
CREATE INDEX idx_customers_id ON customers (id);
Then, use `EXPLAIN ANALYZE` to see the plan:
EXPLAIN ANALYZE
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id;
Look for "Index Scan" (good) vs "Seq Scan" (slow).
The "Oops" Moment
I once added an index but saw no improvement. The reason: I was joining on a function like `LOWER(email)`, but the index was on `email`. The index couldn't be used!
**Pro Tip**: If you transform a column in your `ON` clause (e.g., `LOWER`, `DATE_TRUNC`), you need a **Functional Index** to match.
The Victory
After adding the right indexes, my 45-second query dropped to 0.5 seconds. Performance tuning is the difference between a toy project and a production system.
Your Task for Today
Run `EXPLAIN ANALYZE` on your slowest join query and look for "Seq Scan" operations. Consider adding indexes.
*Day 43: Legacy Joins and Compatibility.*