EXISTS vs IN in Real Systems

Welcome to **Day 48**. We touched on this in Phase 1, but now that we know subqueries, we can understand the real performance difference between `IN` and `EXISTS`.

The Theory

**IN**: "Is this ID in this list of IDs?"

**EXISTS**: "Does at least one row match this condition?"

The Performance Secret

In modern databases, `EXISTS` is often faster when the list of values is very large. Why?

Because `EXISTS` stops searching the moment it finds **one** match (Short-circuiting). `IN` sometimes tries to build the entire list of IDs in memory before it starts filtering.

Real-World Example: Customer Audit

"Find all customers who have ever placed an order."

-- Using IN

SELECT * FROM customers

WHERE id IN (SELECT customer_id FROM orders);

-- Using EXISTS (Usually faster for millions of orders)

SELECT * FROM customers c

WHERE EXISTS (

SELECT 1 FROM orders o WHERE o.customer_id = c.id

);

Pro Tip: NOT EXISTS

`NOT EXISTS` is the gold standard for finding "missing" records. It's much safer than `NOT IN`, which can return 0 rows if there is a single `NULL` in the subquery result!

Your Task for Today

Rewrite an `IN` subquery as an `EXISTS` subquery and see if you notice any speed difference.

*Day 49: Why Subqueries Can Be Slow.*

The Theory

The Performance Secret

Real-World Example: Customer Audit

Pro Tip: NOT EXISTS

Your Task for Today

Ready to put your knowledge into practice?