Subqueries
Data Validation with Subqueries
Senior Data Analyst
February 28, 2026
5 min read
The Data Quality Audit
Before migrating, I needed to find all the "Bad" data. Subqueries are perfect for this.
Find Orphan Keys
SELECT * FROM orders o
WHERE NOT EXISTS (SELECT 1 FROM customers c WHERE c.id = o.customer_id);
Find Duplicates
SELECT email, COUNT(*) FROM users
GROUP BY email
HAVING COUNT(*) > 1;
Find Referential Violations
Any order pointing to a non-existent product?
SELECT * FROM order_items oi
WHERE product_id NOT IN (SELECT id FROM products);
*Day 60: Phase 3/4 Project—The Master Report.*