Intermediate-level SQL interview questions for Data Analyst positions.
1. Write a query to find duplicate records in a table.
Answer: Use GROUP BY with HAVING COUNT(*) > 1 to find duplicates. You can also use ROW_NUMBER() window function to identify and remove duplicates while keeping one copy.
-- Find duplicate emails SELECT email, COUNT(*) as count FROM users GROUP BY email HAVING COUNT(*) > 1; -- Find and number duplicates WITH numbered AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) as rn FROM users ) SELECT * FROM numbered WHERE rn > 1;
2. How would you calculate month-over-month growth rate?
Answer: Use LAG() window function to access the previous month's value, then calculate percentage change: (current - previous) / previous * 100.
WITH monthly AS ( SELECT DATE_TRUNC('month', order_date) AS month, SUM(revenue) AS revenue FROM orders GROUP BY DATE_TRUNC('month', order_date) ) SELECT month, revenue, LAG(revenue) OVER (ORDER BY month) AS prev_month, ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) / LAG(revenue) OVER (ORDER BY month) * 100, 2) AS growth_pct FROM monthly;
3. Explain the difference between RANK(), DENSE_RANK(), and ROW_NUMBER().
Answer: ROW_NUMBER() assigns unique sequential numbers (1,2,3,4). RANK() gives same rank to ties but skips numbers (1,2,2,4). DENSE_RANK() gives same rank to ties without skipping (1,2,2,3). Choose based on how you want to handle ties.
SELECT name, score, ROW_NUMBER() OVER (ORDER BY score DESC) as row_num, RANK() OVER (ORDER BY score DESC) as rank, DENSE_RANK() OVER (ORDER BY score DESC) as dense_rank FROM students; -- If two students have same score: -- row_num: 1,2,3 rank: 1,1,3 dense_rank: 1,1,2