Subqueries

Using CTEs for Data Cleaning Pipelines

Senior Data Analyst
February 25, 2026
6 min read

The Messy Data Problem

Raw data from a vendor had issues: typos in names, missing dates, duplicate rows. I needed to clean it in stages.

The Pipeline CTE

WITH

-- Stage 1: Standardize

cleaned AS (

SELECT UPPER(TRIM(name)) AS name, price, order_date

FROM raw_orders

),

-- Stage 2: Deduplicate

deduped AS (

SELECT DISTINCT * FROM cleaned

),

-- Stage 3: Fill defaults

final AS (

SELECT name, COALESCE(price, 0) AS price, order_date

FROM deduped

WHERE order_date IS NOT NULL

)

SELECT * FROM final;

Pro Tip

This is the foundation of ETL (Extract, Transform, Load). CTEs let you build transparent data pipelines.

*Day 57: Subquery Performance Traps.*

Ready to put your knowledge into practice?

Join SQL Mastery and learn through interactive exercises.