Data Science

Handling Missing Data: NULLs in Python

SQL Mastery Team
May 8, 2026

Welcome to **Day 107**. In SQL, we have `NULL`. In Pandas, we have `NaN` (Not a Number) or `None`.
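Just as `NULL = NULL` is never true in SQL, `NaN` is not equal to itself in Python. The short sketch below uses a made-up Series to show that pandas stores `None` in a numeric column as `NaN`, and that you need `isna()` (or its alias `isnull()`) rather than `==` to find it.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data with one missing entry
s = pd.Series([1.0, None, 3.0])

# None in a numeric Series is stored as NaN, so the dtype stays float64
print(s.dtype)

# NaN is not equal to itself, so an equality check finds nothing
print((s == np.nan).any())

# isna() is the reliable way to flag missing entries
print(s.isna().tolist())
```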

Detecting the Gaps

How many values are missing in each column?

print(df.isnull().sum())
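`isnull().sum()` reports missing values per column. If you want the number of affected rows or the share of missing values instead, something like the following should work. The DataFrame and its column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, "Dee"],
    "age": [34.0, np.nan, 29.0, np.nan],
    "city": ["Lima", "Oslo", "Pune", None],
})

# Missing values per column
missing_per_column = df.isnull().sum()

# Number of rows with at least one missing value
rows_with_missing = df.isnull().any(axis=1).sum()

# Fraction of missing values per column (between 0 and 1)
missing_share = df.isnull().mean()

print(missing_per_column)
print(rows_with_missing)
print(missing_share)
```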

Strategy 1: The "Lazy" Way (Drop them)

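If you have millions of rows and only 100 of them have missing values, dropping those rows costs you very little data. Keep in mind that `dropna()` removes every row with a missing value in any column.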

df_clean = df.dropna()
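By default `dropna()` can remove more than you expect. As a rough sketch on made-up data, the `subset` and `how` parameters let you narrow down which rows get dropped.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only
df = pd.DataFrame({
    "age": [34.0, np.nan, 29.0, np.nan],
    "city": ["Lima", "Oslo", None, None],
})

# Default: drop rows with a missing value in ANY column
df_any = df.dropna()

# Drop rows only when 'age' is missing
df_age = df.dropna(subset=["age"])

# Drop rows only when EVERY column is missing
df_all = df.dropna(how="all")

print(len(df), len(df_any), len(df_age), len(df_all))
```

Comparing the row counts before and after is a quick sanity check that you are not throwing away more data than intended.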

Strategy 2: The "Safe" Way (Fill them)

Similar to `COALESCE` in SQL, we can replace missing values with a default, such as the column average for numbers or 'Unknown' for text.

# Replace missing ages with the average age
df['age'] = df['age'].fillna(df['age'].mean())
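Different columns usually need different defaults. One way to handle that, sketched here on invented data, is to pass `fillna()` a dictionary that maps each column name to its fill value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only
df = pd.DataFrame({
    "age": [30.0, np.nan, 40.0, np.nan],
    "city": ["Lima", None, "Pune", "Oslo"],
})

# One default per column: the mean for numbers, a label for text
defaults = {
    "age": df["age"].mean(),
    "city": "Unknown",
}

# fillna returns a new DataFrame; the original df is left unchanged
df_filled = df.fillna(value=defaults)

print(df_filled)
```

The mean is only one choice. For skewed columns the median (`df["age"].median()`) is often a safer default, because a few extreme values pull it around much less.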

Why this is critical for Machine Learning

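Many ML algorithms will raise an error if there is even a single `NaN` in your data, although some (certain gradient-boosted tree implementations, for example) can handle missing values natively. Learning how to "Impute" (fill) data is a core Data Science skill.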
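When imputing for a model, a common practice is to compute the fill value from the training data only and reuse it on the test data, so that no information from the test set leaks into training. Here is a minimal pandas-only sketch. The `train` and `test` frames are hypothetical stand-ins for your own split.

```python
import numpy as np
import pandas as pd

# Hypothetical train/test split; the values are made up for illustration
train = pd.DataFrame({"age": [20.0, np.nan, 40.0]})
test = pd.DataFrame({"age": [np.nan, 50.0]})

# Learn the fill value from the training data only
age_fill = train["age"].mean()

# Apply the same value to both sets
train_imputed = train.fillna({"age": age_fill})
test_imputed = test.fillna({"age": age_fill})

print(train_imputed)
print(test_imputed)
```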

Your Task for Today

Count the NULLs in a dataset and fill them with a sensible default value for that column.

*Day 108: Sorting and Ranking in Pandas.*
