SQL Basics

Text Cleanup: Fundamental String Functions

Senior Data Analyst
January 12, 2026
6 min read

The Messy Database

I was merging two customer lists when I realized the problem. One list had "LONDON," another had "london," and a third had " London " (with a space). To a computer, those are three different cities. To a human, they are the same place.

My reports were a mess. I needed to standardize the data.

The Quest: The Polishers

In the real world, data is rarely "Clean." People make typos, caps-lock gets left on, and legacy systems add extra spaces. As an analyst, you must "Polish" the text before you present it.

Three essential tools are:

1. **UPPER / LOWER**: Standardize case.

2. **TRIM**: Remove accidental spaces at the start or end.

3. **CONCAT**: Glue two pieces of text together.

The Implementation: The Standardizer

I wrote a query to "Fix" the city names on the fly.

-- Cleaning up messy customer data

SELECT

TRIM(UPPER(city)) AS clean_city,

CONCAT(first_name, ' ', last_name) AS full_name

FROM

customers;

Why This is Critical

If you want to count customers in London, and half are lowercase, your count will be wrong. By using `UPPER()`, you ensure that everything matches perfectly.

The "Oops" Moment

I once forgot to include a space in my `CONCAT` function: `CONCAT(first, last)`. The result was `JohnDoe`. Not exactly a professional directory.

**Pro Tip**: Always remember the space! `CONCAT(first, ' ', last)`.

The Victory

The "Dirty" list of 100 variations became a clean list of 5 standardized cities. My aggregation was accurate for the first time. I learned that data analysis is 80% cleaning and 20% calculating.

Your Task for Today

Choose a text column that looks inconsistent. Use `UPPER(TRIM(column_name))` and see how much more consistent the results become.

*Day 13: The Mystery of NULL—Unknown vs. Empty.*

Ready to put your knowledge into practice?

Join SQL Mastery and learn through interactive exercises.