String Manipulation in Pandas

Welcome to **Day 117**. Today we clean text. In SQL, we had `TRIM`, `LOWER`, and `REPLACE`. In Pandas, we have the `.str` accessor.

Common Operations

# Lowercase everything

df['name'] = df['name'].str.lower()

# Remove leading/trailing spaces

df['name'] = df['name'].str.strip()

# Find and replace (matches substrings; regex=False treats 'Tech' as literal text)

df['category'] = df['category'].str.replace('Tech', 'Technology', regex=False)
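To see these operations working together, here is a minimal, self-contained sketch. The sample data is invented for illustration. Because every `.str` method returns a new Series, the calls can be chained. `regex=False` is passed explicitly so the pattern is treated as literal text whatever the pandas version.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
df = pd.DataFrame({
    "name": ["  Alice ", "BOB", " Carol"],
    "category": ["Tech", "Food", "Tech"],
})

# Each .str method returns a new Series, so the calls can be chained
df["name"] = df["name"].str.strip().str.lower()

# regex=False treats 'Tech' as a literal substring, not a pattern
df["category"] = df["category"].str.replace("Tech", "Technology", regex=False)

print(df)
```

Keep in mind that `.str.replace` works on substrings: a value that already reads `Technology` also contains `Tech` and would be changed again, so it is safest to run this kind of replacement once on raw data.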

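Want to find all rows where the description contains the text "Sale"? Use `.str.contains`. It matches substrings rather than whole words, and passing `na=False` makes missing descriptions count as non-matches instead of breaking the boolean filter:

on_sale = df[df['description'].str.contains('Sale', case=False, na=False)]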
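Here is a small sketch of how that filter behaves on messy data. The descriptions are made up, and one of them is missing on purpose. The example assumes we want missing values treated as "no match", which is what `na=False` does.

```python
import pandas as pd

# Hypothetical descriptions, including a missing value
df = pd.DataFrame({
    "description": ["Summer SALE now on", "Regular price", None, "Wholesale lot"],
})

# case=False ignores capitalization; na=False maps missing values to False
mask = df["description"].str.contains("sale", case=False, na=False)

# The mask is a plain boolean Series, so it can be used to filter rows
on_sale = df[mask]
print(on_sale)
```

Notice that "Wholesale lot" is kept as well, because the match is on a substring, not on a whole word.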

Just like we learned on Day 83:

# Split 'full_name' into first and last (n=1 splits only at the first space)

df[['first', 'last']] = df['full_name'].str.split(' ', n=1, expand=True)
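Splitting into exactly two columns only works if every split produces exactly two pieces. The sketch below uses invented names and limits the split with `n=1`, so that a name with more than one space still yields two parts. Putting everything after the first space into `last` is an assumption made for this example, not the only reasonable rule.

```python
import pandas as pd

# Hypothetical names; the second one contains two spaces
df = pd.DataFrame({"full_name": ["Ada Lovelace", "Mary Ann Evans"]})

# n=1 performs at most one split, so the result always has two columns
parts = df["full_name"].str.split(" ", n=1, expand=True)

# Assign the two resulting columns by position
df[["first", "last"]] = parts
print(df)
```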

Clean a column by lowercasing it, stripping whitespace, and checking whether it starts with the letter 'a' (lowercase, since the text has already been lowercased).
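If you want to check your work, here is one possible approach, shown as a sketch rather than the only answer. The column name `name` and the sample values are assumptions made for this example.

```python
import pandas as pd

# Hypothetical column with stray whitespace, mixed case, and a missing value
df = pd.DataFrame({"name": ["  Alice", "bob ", " ANNA ", None]})

# Strip whitespace and lowercase in one chain
cleaned = df["name"].str.strip().str.lower()

# The text is lowercase now, so compare against 'a'; na=False handles missing values
starts_with_a = cleaned.str.startswith("a", na=False)
print(starts_with_a)
```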

*Day 118: Hierarchical Indexing (MultiIndex).*