Selecting Columns and Slicing Data
It's **Day 106**, and we're slicing. In SQL, you just list columns in your `SELECT`. In pandas, there are several ways to select columns and rows, and knowing which one to reach for will save you from bugs.
1. Simple column selection
# Single column (returns a Series)
emails = df['email']
# Multiple columns (returns a DataFrame)
user_info = df[['name', 'email']]
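If you want to see the Series vs. DataFrame difference for yourself, here is a minimal sketch. The sample data is made up for illustration; only the column names come from the snippets above.

```python
import pandas as pd

# Hypothetical sample data; column names mirror the ones used above.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "city": ["Paris", "Lyon", "Paris"],
    "email": ["ana@example.com", "ben@example.com", "chloe@example.com"],
})

emails = df["email"]               # a single label returns a Series
user_info = df[["name", "email"]]  # a list of labels returns a DataFrame
email_frame = df[["email"]]        # a one-item list still returns a DataFrame

print(type(emails).__name__)
print(type(user_info).__name__)
print(email_frame.shape)
```

The double brackets are not special syntax: the outer pair does the indexing and the inner pair is an ordinary Python list of column names.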
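2. The Power Players: .loc and .iloc
`.loc` selects by label (index labels, column names, or a boolean mask), while `.iloc` selects by integer position, counting from 0.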
# Get the 'email' for the first 5 rows using .iloc
emails_top5 = df.iloc[0:5, 2] # 2 is the integer position of the email column (assuming it is the third column)
# Get all rows where city is Paris, only the 'name' column
paris_names = df.loc[df['city'] == 'Paris', 'name']
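Here is a small sketch of the two indexers side by side. The data is invented, and it assumes the default integer index (0, 1, 2, ...). It uses `df.columns.get_loc` to look up the position of `email` rather than hard-coding `2`, and it shows one difference that catches people out: `.iloc` slices exclude the end position, while `.loc` slices include the end label.

```python
import pandas as pd

# Hypothetical sample data with a default RangeIndex (0..5).
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe", "Dev", "Elif", "Finn"],
    "city": ["Paris", "Lyon", "Paris", "Nice", "Paris", "Lille"],
    "email": [
        "ana@example.com", "ben@example.com", "chloe@example.com",
        "dev@example.com", "elif@example.com", "finn@example.com",
    ],
})

# Look up the column position instead of hard-coding it.
email_pos = df.columns.get_loc("email")

# Position-based: rows at positions 0..4 (the end, 5, is excluded).
emails_top5 = df.iloc[0:5, email_pos]

# Label-based: rows labeled 0..4 (the end label, 4, is included).
emails_top5_loc = df.loc[0:4, "email"]

# Boolean mask for the rows, label for the column.
paris_names = df.loc[df["city"] == "Paris", "name"]

print(emails_top5.equals(emails_top5_loc))
print(paris_names.tolist())
```

Both slices return the same five rows here only because the index labels happen to match the positions. After a filter or a sort, labels and positions can diverge, and the two indexers will return different rows.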
Why use .loc?
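Explicit is better than implicit. `.loc` picks rows and columns in a single indexing call, which makes your code easier to read. It also avoids chained indexing such as `df[mask]['col'] = value`, the pattern that triggers `SettingWithCopyWarning` (a warning, not an error) and can silently leave your DataFrame unchanged when you try to change values.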
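A minimal sketch of changing values with `.loc`, using made-up data and a hypothetical replacement value. The chained form appears only in a comment, because whether it warns, raises, or quietly does nothing depends on your pandas version.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "city": ["Paris", "Lyon", "Paris"],
})

# Chained indexing (avoid):
#     df[df["city"] == "Paris"]["city"] = "Paris, FR"
# The first [] may return a temporary copy, so the assignment can be lost.

# A single .loc call selects rows and column together,
# so the assignment is applied to df itself.
df.loc[df["city"] == "Paris", "city"] = "Paris, FR"

print(df["city"].tolist())
```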
Your Task for Today
Select the first 10 rows and the last 2 columns of your DataFrame using `.iloc`.
*Day 107: Handling Missing Data (NULLs in Python).*