Selecting Columns and Slicing Data

It's **Day 106**, and we're slicing. In SQL, you just list columns in your `SELECT`. In pandas, there are several ways to select columns, and knowing which one to reach for will save you from bugs.

1. Simple column selection

# Single column (returns a Series)

emails = df['email']

# Multiple columns (returns a DataFrame)

user_info = df[['name', 'email']]
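To see the Series vs. DataFrame difference for yourself, here is a minimal sketch. The sample data is made up for illustration; only the column names come from this lesson.

```python
import pandas as pd

# Hypothetical sample data; column names mirror the ones used in this lesson.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "email": ["ana@example.com", "ben@example.com", "chloe@example.com"],
    "city": ["Paris", "Lyon", "Paris"],
})

emails = df["email"]               # single brackets with one name: Series
user_info = df[["name", "email"]]  # list of names: DataFrame
email_frame = df[["email"]]        # one-item list: still a DataFrame

print(type(emails).__name__)       # Series
print(type(user_info).__name__)    # DataFrame
print(email_frame.shape)           # (3, 1)
```

The double brackets are just a Python list inside the indexing brackets, which is why a one-item list still gives you a DataFrame.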

2. The Power Players: .loc and .iloc

**.loc[rows, cols]**: Label-based. You use the names of the rows or columns.

**.iloc[rows, cols]**: Integer-based. You use the numerical position (0, 1, 2...).
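One detail that trips people up: a `.loc` slice includes its end label, while an `.iloc` slice excludes its end position, like a normal Python slice. The sketch below uses a made-up DataFrame whose row labels (10, 20, 30, 40) deliberately differ from the positions (0, 1, 2, 3) so the two are easy to tell apart.

```python
import pandas as pd

# Hypothetical data with row labels that do not match row positions.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Chloe", "Dev"],
        "city": ["Paris", "Lyon", "Paris", "Nice"],
    },
    index=[10, 20, 30, 40],
)

# Same cell, addressed two ways.
city_by_label = df.loc[20, "city"]    # row labelled 20, column named 'city'
city_by_position = df.iloc[1, 1]      # second row, second column

# Slices behave differently at the end point.
names_by_label = df.loc[10:20, "name"]   # labels 10 and 20: end label included
names_by_position = df.iloc[0:2, 0]      # positions 0 and 1: end position excluded

print(city_by_label, city_by_position)   # Lyon Lyon
print(names_by_label.tolist())           # ['Ana', 'Ben']
print(names_by_position.tolist())        # ['Ana', 'Ben']
```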

# Get the 'email' for the first 5 rows using .iloc

emails_top5 = df.iloc[0:5, 2] # 2 is the position of the email column (assumes email is the third column)

# Get all rows where city is Paris, only the 'name' column

paris_names = df.loc[df['city'] == 'Paris', 'name']
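Hard-coding a column position like `2` breaks as soon as the column order changes. One possible way around that is to look the position up with `df.columns.get_loc`, which returns an integer when column names are unique. This is a sketch on made-up data, with `email` placed third to match the example above.

```python
import pandas as pd

# Hypothetical sample data; 'email' is the third column (position 2).
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe", "Dev", "Eli", "Fay"],
    "city": ["Paris", "Lyon", "Paris", "Nice", "Paris", "Lyon"],
    "email": [
        "ana@example.com", "ben@example.com", "chloe@example.com",
        "dev@example.com", "eli@example.com", "fay@example.com",
    ],
})

# Look up the position instead of hard-coding it.
email_pos = df.columns.get_loc("email")
emails_top5 = df.iloc[0:5, email_pos]

# Boolean mask for the rows, label for the column.
paris_names = df.loc[df["city"] == "Paris", "name"]

print(email_pos)              # 2
print(len(emails_top5))       # 5
print(paris_names.tolist())   # ['Ana', 'Chloe', 'Eli']
```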

Why use .loc?

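Explicit is better than implicit. Using `.loc` makes your code much easier to read, and because it picks rows and columns in a single call, it avoids the chained indexing (for example `df[mask]['col'] = value`) that triggers `SettingWithCopyWarning` when you try to change values. Note that this is a warning rather than an error: your code keeps running, but the assignment may silently not reach the original DataFrame.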
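Here is a small sketch of changing values with a single `.loc` call. The DataFrame and its `active` column are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; the 'active' flag is made up for this example.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "city": ["Paris", "Lyon", "Paris"],
    "active": [True, True, True],
})

# Chained indexing such as
#     df[df["city"] == "Paris"]["active"] = False
# assigns into a temporary object and may leave df unchanged.

# Row filter, column, and assignment in one .loc call:
df.loc[df["city"] == "Paris", "active"] = False

print(df["active"].tolist())  # [False, True, False]
```

Because the filter and the column name sit in the same brackets, pandas knows you mean the original DataFrame and not a temporary copy.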

Your Task for Today

Select the first 10 rows and the last 2 columns of your DataFrame using `.iloc`.

*Day 107: Handling Missing Data (NULLs in Python).*
