Project: The Data Discovery Dashboard

Congratulations! You've completed your first 10 days of Python for Data Science.

Today, we're building a **Discovery Script**. This is the first thing a professional Data Scientist does when they get a new file.

The Challenge

Analyze a raw CSV of "App Store Reviews."

1. Load the data using Pandas.

2. Check for missing values in the "rating" and "review" columns (Pandas column names are case-sensitive, so we use the lowercase names throughout).

3. Fill missing ratings with the average.

4. Filter for reviews with more than 50 characters.

5. Calculate the average rating per App Category.
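Before touching the real file, you can try steps 2 and 3 in isolation. Here is a minimal sketch on a hand-made table. The `sample` DataFrame and its values are made up for illustration, and the lowercase column names are an assumption about how the CSV is laid out.

```python
import pandas as pd

# Hypothetical sample rows; the real apps.csv will have its own values.
sample = pd.DataFrame({
    "rating": [5.0, None, 3.0, 4.0],
    "review": ["Great app", "Crashes on launch", None, "Does the job"],
})

# Step 2: count missing values in each column we care about.
missing_counts = sample[["rating", "review"]].isna().sum()
print(missing_counts)

# Step 3: replace missing ratings with the mean of the known ratings.
# mean() skips missing values by default, so this is (5 + 3 + 4) / 3 = 4.0.
mean_rating = sample["rating"].mean()
sample["rating"] = sample["rating"].fillna(mean_rating)
print(sample["rating"].tolist())  # [5.0, 4.0, 3.0, 4.0]
```

Notice that the missing review is left alone: an average makes sense for a number, but not for text.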

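```python
import pandas as pd

# 1. Load the raw reviews file
df = pd.read_csv('apps.csv')

# 2. Check: count missing values in the two columns we care about
print(df[['rating', 'review']].isna().sum())

# 3. Clean: fill missing ratings with the column average
df['rating'] = df['rating'].fillna(df['rating'].mean())

# 4. Filter: keep reviews with more than 50 characters
# We'll learn string operations soon; for now this assumes the file
# already has a numeric 'review_length' column
long_reviews = df[df['review_length'] > 50]
print(f"{len(long_reviews)} reviews have more than 50 characters")

# 5. The insight: average rating per category, highest first
report = df.groupby('category')['rating'].mean().sort_values(ascending=False)
print("--- Top Rated Categories ---")
print(report.head(5))
```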
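The filter in step 4 relies on a `review_length` column already being in the file. If your CSV only has the review text, one possible way to derive the column is sketched below. It uses Pandas' `.str.len()` string method, which we'll cover properly later. The `sample` rows are made up for illustration, and treating a missing review as length 0 is a choice made for this sketch, not a rule.

```python
import pandas as pd

# Hypothetical sample rows standing in for apps.csv.
sample = pd.DataFrame({
    "category": ["Games", "Games", "Tools", "Tools"],
    "rating": [4.0, 2.0, 5.0, 3.0],
    "review": [
        "Fun.",
        "The latest update deleted my saved progress and support never replied.",
        "Simple, fast, and it has not crashed once in six months of daily use.",
        None,
    ],
})

# Character count of each review; a missing review counts as length 0 here.
sample["review_length"] = sample["review"].fillna("").str.len()

# Step 4: the same filter as in the script.
long_reviews = sample[sample["review_length"] > 50]
print(len(long_reviews))  # 2

# Step 5: average rating per category, highest first.
report = sample.groupby("category")["rating"].mean().sort_values(ascending=False)
print(report)
```

On this toy table, "Tools" averages 4.0 and "Games" averages 3.0, so "Tools" comes first in the report.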

You've moved from "Pulling data" to "Processing data." You've handled dirty inputs, applied business logic, and generated insights using code.

**Phase 2 (Days 11–25): Data Wrangling**. We're going even deeper into Pandas: merging tables, splitting strings, and handling time-series data.

See you in Phase 2!