K-Means Clustering
It's **Day 148**, and today the computer finds patterns without being handed the answers. This is **Unsupervised Learning**: there are no labels, only data.
What is K-Means?
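You tell the computer how many groups ("K") to find. It assigns each data point to the nearest group center, moves each center to the average of its points, and repeats until the groups stop changing. Closeness stands in for similarity: points that sit near each other end up in the same group.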
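To see what "grouping by closeness" looks like in practice, here is a minimal sketch of the usual K-Means loop in plain NumPy. It is a toy for building intuition, not how scikit-learn implements it; the function name `kmeans_sketch` and the six sample points are made up for illustration.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=10, seed=0):
    """Toy K-Means: alternate between assigning points and moving centers."""
    rng = np.random.default_rng(seed)
    # Start from k distinct points picked at random as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Distance from every point to every center, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        # Each point joins the group of its nearest center.
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points (keep it if it has none).
        centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels, centers

# Two obvious blobs of made-up 2-D points.
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
labels, centers = kmeans_sketch(points, k=2)
print(labels)
print(centers)
```

The cluster numbers themselves are arbitrary: which blob gets called 0 and which gets called 1 depends on the random start.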
The Code
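import seaborn as sns
from sklearn.cluster import KMeans
# Group customers based on Spend and Age (df is an existing pandas DataFrame)
X = df[['spend', 'age']]
# We want to find 3 groups (e.g. Budget, Luxury, Average)
# n_init and random_state make the result repeatable from run to run
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
# fit_predict learns the clusters and returns one label (0, 1, or 2) per row
df['cluster_id'] = kmeans.fit_predict(X)
# Visualize it!
sns.scatterplot(data=df, x='spend', y='age', hue='cluster_id')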
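One thing to watch: K-Means measures distance, so a column with big numbers (spend) can drown out a column with small numbers (age). A common fix is to rescale the features first. Here is a self-contained sketch, assuming made-up customer data and scikit-learn's `StandardScaler`; the numbers are invented purely so the example runs.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Made-up customers: spend is in the hundreds or thousands, age is in the tens.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "spend": np.concatenate([rng.normal(200, 30, 50),
                             rng.normal(1500, 200, 50),
                             rng.normal(700, 80, 50)]),
    "age": np.concatenate([rng.normal(22, 2, 50),
                           rng.normal(45, 4, 50),
                           rng.normal(33, 3, 50)]),
})

# Rescale each column to mean 0 and standard deviation 1,
# so both features carry similar weight in the distance calculation.
X_scaled = StandardScaler().fit_transform(df[["spend", "age"]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
df["cluster_id"] = kmeans.fit_predict(X_scaled)
print(df["cluster_id"].value_counts())
```

Whether scaling helps depends on the data, so it is worth running the clustering both ways and comparing the plots.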
Why Marketers love this
You can discover "Segments" you didn't know existed. "Oh look, we have a group of young users who spend a lot on Tuesdays!" Now you can send them a targeted coupon.
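A cluster number alone tells you nothing, so the next step is usually to describe each group. A quick way is to average the features per cluster. This sketch assumes a small, hypothetical DataFrame that already has a `cluster_id` column; the values are invented.

```python
import pandas as pd

# Hypothetical customers that already carry a cluster_id from K-Means.
df = pd.DataFrame({
    "spend": [120, 150, 900, 950, 400, 420],
    "age": [22, 24, 48, 51, 33, 35],
    "cluster_id": [0, 0, 1, 1, 2, 2],
})

# One row per cluster: how many customers, and what they look like on average.
profile = df.groupby("cluster_id").agg(
    customers=("spend", "size"),
    avg_spend=("spend", "mean"),
    avg_age=("age", "mean"),
)
print(profile)
```

Reading the table is where the "segment" gets its name: a row with low average age and low spend might be your Budget group, for example.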
Your Task for Today
Run K-Means on a dataset and see if the resulting clusters make sense logically.
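If you want a number to back up your gut feeling, one common check is the silhouette score: it ranges from -1 to 1, and higher values suggest tighter, better-separated clusters. Below is a sketch that tries several values of K on synthetic data from scikit-learn's `make_blobs`; the dataset and the range of K are assumptions for the demo, so swap in your own features.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 2-D data generated around 3 centers (stand-in for a real dataset).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    # Silhouette compares how close each point is to its own cluster
    # versus the nearest other cluster.
    scores[k] = silhouette_score(X, labels)
    print(f"K={k}: silhouette={scores[k]:.3f}")

best_k = max(scores, key=scores.get)
print("Highest silhouette at K =", best_k)
```

Treat the score as a hint, not a verdict. A K that scores slightly lower but produces segments you can actually explain may be the better choice.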
*Day 149: Feature Engineering: The Secret Sauce.*