Introduction
So far in this course we have studied supervised learning.
Supervised learning means:
We know the answer (the target) for every training example.
Examples:
| Inputs | Target |
|---|---|
| Age, BMI | Length of Stay |
| Inventory, Price | Sales |
| Customer Features | Churn |
The model learns from known outcomes.
But what if no outcome exists?
Suppose we have customer data:
| Customer | Recency | Frequency | Monetary |
|---|---|---|---|
| A | 5 | 120 | 50000 |
| B | 300 | 10 | 3000 |
| C | 20 | 200 | 100000 |
Question:
Are there natural customer groups?
We do not know the answer beforehand.
This is called:
Unsupervised Learning
One of the most important unsupervised methods is:
Clustering
What is Clustering?
Clustering attempts to find groups of similar observations.
Example:
Customer A, Customer B, Customer C, Customer D
The algorithm may discover:
Cluster 1: High-Value Customers
Cluster 2: Occasional Customers
Cluster 3: Dormant Customers
without being told these groups exist.
Real-World Applications
Healthcare
Identify:
- High-risk patients
- Patient subtypes
- Disease phenotypes
Supply Chain
Identify:
- Fast-moving SKUs
- Slow-moving SKUs
- High-value customers
- Similar retailers
Marketing
Identify:
- Customer segments
- Purchasing behaviors
- Loyalty groups
The Most Popular Clustering Algorithm
The most common clustering method is:
K-Means Clustering
The “K” represents:
Number of clusters
Example:
K = 3
means:
Find 3 groups
The Basic Idea
Suppose we have customers.
The algorithm:
Step 1
Creates K initial cluster centers (for example, by picking starting points at random).
Cluster A, Cluster B, Cluster C
Step 2
Assigns each customer to the nearest cluster.
Step 3
Recalculates each cluster center as the mean of the customers assigned to it.
Step 4
Repeats Steps 2 and 3 until the assignments stop changing.
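The four steps above can be sketched in plain NumPy. This is a minimal illustration, not scikit-learn's implementation: the function name `kmeans_sketch`, the random choice of starting centers, and the toy points are all assumptions made for this example.

```python
import numpy as np

def kmeans_sketch(X, k, n_iter=100, seed=0):
    """Plain NumPy sketch of the four steps (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct observations as the initial cluster centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: assign each observation to its nearest center.
        distances = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: recalculate each center as the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Step 4: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Made-up 2-D points forming two loose groups.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
labels, centers = kmeans_sketch(X, k=2)
print(labels)
print(centers)
```

In practice the library version is preferable, since it adds smarter initialization and repeated restarts.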
Example Dataset
Suppose we have RFM metrics.
import pandas as pd

customers = pd.DataFrame({
    "Recency": [5, 10, 20, 200, 250, 300],
    "Frequency": [120, 100, 90, 20, 15, 10],
    "Monetary": [50000, 45000, 40000, 5000, 3000, 2000],
})
Why Scaling Matters
Variables are often measured on different scales.
Example:
Recency: 0 - 365
Frequency: 0 - 1000
Monetary: 0 - 1000000
Without scaling:
Monetary dominates the clustering
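To see how strongly Monetary can dominate, the sketch below takes customers A and B from the introductory table and measures how much each variable contributes to the squared Euclidean distance between them. Euclidean distance is what K-Means uses; the variable names in the code are illustrative.

```python
import numpy as np

# Customers A and B from the introductory table: Recency, Frequency, Monetary.
a = np.array([5, 120, 50000], dtype=float)
b = np.array([300, 10, 3000], dtype=float)

# Contribution of each variable to the squared Euclidean distance.
squared_diffs = (a - b) ** 2
share = squared_diffs / squared_diffs.sum()

for name, s in zip(["Recency", "Frequency", "Monetary"], share):
    print(f"{name}: {s:.4%} of the squared distance")
```

Almost the entire distance comes from Monetary, so on raw values the other two variables barely influence which cluster a customer lands in.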
Standardizing Variables
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(customers)
Now every variable has mean 0 and standard deviation 1, so no single variable dominates the distance calculation.
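A quick check can confirm what standardization did. The sketch below rebuilds the example dataset, compares StandardScaler with a manual z-score, and prints the column means and standard deviations. The `manual` variable is an illustrative name; note that StandardScaler divides by the population standard deviation (`ddof=0`).

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

customers = pd.DataFrame({
    "Recency": [5, 10, 20, 200, 250, 300],
    "Frequency": [120, 100, 90, 20, 15, 10],
    "Monetary": [50000, 45000, 40000, 5000, 3000, 2000],
})

X_scaled = StandardScaler().fit_transform(customers)

# Manual z-score: subtract the column mean, divide by the population std.
manual = (customers - customers.mean()) / customers.std(ddof=0)

print(np.allclose(X_scaled, manual.to_numpy()))  # do both approaches agree?
print(X_scaled.mean(axis=0).round(6))            # column means after scaling
print(X_scaled.std(axis=0).round(6))             # column std devs after scaling
```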
Fitting K-Means
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3, random_state=42)
kmeans.fit(X_scaled)
Viewing Cluster Assignments
customers["Cluster"] = kmeans.labels_
print(customers)
Example output (the cluster numbers are arbitrary labels, so your numbering may differ):
| Recency | Frequency | Monetary | Cluster |
|---|---|---|---|
| 5 | 120 | 50000 | 0 |
| 10 | 100 | 45000 | 0 |
| 20 | 90 | 40000 | 0 |
| 200 | 20 | 5000 | 1 |
| 250 | 15 | 3000 | 1 |
| 300 | 10 | 2000 | 2 |
Interpreting Clusters
Suppose:
Cluster 0
Low Recency, High Frequency, High Monetary
Interpretation:
Best Customers
Cluster 1
Moderate Activity
Interpretation:
Average Customers
Cluster 2
Old Purchases, Low Spending
Interpretation:
Dormant Customers
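Once the interpretation is settled, the names can be attached to the data. This is a minimal sketch that assumes the cluster numbers from the output table above. The `Segment` column and the `segment_names` dictionary are illustrative names, not part of the lesson's code.

```python
import pandas as pd

# Cluster assignments copied from the example output table.
customers = pd.DataFrame({
    "Recency": [5, 10, 20, 200, 250, 300],
    "Frequency": [120, 100, 90, 20, 15, 10],
    "Monetary": [50000, 45000, 40000, 5000, 3000, 2000],
    "Cluster": [0, 0, 0, 1, 1, 2],
})

# Hypothetical mapping from cluster number to business name.
segment_names = {
    0: "Best Customers",
    1: "Average Customers",
    2: "Dormant Customers",
}
customers["Segment"] = customers["Cluster"].map(segment_names)
print(customers)
```

Because K-Means numbers its clusters arbitrarily, a mapping like this should be rechecked every time the model is refit.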
Visualizing Clusters
import matplotlib.pyplot as plt

plt.scatter(
    customers["Recency"],
    customers["Monetary"],
    c=customers["Cluster"],
)
plt.xlabel("Recency")
plt.ylabel("Monetary")
plt.show()
Different colors represent different clusters.
Determining the Number of Clusters
One of the biggest challenges:
What value of K should we choose?
The Elbow Method
Fit multiple values of K.
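inertia = []
# K cannot exceed the number of observations (6 in the example dataset).
k_values = range(1, min(10, len(X_scaled)) + 1)
for k in k_values:
    model = KMeans(n_clusters=k, random_state=42)
    model.fit(X_scaled)
    inertia.append(model.inertia_)
Plot:
plt.plot(k_values, inertia)
plt.xlabel("K")
plt.ylabel("Inertia")
plt.show()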
Understanding the Elbow Plot
The goal is to find:
A sharp bend
in the curve.
Example:
K = 3
may provide the best balance between:
- Simplicity (few clusters)
- Fit (low inertia)
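It helps to know what the y-axis measures. In scikit-learn, inertia is the sum of squared distances from each observation to the center of its assigned cluster. The sketch below recomputes it by hand on the scaled example dataset and compares it with `inertia_`. The `n_init=10` setting and the helper variable names are choices made for this illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

customers = pd.DataFrame({
    "Recency": [5, 10, 20, 200, 250, 300],
    "Frequency": [120, 100, 90, 20, 15, 10],
    "Monetary": [50000, 45000, 40000, 5000, 3000, 2000],
})
X_scaled = StandardScaler().fit_transform(customers)

results = []
for k in (1, 2, 3):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled)
    # Center of the cluster each observation was assigned to.
    assigned_centers = model.cluster_centers_[model.labels_]
    # Sum of squared distances to those centers.
    manual_inertia = ((X_scaled - assigned_centers) ** 2).sum()
    results.append((k, manual_inertia, model.inertia_))
    print(k, round(manual_inertia, 4), round(model.inertia_, 4))
```

Inertia can only shrink as K grows, which is why the elbow method looks for the point where the improvement levels off rather than for the minimum.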
Cluster Profiles
One of the most useful steps.
customers.groupby("Cluster").mean(numeric_only=True)
Example:
| Cluster | Recency | Frequency | Monetary |
|---|---|---|---|
| 0 | 12 | 103 | 45000 |
| 1 | 225 | 18 | 4000 |
| 2 | 300 | 10 | 2000 |
This table helps explain each cluster.
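A profile is often easier to read when it also shows how many customers fall in each cluster. This is a minimal sketch using pandas named aggregation, with the cluster assignments copied from the example output table. The `Customers` column name is an assumption for illustration.

```python
import pandas as pd

customers = pd.DataFrame({
    "Recency": [5, 10, 20, 200, 250, 300],
    "Frequency": [120, 100, 90, 20, 15, 10],
    "Monetary": [50000, 45000, 40000, 5000, 3000, 2000],
    "Cluster": [0, 0, 0, 1, 1, 2],
})

# One row per cluster: its size plus the mean of each RFM variable.
profile = customers.groupby("Cluster").agg(
    Customers=("Recency", "size"),
    Recency=("Recency", "mean"),
    Frequency=("Frequency", "mean"),
    Monetary=("Monetary", "mean"),
)
print(profile.round(0))
```

A cluster with very few members, such as Cluster 2 here, deserves extra caution before it is treated as a real segment.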
Healthcare Example
Suppose we have:
- Age
- BMI
- Blood Pressure
- Cholesterol
X = patients[["Age", "BMI", "BloodPressure", "Cholesterol"]]
Apply clustering:
X_scaled = scaler.fit_transform(X)
kmeans = KMeans(n_clusters=3, random_state=42)
patients["Cluster"] = kmeans.fit_predict(X_scaled)
Question:
Do patient groups exist?
Supply Chain Example
Suppose we have:
- Inventory Days
- Turn
- Sales
- Margin
X = inventory[["InventoryDays", "Turn", "Sales", "Margin"]]
Cluster retailers:
X_scaled = scaler.fit_transform(X)
kmeans = KMeans(n_clusters=4, random_state=42)
inventory["Cluster"] = kmeans.fit_predict(X_scaled)
Possible output:
Cluster 0: Top Performers
Cluster 1: Growing Stores
Cluster 2: Declining Stores
Cluster 3: Low Activity Stores
Hierarchical Clustering
Another clustering approach.
Instead of specifying:
K = 3
the algorithm builds a hierarchy.
Import:
from scipy.cluster.hierarchy import dendrogram, linkage
Fit:
linked = linkage(X_scaled, method="ward")
Plot:
dendrogram(linked)
plt.show()
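The dendrogram shows the hierarchy, but a segmentation still needs one cluster label per observation. One way to get that, sketched here on the scaled example dataset, is SciPy's `fcluster`, which cuts the tree into a chosen number of flat clusters. The choice of two clusters is an assumption for illustration, and `fcluster` numbers its clusters starting at 1, not 0.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.preprocessing import StandardScaler

customers = pd.DataFrame({
    "Recency": [5, 10, 20, 200, 250, 300],
    "Frequency": [120, 100, 90, 20, 15, 10],
    "Monetary": [50000, 45000, 40000, 5000, 3000, 2000],
})
X_scaled = StandardScaler().fit_transform(customers)

# Build the hierarchy with Ward linkage, as in the lesson.
linked = linkage(X_scaled, method="ward")

# Cut the tree so that at most 2 flat clusters remain.
labels = fcluster(linked, t=2, criterion="maxclust")
print(labels)
```

Changing `t` gives coarser or finer groupings from the same tree without refitting.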
Advantages of K-Means
Fast
Works well on large datasets.
Easy to Understand
Simple concept.
Excellent for Segmentation
Well suited to segmenting:
- Customers
- Patients
- Retailers
- Products
Limitations
Must Choose K
Not always obvious.
Sensitive to Scaling
Always standardize first.
Assumes Spherical Clusters
May miss complex patterns.
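The spherical-cluster limitation is easiest to see on data with curved groups. The sketch below uses scikit-learn's synthetic two-moons generator, which is made-up data rather than anything from this lesson, and scores the agreement between K-Means and the true groups with the adjusted Rand index, where 1.0 means a perfect match.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaving half-circles: clearly two groups, but not round ones.
X, true_groups = make_moons(n_samples=300, noise=0.05, random_state=42)

labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(X)

# Agreement between the K-Means labels and the true groups.
ari = adjusted_rand_score(true_groups, labels)
print(f"Adjusted Rand index: {ari:.2f}")
```

K-Means tends to split this data with a straight boundary, cutting each moon in two, so the score falls well short of 1.0.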
Typical Analyst Workflow
Step 1
Select variables.
X = df[["Recency", "Frequency", "Monetary"]]
Step 2
Scale.
StandardScaler()
Step 3
Determine K.
Elbow Method
Step 4
Fit K-Means.
KMeans()
Step 5
Profile clusters.
groupby("Cluster")
Step 6
Interpret business meaning.
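The workflow can be put together in one short script. This is a minimal sketch on the example dataset. It assumes K = 3 has already been chosen in Step 3, and it chains scaling and clustering with scikit-learn's `make_pipeline`, which is a convenience choice for this illustration, not a requirement.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "Recency": [5, 10, 20, 200, 250, 300],
    "Frequency": [120, 100, 90, 20, 15, 10],
    "Monetary": [50000, 45000, 40000, 5000, 3000, 2000],
})

# Step 1: select variables.
X = df[["Recency", "Frequency", "Monetary"]]

# Steps 2 and 4: scale, then fit K-Means (K = 3 assumed from Step 3).
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=42),
)
df["Cluster"] = pipeline.fit_predict(X)

# Step 5: profile clusters.
profile = df.groupby("Cluster").mean(numeric_only=True)
print(profile)
```

Step 6 remains a human task: the profile table only describes the clusters, and the analyst decides what they mean for the business.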
Real-World Example: RFM Segmentation
This is one of the most common business uses of clustering.
Variables:
- Recency
- Frequency
- Monetary
The algorithm often discovers:
VIP Customers
Regular Customers
At-Risk Customers
Dormant Customers
without predefined labels.
Practical Healthcare Exercise
Cluster patients using:
- Age
- BMI
- Blood Pressure
- Cholesterol
Questions:
- How many patient groups exist?
- Which cluster has the highest risk profile?
Practical Supply Chain Exercise
Cluster customers using:
- Recency
- Frequency
- Monetary
Questions:
- Which customers are VIPs?
- Which customers need reactivation?
- Which customers have growth potential?
Lesson Summary
In this lesson we learned:
- Unsupervised Learning
- K-Means Clustering
- Feature Scaling
- Cluster Assignment
- Cluster Profiling
- Elbow Method
- Hierarchical Clustering
- Customer Segmentation
- Healthcare Applications
- Supply Chain Applications
Clustering is one of the most powerful exploratory tools because it reveals hidden structure in data without requiring labeled outcomes.
In the next lesson we will study Time Series Forecasting, where we learn how to predict future values such as sales, demand, inventory levels, and hospital admissions.
