Category: Data Science / Machine Learning
Status: Completed
Duration: 2 months

Customer Segmentation App

ML-powered CRM analytics for actionable business intelligence.


Project Overview

Businesses often hold large volumes of transaction data without a clear view of their customer segments. This app bridges that gap with a self-service tool for RFM (Recency, Frequency, Monetary) analysis, letting non-technical stakeholders identify high-value customer segments through KMeans clustering.

Segment Accuracy: ~94%
Processing Speed: <2s
Data Scalability: Large CSV

Key Features

Automated RFM Calculation

Intelligent mapping of transactional data into Recency, Frequency, and Monetary scores.
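As a rough illustration of this mapping, the sketch below aggregates a transaction table into one row per customer. The column names (customer_id, order_id, order_date, amount), the compute_rfm helper, and the snapshot date are assumptions made for the example, not the app's actual schema or code.

```python
import pandas as pd


def compute_rfm(transactions: pd.DataFrame, snapshot_date: pd.Timestamp) -> pd.DataFrame:
    """Return one row per customer with raw recency, frequency, and monetary values."""
    return transactions.groupby("customer_id").agg(
        # Days between the snapshot date and the customer's most recent order
        recency=("order_date", lambda d: (snapshot_date - d.max()).days),
        # Number of distinct orders placed
        frequency=("order_id", "nunique"),
        # Total amount spent
        monetary=("amount", "sum"),
    )


# Hypothetical sample data
transactions = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C", "C"],
    "order_id": [1, 2, 3, 4, 5, 6],
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-03-01", "2023-11-20",
        "2024-02-10", "2024-02-25", "2024-03-10",
    ]),
    "amount": [50.0, 70.0, 20.0, 10.0, 15.0, 30.0],
})

rfm = compute_rfm(transactions, pd.Timestamp("2024-03-11"))
# Customer A last ordered on 2024-03-01, so recency is 10 days,
# with 2 orders totalling 120.0.
print(rfm)
```

The raw values could then be binned into scores (for example by quantile) or fed to the clustering step directly.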

Interactive KMeans Clustering

Dynamic cluster sizing with visual feedback using Scikit-learn.

Dynamic Data Export

Generate summaries and export segmented customer lists for targeted marketing.
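A minimal sketch of what the summary and export step might look like with Pandas. The segmented table, its column names, and the export_segment helper are hypothetical and only illustrate the idea.

```python
import io

import pandas as pd

# Hypothetical output of the clustering step: one row per customer with a segment label
segmented = pd.DataFrame({
    "customer_id": ["A", "B", "C", "D"],
    "monetary": [120.0, 20.0, 55.0, 80.0],
    "segment": [0, 1, 1, 0],
})

# Per-segment summary: customer count and average spend
summary = segmented.groupby("segment").agg(
    customers=("customer_id", "count"),
    avg_monetary=("monetary", "mean"),
)


def export_segment(df: pd.DataFrame, segment: int) -> str:
    """Return the customers of one segment as CSV text."""
    buffer = io.StringIO()
    df.loc[df["segment"] == segment, ["customer_id", "monetary"]].to_csv(buffer, index=False)
    return buffer.getvalue()


csv_text = export_segment(segmented, 0)
```

In a Streamlit app, CSV text like this could be handed to st.download_button so users can save the list for a campaign.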

System Architecture

A Python-centric data processing pipeline focused on speed and mathematical accuracy.

Processing Layer

Data cleaning, encoding detection, and feature engineering for RFM scores.

Pandas
NumPy

ML Engine

KMeans clustering with an adjustable number of clusters, tuned interactively to find a good fit for the data.
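One common way to choose the number of clusters is to compare silhouette scores across candidate values of k. The sketch below assumes that approach; the project description does not say which tuning criterion the app uses, and the pick_k helper and synthetic data are purely illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def pick_k(features, k_range=range(2, 7), seed=42):
    """Fit KMeans for each candidate k and return the k with the highest silhouette score."""
    # Standardize so recency, frequency, and monetary contribute on the same scale
    scaled = StandardScaler().fit_transform(features)
    scores = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(scaled)
        scores[k] = silhouette_score(scaled, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic RFM-like data: three well-separated customer groups (assumed for the demo)
rng = np.random.default_rng(0)
centers = np.array([[5, 20, 1000], [60, 5, 200], [200, 1, 30]], dtype=float)
features = np.vstack([
    center + rng.normal(scale=[2, 1, 20], size=(50, 3)) for center in centers
])

best_k, scores = pick_k(features)
print(best_k, scores)
```

In an interactive tool, the same scores could be plotted next to a slider so the user sees how each choice of k affects cluster quality.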

Scikit-learn

Visualization

Interactive 3D and 2D charts for cluster distribution and density analysis.

Plotly
Seaborn

Engineering Challenges

Handling inconsistent date formats and monetary currency symbols across various user-uploaded datasets.

Developed a robust regex-based preprocessing engine that automatically sanitizes and normalizes datetime and float columns.
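The sanitization step could look roughly like the sketch below. The regex, the list of date formats, and the helper names are assumptions for illustration. It assumes "." as the decimal separator and "," as the thousands separator, so European-style amounts such as "1.234,50" would need extra handling.

```python
import re
from datetime import datetime

# Anything that is not a digit, decimal point, or minus sign is treated as noise
_NON_NUMERIC = re.compile(r"[^\d.\-]")

# Candidate formats tried in order (an assumed list; extend as needed)
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %b %Y")


def clean_amount(raw: str):
    """Strip currency symbols, separators, and whitespace, then parse as float."""
    cleaned = _NON_NUMERIC.sub("", raw)
    return float(cleaned) if cleaned else None


def parse_date(raw: str):
    """Return the first successful parse across DATE_FORMATS, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
    return None


print(clean_amount("$1,234.50"))   # 1234.5
print(parse_date("05/03/2024"))    # 2024-03-05 00:00:00
```

Returning None for unparseable values lets the caller decide whether to drop the row or flag it for the user.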

Keeping clustering fast for datasets exceeding 100k rows in a browser-based app.

Vectorized the data preparation with NumPy and used k-means++ initialization for KMeans.
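A small sketch of how vectorized preprocessing and k-means++ initialization fit together. The synthetic data, the log transform, and the choice of four clusters are assumptions for the example, not details taken from the app.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic RFM matrix with 100k rows (assumed distributions, for illustration only)
rng = np.random.default_rng(1)
n_rows = 100_000
rfm = np.column_stack([
    rng.integers(1, 365, size=n_rows),    # recency in days
    rng.integers(1, 50, size=n_rows),     # frequency as order count
    rng.gamma(2.0, 150.0, size=n_rows),   # monetary as total spend
]).astype(float)

# Vectorized preprocessing: whole-array operations, no Python-level row loops
logged = np.log1p(rfm)  # dampen skew (an assumed choice)
scaled = (logged - logged.mean(axis=0)) / logged.std(axis=0)

# k-means++ spreads the initial centroids apart, which usually speeds up convergence.
# It is scikit-learn's default; it is passed explicitly here for clarity.
model = KMeans(n_clusters=4, init="k-means++", n_init=1, random_state=0)
labels = model.fit_predict(scaled)
```

With a single k-means++ initialization (n_init=1) the fit stays cheap at this size; raising n_init trades speed for more stable results.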

Tech Stack

Python
Streamlit
Pandas
NumPy
Scikit-learn
Matplotlib
Seaborn
Plotly
KMeans Clustering
RFM Analysis

My Role

Data Scientist & Developer
  • Architected the data pipeline and selected the ML model.
  • Implemented the Streamlit frontend for interactive parameter tuning.
  • Created the automated RFM calculation logic and visualization suite.
