Machine Learning: K-Means Overview

Published on 2021/12/29, Wednesday, Singapore

The K-Means algorithm is one of the most widely used clustering methods in practice. It is a form of unsupervised learning: it learns from unlabelled data rather than labelled data and tries to find the “structure” or “pattern” in that data. As a clustering algorithm, it aims to automatically group the data into coherent clusters[1]. Typical use cases include customer segmentation[2], social network analysis, and document clustering.
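To make "grouping the data into coherent clusters" concrete, here is a minimal sketch of the standard K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) in plain Python. The function names, the toy 2-D points, and the choice of random initialization from the data are illustrative assumptions, not taken from the posts linked below; in practice a library implementation such as scikit-learn's `KMeans` would normally be used.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, n_iters=100, seed=0):
    """Illustrative K-Means: returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: initialize centroids as k distinct points sampled from the data.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = mean_point(members)
    return labels, centroids


# Toy data: two visually obvious groups in 2-D.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which group receives label 0 depends on the random initialization, so only the grouping is meaningful, not the label values.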

This post is an overview and serves as the directory for the K-Means posts.

KMeans Intuition and Mathematics
- Cost Function, Optimization, and Initialization
Prepare data for K-means clustering
<!--
- Use Cases
- Customer Segmentation
  - Data comes from a Customer Data Platform (CDP); once clustering is done, the cluster IDs are fed back to the CDP for targeting.
  - Data comes from a survey; once clustering is done, a comparative analysis is run across segments, and based on it we can choose a descriptive name for each segment.
  - The key is finding a balance between the number of clusters and how actionable their insights are for marketing purposes. Too many clusters isn’t practical: you would have different segments with very specific attributes and you’d miss commonalities among the groups. Too few clusters would put customers who might be significantly different from one another in the same segment, and you’d miss an opportunity to tailor your message.
-->

Reference & Resources

Unsupervised Learning, Machine Learning from Stanford University
Supervised Machine Learning course, IBM
Different Approaches to Segmentation, Data Analytics Methods for Marketing, Meta
k-Means, Kirenz’s Blog on Clustering
