An Introduction to Clustering and different methods of clustering

This article was written by Saurav Kaushik. Saurav is a Data Science enthusiast, currently in the final year of his graduation at MAIT, New Delhi. He loves to use machine learning and analytics to solve complex data problems.

Introduction

Have you come across a situation when a Chief Marketing Officer of a company tells you – “Help me understand our customers better so that we can market our products to them in a better manner!”

I did and the analyst in me was completely clueless what to do! I was used to getting specific problems, where there is an outcome to be predicted for various set of conditions. But I had no clue, what to do in this case. If the person would have asked me to calculate Life Time Value (LTV) or propensity of Cross-sell, I wouldn’t have blinked. But this question looked very broad to me!

This is usually the first reaction when you come across a unsupervised learning problem for the first time! You are not looking for specific insights for a phenomena, but what you are looking for are structures with in data with out them being tied down to a specific outcome.

The method of identifying similar groups of data in a data set is called clustering. Entities in each group are comparatively more similar to entities of that group than those of the other groups. In this article, I will be taking you through the types of clustering, different clustering algorithms and a comparison between two of the most commonly used cluster methods.

Overview
Types of Clustering
Types of Clustering Algorithms
K Means Clustering
Hierarchical Clustering
Difference between K Means and Hierarchical clustering
Applications of Clustering
Improving Supervised Learning algorithms with clustering

To check out all this information, click here. For other articles about clustering, click here.

Top DSC Resources

Article: What is Data Science? 24 Fundamental Articles Answering This Question
Article: Hitchhiker’s Guide to Data Science, Machine Learning, R, Python
Tutorial: Data Science Cheat Sheet
Tutorial: How to Become a Data Scientist – On Your Own
Categories: Data Science – Machine Learning – AI – IoT – Deep Learning
Tools: Hadoop – DataViZ – Python – R – SQL – Excel
Techniques: Clustering – Regression – SVM – Neural Nets – Ensembles – Decision Trees
Links: Cheat Sheets – Books – Events – Webinars – Tutorials – Training – News – Jobs
Links: Announcements – Salary Surveys – Data Sets – Certification – RSS Feeds – About Us
Newsletter: Sign-up – Past Editions – Members-Only Section – Content Search – For Bloggers
DSC on: Ning – Twitter – LinkedIn – Facebook – GooglePlus

An Introduction to Clustering and different methods of clustering

Introduction

Table of Contents