All Blog Posts Tagged 'clustering' (12)

The problem with Cambridge Analytica is not (just) the privacy breach

The problem with Cambridge Analytica is not the privacy breach

TL;DR: Echo chambers created by CA or other political marketing firms are bad for democracy, but you can counter them by following pages, people and content you would normally not follow on FB.…

Added by Jesus Ramos on August 19, 2019 at 10:19am — No Comments

Fine-grained analysis of K-means clustering and where we are using it

K-means is a centroid-based algorithm, meaning that points are grouped into clusters according to their distance (usually Euclidean) from a centroid.
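
The grouping rule described above can be illustrated with a minimal sketch in plain Python. It is not taken from the post itself: the function names (`assign_to_centroids`, `update_centroids`, `kmeans`) and the sample points are hypothetical, and it assumes Euclidean distance and caller-supplied starting centroids.

```python
import math


def assign_to_centroids(points, centroids):
    """Return, for each point, the index of its nearest centroid (Euclidean)."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # Assumption: an empty cluster simply keeps its previous centroid.
            updated.append(old)
            continue
        updated.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return updated


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop moving."""
    labels = assign_to_centroids(points, centroids)
    for _ in range(max_iter):
        updated = update_centroids(points, labels, centroids)
        if updated == centroids:
            break
        centroids = updated
        labels = assign_to_centroids(points, centroids)
    return labels, centroids


# Hypothetical data: two obvious groups in the plane.
labels, centroids = kmeans([(0, 0), (0, 1), (5, 5), (6, 5)], [(0, 0), (5, 5)])
```

In practice the result depends on the starting centroids, so this sketch leaves that choice to the caller rather than hiding it.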

Centroid-based…

Added by satyajit maitra on July 1, 2019 at 6:30am — No Comments

High Density Region Estimation with KernelML

Data scientists and predictive modelers often use 1-D and 2-D aggregate statistics for exploratory analysis, data cleaning, and feature creation. Higher-dimensional aggregations, i.e., three dimensions and above, are more difficult to visualize and understand. High density regions are one example of these N-dimensional statistics. High density regions can be useful for summarizing common characteristics across multiple variables. Another use case is to validate a forecast prediction’s…

Added by Rohan Kotwani on January 3, 2019 at 4:00pm — No Comments

Clustering – Algorithms for Partitioning and Assignments

The K-means algorithm is a popular and efficient approach for clustering and classification of data. My first introduction to the K-means algorithm was when I was conducting research on image compression. In this application, the purpose of clustering was to provide the ability to represent a group of objects or vectors by only one object/vector with an acceptable loss of information. More specifically, a clustering process in which the centroid of the cluster was optimum for the cluster and the…

Added by Faramarz Azadegan on October 31, 2018 at 7:06am — No Comments

Machine Learning Algorithms: Which One to Choose for Your Problem

When I was starting out in data science, I often faced the problem of choosing the most appropriate algorithm for my specific problem. If you’re like me, when you open an article about machine learning algorithms, you see dozens of detailed descriptions. The paradox is that they don’t make the choice any easier.

In this article, I will try to explain basic concepts and give some intuition for using different…

Added by Luba Belokon on October 26, 2017 at 6:00am — No Comments

Recommendation System Algorithms

Today, many companies use big data to make highly relevant recommendations and grow revenue. Among a variety of recommendation algorithms, data scientists need to choose the best one according to a business’s limitations and requirements.

To simplify this task, my team has prepared an overview of the main existing recommendation system…

Added by Luba Belokon on July 28, 2017 at 4:00am — No Comments

R Clustering – A Tutorial for Cluster Analysis with R

1. Objective

First, we will see what R clustering is, then look at the applications of clustering, clustering by similarity aggregation, use of the R amap package, implementation of hierarchical clustering in R, and examples of R clustering in various fields.

2. Introduction to Clustering in…

Added by Sheetal Sharma on July 19, 2017 at 9:00pm — No Comments

Book: Text Analytics with Python

Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data

Text analytics can be a bit overwhelming and frustrating at times, given the unstructured and noisy nature of textual data and the vast amount of information available. "Text Analytics with Python", published by Apress/Springer, is a book packed with 385 pages of useful information based on techniques, algorithms,…

Added by Dipanjan Sarkar on July 14, 2017 at 4:00am — No Comments

Coding graphs for data mining in Python using Java platform

Graphs are the subject of graph theory, a field of mathematics. For data analysis that requires searching for particular patterns, graph-based data mining becomes an important technique. Indeed, in real life, most of the data we have to deal with can be represented as graphs. A typical graph consists of vertices (nodes, cells), and of edges that…

Added by jwork.ORG on June 19, 2017 at 5:30pm — No Comments

Feature engineering for building clustering models

We frequently get questions about whether we have chosen all the right parameters to build a machine learning model. There are two scenarios: either we have sufficient attributes (or variables) and need to select the best ones, or we have only a handful of attributes and need to know whether they are impactful. Both are classic examples of feature engineering challenges.

Most of the…

Added by BR Deshpande on April 16, 2016 at 9:00am — No Comments

Who are alike? Use BigObject feature vector to find similarities

Cluster analysis is a common technique for grouping a set of objects in such a way that the objects in the same group share certain attributes. It’s commonly used in marketing and sales planning to define market segments.

Here at BigObject we adopt a simple approach to exploring the similarities between…

Added by Yuanjen Chen on October 2, 2015 at 1:21pm — No Comments

What clustering method is required for text documents

Let's say a set of documents 'S' has a large set of 'pure' texts.

On all documents in S, I apply a spelling normalisation method, which yields a normalised set S'.

Then I use the chosen method M (which method?) to make clusters in S, obtaining a clustering result C.

Then I use the same method M to make clusters in S', obtaining a clustering result C'.

Finally, I need to test whether there are statistically significant differences between C and C'.
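
One possible starting point for the final comparison step is an agreement score between the two clusterings. The sketch below computes the Rand index, which is a swapped-in illustration and not a method named in the question. The function name `rand_index` and the example labels are hypothetical, and it assumes that position i in both label lists refers to the same document before and after normalisation. It measures agreement only; judging statistical significance would need something further, such as a permutation test.

```python
from itertools import combinations


def rand_index(labels_c, labels_c_prime):
    """Fraction of document pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    documents together, or both keep them apart.
    """
    if len(labels_c) != len(labels_c_prime):
        raise ValueError("both clusterings must label the same documents")
    pairs = list(combinations(range(len(labels_c)), 2))
    if not pairs:
        raise ValueError("need at least two documents to compare")
    agreements = sum(
        (labels_c[i] == labels_c[j]) == (labels_c_prime[i] == labels_c_prime[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical cluster labels for four documents, before (C) and after (C').
score = rand_index([0, 0, 1, 1], [0, 1, 1, 1])
```

Because the score depends only on which documents are grouped together, relabelling the clusters (for example swapping cluster 0 and cluster 1) does not change it.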

Any help in identifying…

Added by MUSHTAQ AHMAD on May 25, 2015 at 11:48am — 3 Comments

© 2019 Data Science Central ®