Kumaran Ponnambalam's Blog (6)

Predictive Analytics for Unified Communications

More and more organizations are moving to unified communications (UC) platforms to improve communication within the organization, with customers and with partners. These platforms combine voice, email, chat and web into a seamless omnichannel experience for their users. Today they boast a number of features, but most of them provide either static or rule-based experiences. Given that these platforms generate large volumes of data, can this data be used to improve user…

Added by Kumaran Ponnambalam on March 30, 2016 at 4:30pm — No Comments

Apache Spark and R: The best of both worlds

As people working in Data Science and Analytics know, R is one of the best languages for data analytics and machine learning. Its simple, easy-to-use syntax and its huge library of capabilities make it a top Data Science language. But the biggest limitation of R is the amount of data it can process. Its data processing capacity is limited to the memory of a single node (at least in the free version).

Apache Spark is taking the Big…

Added by Kumaran Ponnambalam on March 8, 2016 at 10:04am — 1 Comment

Impact of target class proportions on accuracy of classification

When we build classification models from training data, the proportions of the target classes affect the accuracy of predictions. This is an experiment to measure how large that effect is.
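One way to see why class proportions matter is to look at the accuracy of a trivial model that always predicts the most common class. The sketch below is not the experiment from the post; it is a minimal illustration under stated assumptions: the labels are synthetic, and the class names `buy` and `no_buy` and the helper names are hypothetical.

```python
from collections import Counter


def make_labels(n, positive_fraction):
    """Build a synthetic list of n labels with the given share of 'buy'.

    The class names are hypothetical and only serve the illustration.
    """
    n_pos = round(n * positive_fraction)
    return ["buy"] * n_pos + ["no_buy"] * (n - n_pos)


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


if __name__ == "__main__":
    # The more skewed the classes, the higher the accuracy of this
    # uninformative baseline, so raw accuracy becomes harder to interpret.
    for fraction in (0.5, 0.3, 0.1, 0.01):
        labels = make_labels(1000, fraction)
        print(fraction, majority_baseline_accuracy(labels))
```

With a 10% positive class, this baseline already scores 90% accuracy without learning anything, which is why a model's accuracy is usually judged against the class proportions of the data it was measured on.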



Let us say you are trying to predict which visitors to your website will buy a product. You collect historical data about the visitors' characteristics and actions, and also whether they bought something or not. This is the model building data…

Added by Kumaran Ponnambalam on March 20, 2015 at 12:00pm — 5 Comments

Popular Software Skills in Data Science Job postings.

This exercise was done to understand which software skills are in high demand for Data Science. The analysis was done by extracting job postings from popular online websites. The findings are interesting. R continues to be the most popular skill, found in 70% of the postings. Python follows as a close second. Surprisingly, in spite of all the talk about "Big Data Science", SQL comes up third. This shows that traditional RDBMSs continue to be the base for machine learning work…

Added by Kumaran Ponnambalam on November 21, 2014 at 1:30pm — 3 Comments

Choosing a classifier for predictions

One of the biggest decisions that a data scientist needs to make during a predictive modeling exercise is choosing the right classifier. There is no best classifier for all problems. The accuracy of a classifier varies with the data set. Correlation between the predictor variables and the outcome is a key influencer. The choice needs to be made based on experimentation. There are two main selection criteria here.

Accuracy: While accuracy of the…

Added by Kumaran Ponnambalam on November 4, 2014 at 6:08pm — No Comments

Predictions - Effect of unique number of target classes on accuracy

When we perform classification, the target variable is a categorical (nominal) variable that has a set of unique values, or classes. It could be a simple two-class target variable like "approve application?" with classes (values) of "yes" or "no". Sometimes the classes indicate ranges like "Excellent", "Good" etc. for a target variable like satisfaction score. We might also convert continuous variables like test scores (1 - 100) into classes like grades (A, B, C…
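Converting a continuous variable into classes, as described above for test scores and grades, can be sketched with the standard library. The cutoffs (60, 70, 80, 90) and the five-letter grade scale below are assumptions chosen for illustration; the post does not state which boundaries it uses.

```python
from bisect import bisect_right

# Hypothetical grade boundaries: a score at or above a cutoff moves
# up to the next grade. These values are assumptions, not from the post.
CUTOFFS = [60, 70, 80, 90]
GRADES = ["F", "D", "C", "B", "A"]


def score_to_grade(score):
    """Map a numeric test score to a letter-grade class."""
    # bisect_right counts how many cutoffs are <= score, which is
    # exactly the index of the matching grade.
    return GRADES[bisect_right(CUTOFFS, score)]


if __name__ == "__main__":
    for score in (42, 60, 79, 85, 100):
        print(score, score_to_grade(score))
```

Fewer, wider bins give the model fewer classes to separate; more, narrower bins give more classes, which ties into the question of how the number of unique target classes affects accuracy.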

Added by Kumaran Ponnambalam on October 30, 2014 at 7:00am — 2 Comments

© 2017 Data Science Central