June 2014 Blog Posts (36)

Interview with Prof. Dr. Bart Baesens - Author of Multiple Business Analytics Books

Professor Bart Baesens is a professor at KU Leuven (Belgium), and a lecturer at the University of Southampton (United Kingdom).  He has done extensive research on analytics, customer relationship management, web analytics, fraud detection, and credit risk management.  His findings have been published in well-known international journals (e.g. Machine Learning, Management Science, IEEE Transactions on Neural Networks, IEEE Transactions on Knowledge and Data Engineering, IEEE…


Added by Vincent Granville on June 11, 2014 at 10:30am — No Comments

Free stuff for data science publishers, authors, bloggers, professors, alumni, event organizers and more

Here are several different ways to leverage Data Science Central for your benefit, at no cost.

  1. For alumni, professors, students and people in academia. List your data science program in our training section. Include a picture of the campus or…

Added by Vincent Granville on June 10, 2014 at 8:00pm — No Comments

Comprehensive list of data science resources

Here we blended together the best of the best resources posted recently on DSC. It would be great to organize them by category, but for now they are organized by date. This is very useful too, since you are likely to have seen old entries already, and can focus on more recent stuff. We plan to update this reference of references on a regular basis.…


Added by Mirko Krivanek on June 10, 2014 at 3:00pm — No Comments

How Cross Pollination Helps Organizations to Ideate & Innovate?

One of the marvels that the age of data and technology presents is the ability to analyze and determine the minutest of details in the world today. Several of these innovative breakthroughs pass unnoticed under the gaze of daily life. Yet it is this dissemination of data and integration of innovation that is intrinsic to the modern world. One field which has risen to the fore of the data deluge is ‘…


Added by Sumit Prasad on June 9, 2014 at 9:51pm — No Comments

Modelling a Data Warehouse

When designing a model for a data warehouse we should follow a standard pattern, such as gathering requirements, building credentials and collecting a considerable quantity of information about the data or metadata. This helps to figure out the formation and scope of the data warehouse. This model of the data warehouse is known as the conceptual model. The general elements of the model are fact and dimension tables. These tables will be related to each other, which will help to identify relationships…
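As a rough illustration of how fact and dimension tables relate, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`dim_product`, `dim_date`, `fact_sales`), the columns and the sample rows are all hypothetical and are not taken from the article; the point is only that the fact table holds measurements plus keys that point at the dimension tables.

```python
import sqlite3
from typing import List, Tuple


def build_star_schema() -> sqlite3.Connection:
    """Create a tiny in-memory star schema (hypothetical tables and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables describe the "who/what/when" of each measurement.
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE dim_date (
            date_id  INTEGER PRIMARY KEY,
            iso_date TEXT NOT NULL
        );
        -- The fact table stores the measurement and keys into each dimension.
        CREATE TABLE fact_sales (
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
            amount     REAL NOT NULL
        );
        INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
        INSERT INTO dim_date VALUES (1, '2014-06-01'), (2, '2014-06-02');
        INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.5);
        """
    )
    return conn


def sales_by_product(conn: sqlite3.Connection) -> List[Tuple[str, float]]:
    """Join the fact table to one dimension and total the amounts per product."""
    return conn.execute(
        """
        SELECT p.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()
```

Calling `sales_by_product(build_star_schema())` totals the sample sales per product name. A real warehouse would have many more dimensions and far larger fact tables, but the join pattern is the same.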


Added by Avesh Dhakal on June 8, 2014 at 7:54am — 1 Comment

Data Shinobi: 2 - The Tree of the Data Shinobi

Having looked at the fundamentals in the first blog, the natural next step is to understand the various types of strategies to "attack" the data and make it reveal useful information. However, there is one step we must take just before that: understand the "enemy", i.e. the problem at hand and the data available.

The Tree of the Data Shinobi:

The tree below is an attempt at categorizing the most commonly…


Added by Amogh Borkar on June 8, 2014 at 2:30am — No Comments

Thresholds, Butterflies, and the Metrics of Phenomena

My favourite explanation of the "butterfly effect" so far is as follows: Under particular conditions, even the tiniest movements of a butterfly can trigger storms and hurricanes. This principle is not limited to butterflies, of course. I think that many of us face pivotal moments in life that leave lasting effects. Perhaps no different than other students, I remember running out of cash during my undergraduate years. I consider this my personal butterfly moment. I had no money for food. I…


Added by Don Philip Faithful on June 7, 2014 at 7:33am — No Comments

About the Curse of Dimensionality


In this article, we will discuss the so-called 'Curse of Dimensionality', and explain why it is important when designing a classifier. In the following sections I will provide an intuitive explanation of this concept, illustrated by a clear example of overfitting due to the curse of dimensionality.
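One way to get a feel for the effect before looking at any classifier is a quick back-of-the-envelope computation. The sketch below is not taken from the article; it assumes data spread uniformly over a unit hypercube, and the function names are made up for illustration. It shows two things: how wide a sub-cube must be to hold a fixed fraction of the data, and how many samples are needed to keep a fixed number of samples per axis.

```python
def edge_length_for_fraction(fraction: float, dims: int) -> float:
    """Edge length of a sub-cube of the unit hypercube that contains
    `fraction` of uniformly distributed data in `dims` dimensions."""
    if not 0.0 < fraction <= 1.0 or dims < 1:
        raise ValueError("fraction must be in (0, 1] and dims must be >= 1")
    # Volume of the sub-cube is edge ** dims, so edge = fraction ** (1 / dims).
    return fraction ** (1.0 / dims)


def samples_for_density(per_axis: int, dims: int) -> int:
    """Samples needed to keep `per_axis` evenly spaced samples along every axis."""
    if per_axis < 1 or dims < 1:
        raise ValueError("per_axis and dims must be >= 1")
    return per_axis ** dims


if __name__ == "__main__":
    for d in (1, 2, 3, 10, 100):
        edge = edge_length_for_fraction(0.1, d)
        needed = samples_for_density(10, d)
        print(f"dims={d:>3}  edge for 10% of data={edge:.3f}  samples for 10 per axis={needed}")
```

Under these assumptions, capturing 10% of the data in 10 dimensions already requires a sub-cube spanning most of each axis, so a "local" neighbourhood is no longer local, and the number of samples needed to keep the same density grows exponentially with the number of features.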

Consider an example in which we have a set of images, each of which depicts either a cat or a dog. We would like to create a classifier that is able to…


Added by Vincent Spruyt on June 6, 2014 at 11:41pm — 3 Comments

A Tour of Machine Learning Algorithms

Originally published by Jasonb on… Ensemble Learning Method


Added by Mirko Krivanek on June 6, 2014 at 5:00pm — No Comments

100+ Interesting Data Sets for Data Science

Read the full list if you find these examples interesting.…


Added by Mirko Krivanek on June 6, 2014 at 5:00pm — 1 Comment

Weekly digest - June 9

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.

Featured Contributions


Added by Vincent Granville on June 4, 2014 at 7:00pm — No Comments

NoSQL & NewSQL Database Adoption 2014

While MongoDB has been the most popular NoSQL database over the past few years, it appears Cassandra has been the most popular over the past six months. Many assert that Cassandra has superior scalability and better data management features and is faster, and that MongoDB has more moving parts and complexity to cause…


Added by Michael Walker on June 4, 2014 at 4:00pm — 1 Comment

Great data challenge: factoring the product of two large primes

How is this related to big data and data science, and why is it such a big deal?

It is important to big data and data science in multiple ways. First, data security and encryption rely on algorithms that typically use an encryption key: the key - at the very core of these algorithms - is essentially the product of two very large prime numbers. While there have been new developments to produce different algorithms…
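To make the problem concrete, here is a toy sketch that factors the product of two small primes by trial division. This is an illustration only, not the approach discussed in the article and not how real keys are attacked: trial division takes on the order of the square root of `n` steps, which is hopeless for numbers hundreds of digits long. The function name is made up for this example.

```python
from math import isqrt
from typing import Tuple


def factor_semiprime(n: int) -> Tuple[int, int]:
    """Return (p, q) with p <= q and p * q == n, found by trial division.

    Raises ValueError if n has no nontrivial factor (for example, n is prime).
    """
    if n < 4:
        raise ValueError("n must be at least 4")
    if n % 2 == 0:
        return 2, n // 2
    # Only odd candidates up to sqrt(n) need to be checked.
    for p in range(3, isqrt(n) + 1, 2):
        if n % p == 0:
            return p, n // p
    raise ValueError("n has no nontrivial factor")


if __name__ == "__main__":
    n = 101 * 103  # a tiny "key" built from two primes
    print(n, "=", " x ".join(str(f) for f in factor_semiprime(n)))
```

Multiplying the two primes is a single operation, while recovering them takes a search whose length grows with the size of the smaller prime. That asymmetry is what the key relies on.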


Added by Vincent Granville on June 4, 2014 at 1:30pm — 5 Comments

40 maps that explain the Internet

Interesting article about the history of the Internet, with some really cool maps. Our upcoming "picture of the week" will come from this article. Check out our most recent weekly digests to discover our previous picture of the week.

Article posted by Timothy B. Lee on…


Added by Vincent Granville on June 3, 2014 at 4:00pm — No Comments

Data Science Has Been Using Rebel Statistics for a Long Time

Many of those who call themselves statisticians just won't admit that data science heavily relies on and uses (heretical, rule-breaking) statistical science, or they don't recognize the true statistical nature of these data science techniques (some are 15 years old), or are opposed to the modernization of their statistical arsenal. They already missed the train when machine learning became a popular discipline (also heavily based on statistics) more than 15 years ago. Now machine learning…


Added by Vincent Granville on June 1, 2014 at 9:30am — 5 Comments


© 2020   Data Science Central ®