Data is like horses; it can be untamed and unmanageable or it can be trained and useful. Taking data from one state to the other can be routine and well within the ken of any data scientist. But then, sometimes there are wild Clydesdale data sets, big bucking blobs of inconsistency that require the skills of a Data Whisperer.

A Data Whisperer is someone who can see an analytics problem from the data's perspective. The DW understands where the data came from, how it has been used, what its strengths and weaknesses are, and what it needs in order to be reliable, safe, and useful.

Consider a large data set with a lot of missing values. Making sense of the data that is there can depend greatly on what one does about the data that is not there. A tenderfoot might just fence it off and work around it. A seasoned buckaroo might try to saddle it with formulaic proxy data. But a Data Whisperer might use machine learning to predict the missing feature values from the ones that are present.
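To make that last approach concrete, here is a minimal sketch of one way it could look: a k-nearest-neighbors imputer written with only the Python standard library. The function name `knn_impute`, the parameter `k`, and the toy height/weight rows are all hypothetical, chosen for illustration. The sketch assumes numeric features on comparable scales; a real data set would usually need scaling first, and would probably be handled by an established library rather than hand-rolled code.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries in each row from the k most similar complete rows.

    Illustrative sketch only: assumes numeric features on comparable scales.
    """
    # Rows with no missing values are the "herd" we learn from.
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row to learn from")

    filled = []
    for row in rows:
        missing = [i for i, v in enumerate(row) if v is None]
        if not missing:
            filled.append(list(row))
            continue

        present = [i for i, v in enumerate(row) if v is not None]

        # Euclidean distance over only the features this row actually has.
        def dist(other, row=row, present=present):
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in present))

        neighbors = sorted(complete, key=dist)[:k]

        # Predict each missing feature as the mean of the neighbors' values.
        new_row = list(row)
        for i in missing:
            new_row[i] = sum(n[i] for n in neighbors) / len(neighbors)
        filled.append(new_row)
    return filled


# Hypothetical toy data: (height_cm, weight_kg); the last weight is missing.
data = [
    (150.0, 50.0),
    (160.0, 60.0),
    (170.0, 70.0),
    (180.0, 80.0),
    (165.0, None),
]

imputed = knn_impute(data, k=2)
print(imputed[-1])  # [165.0, 65.0]
```

The missing weight is filled with the average of its two nearest neighbors by height (160 cm and 170 cm), rather than with a single column-wide proxy such as the overall mean. Whether that prediction deserves trust still depends on knowing why the value went missing in the first place.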

And the Data Whisperer does in vivo data science, studying the data in action in its natural habitat, observing its conformation at rest and at full gallop, how it is fed, housed, and cared for, and, most importantly, what it was bred to do. (No amount of whispering is going to turn a trotter into a thoroughbred or a reading into a transaction.)

Are you a Data Whisperer?  When you start a project, what steps do you take to get to know the data on a deeper level? How do you use that knowledge to enrich the data and to guide your workflow?  

Tags: Data Whisperer, training data

Comment by Pelita Novena Aprila on February 21, 2016 at 8:05pm

Well, the first thing I do when I get data is look at its structure and the variables. Is it stochastic, spatial, or something else?

Next, I look for every kind of analysis that the data could support.
