Data is like horses: it can be untamed and unmanageable, or it can be trained and useful. Taking data from one state to the other can be routine, well within the ken of any data scientist. But sometimes there are wild Clydesdale data sets, big bucking blobs of inconsistency that require the skills of a Data Whisperer.
A Data Whisperer is someone who can see an analytics problem from the data's perspective. The DW understands where the data came from, how it has been used, what its strengths and weaknesses are, and what it needs in order to be reliable, safe, and useful.
Consider a large data set with a lot of missing values. Making sense of the data that is there can depend greatly on what one does about the data that is not there. A tenderfoot might just fence it off and work around it. A seasoned buckaroo might try to saddle it with formulaic proxy data. But a Data Whisperer might use machine learning to predict the missing feature values from the ones that are present.
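To make the difference between the buckaroo and the Whisperer concrete, here is a minimal sketch in plain Python. It is only one way to do it: the function names, the toy `readings` table, and the choice of a k-nearest-neighbors fill are all assumptions made for illustration, not a prescribed method. The buckaroo's proxy is the column mean; the Whisperer's guess comes from the rows that look most like the one with the gap.

```python
import math

def column_mean(rows, col):
    """Formulaic proxy: the mean of the observed values in one column."""
    observed = [r[col] for r in rows if r[col] is not None]
    return sum(observed) / len(observed)

def knn_impute(rows, k=2):
    """Fill each None with the mean of that column among the k most similar rows.

    Similarity is a root-mean-square distance over the columns both rows have.
    Returns a new table; the input rows are left untouched.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing in common, so no basis for comparison
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that actually have a value in column j.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:  # leave the gap if no comparable donor exists
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Hypothetical toy table: two features per row, one value missing.
readings = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, None],
    [4.0, 40.0],
    [10.0, 100.0],
]

proxy = column_mean(readings, 1)   # 42.5, pulled upward by the outlying row
filled = knn_impute(readings, k=2)
print(proxy, filled[2][1])         # prints: 42.5 30.0
```

On this toy table the column mean is dragged toward the one outlying row, while the neighbor-based fill lands on the value the nearby rows suggest. On a real herd, a tested library imputer and a held-out check of the filled values would be the safer ride.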
And the Data Whisperer does in vivo data science, studying the data in action in its natural habitat, observing its conformation at rest and at full gallop, how it is fed, housed, and cared for, and, most importantly, what it was bred to do. (No amount of whispering is going to turn a trotter into a thoroughbred or a reading into a transaction.)
Are you a Data Whisperer? When you start a project, what steps do you take to get to know the data on a deeper level? How do you use that knowledge to enrich the data and to guide your workflow?