Should you hire someone who knows all the most recent flavors of logistic regression? Or a Hadoop developer?
In my opinion, this is the wrong strategy. These employees are very expensive (at least $120k per year), and they might not bring the ROI that you expect. If you do go in that direction, at least hire someone who favors simple, scalable, robust, automated solutions over anything else. To automate, you need someone great at developing extremely robust solutions that can handle faulty data or processes that occasionally crash. These techniques will be described in our next book. People who are good at this usually hate mundane tasks so much that they always automate or outsource them. You could start your hiring interview with this topic, asking what the candidate has successfully automated, and how.
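To make "robust to faulty data and occasional crashes" concrete, here is a minimal Python sketch. It is not the method from the forthcoming book; the function names (`parse_row`, `clean`, `run_with_retries`, `flaky_load`) and the sample rows are hypothetical, invented only for illustration. The idea is simply to quarantine bad records instead of stopping, and to retry a step that fails intermittently.

```python
import time


def parse_row(row):
    """Parse one raw 'name, value' record; raises on faulty data."""
    name, value = row.split(",")
    return name.strip(), float(value)


def clean(rows):
    """Split rows into parsed records and rejected (faulty) raw rows."""
    good, rejected = [], []
    for row in rows:
        try:
            good.append(parse_row(row))
        except (ValueError, AttributeError):
            # Faulty record: set it aside for later review, keep going.
            rejected.append(row)
    return good, rejected


def run_with_retries(task, attempts=3, delay_seconds=0.0):
    """Call task() until it succeeds or the attempts are used up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"task failed after {attempts} attempts") from last_error


# Hypothetical data source that crashes twice before succeeding.
calls = {"count": 0}


def flaky_load():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated crash")
    return ["a, 1.5", "b, not_a_number", "c, 2", None]


rows = run_with_retries(flaky_load)
good, rejected = clean(rows)
print(good)      # parsed records
print(rejected)  # rows set aside as faulty
```

A candidate who automates well will usually describe something along these lines unprompted: what happens to bad records, and what happens when a step fails at 3 a.m.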
I believe you should hire someone with deep domain expertise and strong analytic acumen. A guy who
Another solution is to outsource to vendors, and that's what we've been doing successfully at Data Science Central for a few years: all our data sources come from external vendors, and data production, cleaning, and summarization are automated. Then you have three options. You can hire a data scientist to process the data in question. You can use robust data science tools, assuming you have someone on your team who can decide which analytic vendors are best for you (it's good to have two vendors, not just one). Or you can have an executive with strong analytic acumen spend 5-10% of his time analyzing these data summaries. That's what I do as the co-founder and CFO at Data Science Central, because I also have the whole financial, silo-free picture, and that helps a lot when blending everything to make decisions. It really helps to have great interactive dashboards to access the data; usually your vendors provide these dashboards, and you can access them as web apps anywhere, on a plane or in your office. Sometimes the greatest insights do not show up in dashboards, so you need some vision to "see" what nobody sees, then get the dashboard improved. In my case, one example involved creating a segmentation by ISP rather than by traditional customer segments, to identify failing ISPs that had a truly significant negative impact on marketing campaigns. This also illustrates the fact that one trustworthy person who knows all aspects of the business can sometimes be better than an entire team, as in 1 + 1 + 1 < 1 (such people are not that rare, but very few of them are geeks or classical data scientists).
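The ISP example can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the records are made up, delivery rate stands in for whatever campaign metric you actually track, and the 0.8 threshold is arbitrary. The point is only that the same data, grouped by a different key, can expose a problem that the usual customer segments hide.

```python
from collections import defaultdict


def delivery_rate_by_segment(records, key):
    """Group campaign records by `key`; return {segment: delivered / sent}."""
    sent = defaultdict(int)
    delivered = defaultdict(int)
    for rec in records:
        sent[rec[key]] += rec["sent"]
        delivered[rec[key]] += rec["delivered"]
    return {seg: delivered[seg] / sent[seg] for seg in sent if sent[seg] > 0}


def failing_segments(rates, threshold):
    """Return the segments whose rate falls below the threshold, sorted."""
    return sorted(seg for seg, rate in rates.items() if rate < threshold)


# Made-up campaign data, for illustration only.
records = [
    {"isp": "isp_a", "customer_segment": "new", "sent": 1000, "delivered": 980},
    {"isp": "isp_a", "customer_segment": "returning", "sent": 1000, "delivered": 970},
    {"isp": "isp_b", "customer_segment": "new", "sent": 500, "delivered": 200},
    {"isp": "isp_b", "customer_segment": "returning", "sent": 500, "delivered": 210},
    {"isp": "isp_c", "customer_segment": "new", "sent": 1500, "delivered": 1460},
    {"isp": "isp_c", "customer_segment": "returning", "sent": 1500, "delivered": 1480},
]

by_customer = delivery_rate_by_segment(records, "customer_segment")
by_isp = delivery_rate_by_segment(records, "isp")

print(failing_segments(by_customer, 0.8))  # no customer segment stands out
print(failing_segments(by_isp, 0.8))       # one ISP clearly does
```

In this toy data both customer segments sit near 88% delivery, so a dashboard built on them looks merely mediocre, while the ISP view shows one provider at 41% dragging everything down.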
So where do you find this kind of data science talent? Analytic acumen is a gift that at least 10% of the population has, though most who possess it (it is not something you can easily learn at school) are not data scientists. Some are lawyers, psychologists, geographers, or doctors, and it might be worth interviewing some of these people. Management consultants are also a good bet. I will soon provide tests to assess true analytic acumen, though some questions are in my book already. These are NOT the little traditional tricks such as the famous Microsoft interview question involving elevators (succeeding in these tests is not correlated with success at work; Google agrees with me on this).
Also, I believe it is a mistake to look for the top guru who knows many data mining techniques in detail. Horizontal knowledge combined with domain expertise is better, as you will be dealing with a versatile employee (great for small companies). A candidate who is suspicious about predictive models, metric choices, and data quality is worth interviewing, unless he/she can't bring change in a smooth way or has no alternative to traditional predictive modeling. You need to find someone who can learn any useful technique in a few hours, without his/her employer or anybody else having to spend money on expensive training, so test his/her Google search and information discovery skills during the job interview. You want to look for people who can self-learn the right stuff easily. The good news is that these people are far more numerous than data scientists, and less expensive. The purpose of my data science apprenticeship and our book is actually to create such people. You can also identify these potential hires by looking at what they publish in their blogs. So first, you need to identify these blogs. I will post a list of these blogs in the next 30 days.