Supposedly, there is a shortage of data scientists. Indeed, a few companies - LinkedIn, Google, Facebook, Apple, Intel, Twitter, and hot start-ups - get tons of applicants, but the others get few. The applicants that companies are actually interested in, based on our tests (posting a few decoy data scientist resumes and applying to 300+ posted data science jobs), are as follows:
Pretty much everyone else is out of luck, based on our LinkedIn application tests. Since this is a field dominated by rather young males (75%), at least in mid-level manager roles, being a young attractive female definitely helps in the short term, regardless of qualifications.
Hiring companies have tunnel vision - caused, I think, by recruiters who can't identify great potential (confusing it with a narrow skill set), but also by company interviewers who are opposed to changes such as de-siloing or lean algorithms and infrastructures. Some managers and executives, favoring protectionism, will fight hard against the new paradigm until their retirement, to maintain their personal revenue streams rather than adapt, despite claiming the opposite. And frankly, should companies hire true creative disruptors? Nope: these people are very good at competing against you, but not at working for you - they make great entrepreneurs. Some of these interviewers feel that their job as a manager is jeopardized by the new breed of candidates. Also, some hiring managers don't really know how to recognize a real data scientist, since many university programs still produce fake data scientists, although this situation is improving. And some applicants have no clue about the complex business model of the companies they apply to, which is a no-no unless your salary expectations are well below $50K/year.
It will take some time to change these perceptions, but one quick fix is to develop or use extremely robust black-box or automated data science solutions. These solutions, developed by independent researchers such as Dr. Granville over the last few years, are now quite mature. The ones developed 20 years ago by old-fashioned statisticians (and still very much in use), while very sophisticated, are notoriously sensitive to misuse by unqualified people - but not anymore: new solutions are being designed to be used efficiently by robots (as in machine-to-machine communications), drones or zombies. They outperform anything taught in standard university curricula or used in private companies, in terms of robustness and even accuracy; they are just as safe as automated driving (Google cars) -- scalable, robust, redundant and efficient -- and, ironically, they rely on similar paradigms to avoid failures. Many vendors are working on implementing these solutions to provide enterprise analytic products that can be used by non-statisticians and software engineers / developers, without causing over-fitting, failed predictions, or other drawbacks. These solutions typically come with light training to explain the few obvious parameters associated with these simple, scalable, adaptive, robust, noise-proof, efficient algorithms, as well as training on how to identify or create sound metrics and data sets that will provide the best ROI.
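To make the idea of an automated, over-fitting-resistant solution more concrete, here is a minimal sketch in Python. It is an illustration built on scikit-learn, not the author's own algorithms (which are not detailed here): the user supplies only a table of features and a target, and cross-validation handles model selection and tuning, which is the standard guard against over-fitting.

```python
# Sketch of an "automated, black-box" modelling step (illustrative only, not
# the algorithms referenced in the text): the user provides data, and model
# choice plus regularisation are picked by cross-validation.

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split


def auto_fit(X, y):
    """Try a few robust model families, tune each by 5-fold cross-validation,
    and return the best fitted estimator. No statistical tuning is required
    from the user."""
    candidates = [
        (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
        (RandomForestRegressor(random_state=0), {"n_estimators": [100, 300]}),
    ]
    best_model, best_score = None, float("-inf")
    for estimator, grid in candidates:
        search = GridSearchCV(estimator, grid, cv=5)
        search.fit(X, y)
        if search.best_score_ > best_score:
            best_model, best_score = search.best_estimator_, search.best_score_
    return best_model


if __name__ == "__main__":
    # Synthetic data stands in for a real business data set.
    X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = auto_fit(X_train, y_train)
    print("Held-out R^2:", round(r2_score(y_test, model.predict(X_test)), 3))
```

The point of the sketch is only that the "few obvious parameters" can be tuned automatically and validated on held-out data, so a non-statistician gets a sanity-checked model by default.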
Here are a few modern black-box techniques that will be released soon (or are already available), as open intellectual property, in the upcoming free book Automated Data Science, published by the co-founder of Data Science Central and forged in our private, self-funded, neutral, grant-independent data science research lab: