
Data Science: Fixing the Talent Shortage

Supposedly, there is a shortage of data scientists. In fact, a few companies (LinkedIn, Google, Facebook, Apple, Intel, Twitter, and hot start-ups) get tons of applicants, while the others get few. Also, the applicants that companies are interested in, based on our tests (posting a few decoy data scientist resumes and applying to 300+ posted data science jobs), fit the following profile:

  • College degree from a prestigious university (Harvard, Berkeley, Stanford, Carnegie Mellon, MIT, Indian Statistical Institute, Imperial College, EPFL, etc.) 
  • Young, fresh out of college, or an employee from one of the previously mentioned companies. Ideally someone who did an apprenticeship or an internship (Data Science Central currently has a nuclear physicist helping us with a number of projects, doing a post-doc at Columbia University, former graduate from EPFL)

Pretty much everyone else is out of luck, based on our LinkedIn application tests. Since this is a field dominated by rather young males (75%), at least in mid-level manager roles, being a young attractive female definitely helps short-term, regardless of qualifications.

Hiring companies have tunnel vision. I think it is caused by their recruiters, who can't identify great potential (confusing it with a narrow skill set), but also by company interviewers who are opposed to changes such as de-siloing or lean algorithms and infrastructures. Some of these interviewers feel that their job, as a manager, is jeopardized by the new breed of candidates. Some managers and executives, favoring protectionism, will fight hard against the new paradigm until their retirement, to maintain their personal revenue streams rather than adapt, despite claiming the opposite. And frankly, should companies hire true creative disruptors? Nope: these people are very good at competing against you, but not good at working for you. They make great entrepreneurs. Also, some hiring managers don't really know how to recognize a real data scientist, since many university programs still produce fake data scientists, although this situation is improving. And some applicants have no clue about the complex business model of the companies they apply to, which is a no-no unless your salary expectations are well below $50K/year.

It will take some time to change these perceptions, but one quick fix is to develop or use extremely robust black-box or automated data science solutions. These solutions, developed by independent researchers such as Dr. Granville in the last few years, are now getting very mature. The ones developed 20 years ago by old-fashioned statisticians (and still very much in use), while very sophisticated, are notoriously sensitive to misuse by unqualified people. The new ones are not: they are being designed to be used efficiently by robots (as in machine-to-machine communications), drones or zombies. They out-perform anything taught in standard university curricula or used in private companies, in terms of robustness and even accuracy; they are just as safe as automated driving (Google cars): scalable, robust, redundant and efficient. Ironically, they rely on similar paradigms to avoid failures. Many vendors are working on implementing these solutions to provide enterprise analytic products that can be used by non-statisticians and by software engineers or developers, without causing over-fitting, failed predictions, or other drawbacks. These solutions typically come with light training to explain the few obvious parameters associated with these simple, scalable, adaptive, robust, noise-proofed, efficient algorithms, as well as training on how to identify or create sound metrics and data sets that will provide the best ROI.

Here are a few modern black-box techniques that will be released soon (or are already available), as open intellectual property, in the upcoming free book Automated Data Science, published by the co-founder of Data Science Central and forged in our private, self-funded, neutral, grant-independent data science research lab:


Comment by Vincent Granville on August 13, 2014 at 9:46pm

Alex, I believe that some of the graduates you mention are great candidates to develop automated data science. There are cars without drivers - I mean without human drivers - and they crash less frequently than those driven by humans. Same with planes: I've seen some military airplanes land in very deep fog on autopilot.

Comment by Alex Esterkin on August 11, 2014 at 10:13pm

The ending should be in another blog. I don't see how it may fix the talent shortage. The best way to become a Data Scientist is to get a graduate degree in Computer Science or Mathematics focused on data processing, statistics, modeling, etc. There are partially automated computer-guided surgery systems, but you still need a 100% educated surgeon to operate them. Properly applying black-box data science solutions might necessitate more education, not less - regardless of whether they are easy to use.

Comment by Homer Roderick on August 11, 2014 at 11:42am

Wow! Many nails hit squarely on the head in a brief blog post. Exponential change is rounding the curve and many you describe will soon be wondering "what happened?" if they are clinging to inertia.

© 2015   Data Science Central
