Why So Many ‘Fake’ Data Scientists?

Have you noticed how many people are suddenly calling themselves data scientists? Your neighbour, that gal you met at a cocktail party — even your accountant has had his business cards changed!


So many people now call themselves ‘data scientists’ because it is the latest fad. The Harvard Business Review even called it the sexiest job of the 21st century! But in fact, many who call themselves data scientists lack the full skill set I would expect were I in charge of hiring one.

What I see is many business analysts with no understanding of big data technology or programming languages calling themselves data scientists. Then there are programmers from the IT function who understand programming but lack the business skills, analytics skills or creativity needed to be a true data scientist.

Part of the problem here is simple supply and demand economics: There simply aren’t enough true data scientists out there to fill the need, and so less qualified (or not qualified at all!) candidates make it into the ranks.

A second problem is that the role of a data scientist is often ill-defined within the field and even within a single company. People throw the term around to mean everything from a data engineer (the person responsible for creating the software “plumbing” that collects and stores the data) to a statistician who merely crunches the numbers.

A true data scientist is so much more. In my experience, a data scientist is:

  • multidisciplinary. I have seen many companies try to narrow their recruiting by searching only for candidates who have a PhD in mathematics, but in truth, a good data scientist could come from a variety of backgrounds, and may not necessarily have an advanced degree in any of them.
  • business savvy.  If a candidate does not have much business experience, the company must compensate by pairing him or her with someone who does.
  • analytical. A good data scientist must be naturally analytical and have a strong ability to spot patterns.
  • good at visual communications. Anyone can make a chart or graph; it takes someone who understands visual communications to create a representation of data that tells the story the audience needs to hear.
  • versed in computer science. Professionals who are familiar with Hadoop, Java, Python, etc. are in high demand. If your candidate is not an expert in these tools, he or she should be paired with a data engineer who is.
  • creative. Creativity is vital for a data scientist, who needs to be able to look beyond a particular set of numbers, beyond even the company’s data sets to discover answers to questions — and perhaps even pose new questions.
  • able to add significant value to data. If someone only presents the data, he or she is a statistician, not a data scientist. Data scientists add value beyond the data itself through insights and analysis.
  • a storyteller. In the end, data is useless without context. It is the data scientist’s job to provide that context, to tell a story with the data that provides value to the company.

If you can find a candidate with all of these traits — or most of them with the ability and desire to grow — then you’ve found someone who can deliver incredible value to your company, your systems, and your field.

But skimp on any of these traits, and you run the risk of hiring an imposter, someone just hoping to ride the data science bubble until it bursts.

What would you add to this list? I’d love to hear your thoughts in the comments below.

About the author: Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.

His new book is Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance.


Comment by Tom Ke Tao on March 25, 2017 at 11:27am

A data scientist has to have the ability to conceptualize a business process from the data collected and find the missing steps that are essential to that process.

Comment by Michael Akinwumi on March 23, 2017 at 3:23pm

I agree that a supply deficit may be a contributing factor.

Comment by Vijay Sarma on August 24, 2016 at 9:47am

Why So Many ‘Fake’ Data Scientists?
On a data science website, this question should be answered with analysis. My follow-up questions before we can achieve any insight:
Is there any pattern to the unqualified applicants (a school, ethnicity, etc.)?
What is the motivation (other than $$)?
Maybe the job requirements need to be more stringent, requiring some certifications.

Comment by Harshendu Desai on August 23, 2016 at 11:13am

This is very true. You cannot learn everything the industry demands overnight. Up until now, people were struggling to find the data and its relationships, and struggling to map it to business needs. Now they are struggling with analysis and which method to apply. Too much material to digest.

Comment by Dalila Benachenhou on March 31, 2016 at 12:18pm

I do agree with Kevin Wang and Jamie Lawson. I'm a statistician and computer scientist (by work and degrees). I teach a statistics course to undergraduate students titled "Intermediate Statistical Packages" in a statistics department, but its main purpose is to learn to answer the big question, "So what?", for a targeted audience. More precisely, to make decisions and come up with an action plan. So what if a company finds out that its policies on bonus awards are inconsistent? So what if data analysis disproved tobacco companies' statement that nicotine level is dependent on leaf size? Why do we have to wear seat belts or have air bags?

As a CS person, I will always choose code that takes O(N log(N)) over code that takes O(N^2). Furthermore, as Jamie implied, understanding the problem helps one build a well-designed and fast tool. It is all in the process.

Comment by Arindam Samanta on January 15, 2016 at 10:40am

Big data is a huge volume of random data. Can you say which and whose data you are analyzing? Which pain of people are you solving? Open data, bad and broken data? How many email IDs do you use?

Comment by Michael Shakhomirov on November 20, 2015 at 2:17am

Great article! Interesting. I just analysed how popular specific skills are among Data Scientists using UK public LinkedIn profiles. - http://bit.ly/1SAiuQU

Comment by Scott J Ulring on October 29, 2015 at 3:50am

Good article.  A data scientist in business is someone who has the skills above (I agree) but also can separate what would have happened ANYWAY from what happened because the business did things.  This is the difference between the vast majority of BI reporting which hasn't really earned the "I" and data science which must isolate drivers from outcomes, signal from noise.  Increasingly because testing everything is death by a thousand cuts, the ability to design and wrestle the data to the ground to get at the nuggets that are actually driving the business is really what it is all about.  The 2nd piece is being able to tell the story in a compelling way...The narrative is very important. 

Comment by Jamie Lawson on October 21, 2015 at 8:12pm

What you call "computer science" here is really integration. Computer science is the business of understanding the deep nature of mathematical problems and solutions, particularly discrete problems such as those involving graphs. The essence of computer science is to examine a problem, find a solution, prove that the solution is valid and computationally efficient. Perhaps the best computer scientist I have known is Prof. Sara Baase, who was not a particularly gifted programmer, and was never up on all of the fad tools and languages, but could prove the properties of algorithms in a most lucid way. The utility of these skills in data science is profound. The well reasoned solution might work fine on an average computer while a less-well reasoned one might require fan out over a hundred processors and the infrastructure to support that. One wonders how much of the heavyweight solutions we live with today are heavy just because someone didn't do good computer science. I know that I inherited a project a couple of years ago that was waist deep in MapReduce and other difficult things, and it took overnight to deliver results. We rewrote it without deference to any particular tools and it ran in less than a minute on a laptop. All of the tools that were at some point employed to speed things up simply bulked up the solution.

Comment by Eric A. King on July 26, 2015 at 7:39am

Vassilios asked: "But why so many fake Data Science Job adds?"

There's an overabundance of job ads for the same reason there was so much over-hype around Big Data:  Leadership and HR hear so much about it that they're afraid not to also have it, even though they don't really understand what "it" is. And in the case of data scientists, they have no clue what they're getting, what role on the analytic team they should really fulfill or what value they should expect.  

So, they're grossly overpaying for a fairly fictitious role to "do analytics for the sake of analytics" and uncover some "interesting insights" that don't align to organizational goals or value. This will eventually right itself, but in the meanwhile, it's a heyday for those who loosely don the title and claim they are pragmatic analysts.
