Why So Many ‘Fake’ Data Scientists?

Have you noticed how many people are suddenly calling themselves data scientists? Your neighbour, that gal you met at a cocktail party — even your accountant has had his business cards changed!

Many people now call themselves ‘data scientists’ because it is the latest fad. Harvard Business Review even called it the sexiest job of the 21st century! But many who use the title lack the full skill set I would expect if I were in charge of hiring a data scientist.

I see many business analysts with no understanding of big data technology or programming languages calling themselves data scientists. Then there are programmers from the IT function who understand programming but lack the business skills, analytics skills or creativity needed to be a true data scientist.

Part of the problem is simple supply and demand: there aren’t enough true data scientists to fill the need, so less qualified (or entirely unqualified!) candidates make it into the ranks.

The second problem is that the role of a data scientist is often ill-defined within the field and even within a single company. People throw the term around to mean everything from a data engineer (the person responsible for creating the software “plumbing” that collects and stores the data) to a statistician who merely crunches the numbers.

A true data scientist is so much more. In my experience, a data scientist is:

  • multidisciplinary. I have seen many companies try to narrow their recruiting by searching only for candidates who have a PhD in mathematics, but in truth, a good data scientist could come from a variety of backgrounds, and may not necessarily have an advanced degree in any of them.
  • business savvy.  If a candidate does not have much business experience, the company must compensate by pairing him or her with someone who does.
  • analytical. A good data scientist must be naturally analytical and have a strong ability to spot patterns.
  • good at visual communications. Anyone can make a chart or graph; it takes someone who understands visual communications to create a representation of data that tells the story the audience needs to hear.
  • versed in computer science. Professionals who are familiar with Hadoop, Java, Python, etc. are in high demand. If your candidate is not an expert in these tools, he or she should be paired with a data engineer who is.
  • creative. Creativity is vital for a data scientist, who needs to be able to look beyond a particular set of numbers, beyond even the company’s data sets to discover answers to questions — and perhaps even pose new questions.
  • able to add significant value to data. If someone only presents the data, he or she is a statistician, not a data scientist. Data scientists add substantial value to the data through insight and analysis.
  • a storyteller. In the end, data is useless without context. It is the data scientist’s job to provide that context, to tell a story with the data that provides value to the company.

If you can find a candidate with all of these traits — or most of them with the ability and desire to grow — then you’ve found someone who can deliver incredible value to your company, your systems, and your field.

But skimp on any of these traits, and you run the risk of hiring an imposter, someone just hoping to ride the data science bubble until it bursts.

What would you add to this list? I’d love to hear your thoughts in the comments below.

About: Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.

His new book is Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance.

Comment

Comment by Dalila Benachenhou on March 31, 2016 at 12:18pm

I agree with Kevin Wang and Jamie Lawson. I'm a statistician and computer scientist (by work and degrees). I teach an undergraduate statistics course titled "Intermediate Statistical Packages" in a statistics department, but its main purpose is to learn to answer the big question "So what?" for a targeted audience. More precisely, to make decisions and come up with an action plan. So what if a company finds out that its policies on bonus awards are inconsistent? So what if data analysis disproved the tobacco companies' statement that nicotine level depends on leaf size? Why do we have to wear seat belts or have air bags?

As a CS person, I will always choose code that takes O(N log N) over code that takes O(N^2). Furthermore, as Jamie implied, understanding the problem helps one build a well-designed and fast tool. It is all in the process.

Comment by Arindam Samanta on January 15, 2016 at 10:40am

Big data is a huge volume of random data. Can you say which data, and whose, you are analyzing? Which of people's pain points are you solving? Open data, bad and broken data? How many email IDs do you use?

Comment by Michael Shakhomirov on November 20, 2015 at 2:17am

Great article! Interesting. I just analysed how popular specific skills are among Data Scientists using UK public LinkedIn profiles: http://bit.ly/1SAiuQU

Comment by Scott J Ulring on October 29, 2015 at 3:50am

Good article. A data scientist in business is someone who has the skills above (I agree) but who can also separate what would have happened ANYWAY from what happened because the business did things. This is the difference between the vast majority of BI reporting, which hasn't really earned the "I", and data science, which must isolate drivers from outcomes and signal from noise. Increasingly, because testing everything is death by a thousand cuts, the ability to design and wrestle the data to the ground to get at the nuggets that are actually driving the business is really what it is all about. The second piece is being able to tell the story in a compelling way. The narrative is very important.

Comment by Jamie Lawson on October 21, 2015 at 8:12pm

What you call "computer science" here is really integration. Computer science is the business of understanding the deep nature of mathematical problems and solutions, particularly discrete problems such as those involving graphs. The essence of computer science is to examine a problem, find a solution, and prove that the solution is valid and computationally efficient. Perhaps the best computer scientist I have known is Prof. Sara Baase, who was not a particularly gifted programmer, and was never up on all of the fad tools and languages, but could prove the properties of algorithms in a most lucid way. The utility of these skills in data science is profound. The well-reasoned solution might work fine on an average computer, while a less well-reasoned one might require fan-out over a hundred processors and the infrastructure to support that. One wonders how many of the heavyweight solutions we live with today are heavy just because someone didn't do good computer science. I know that I inherited a project a couple of years ago that was waist deep in MapReduce and other difficult things, and it took overnight to deliver results. We rewrote it without deference to any particular tools and it ran in less than a minute on a laptop. All of the tools that were at some point employed to speed things up simply bulked up the solution.

Comment by Eric A. King on July 26, 2015 at 7:39am

Vassilios asked: "But why so many fake Data Science job ads?"

There's an overabundance of job ads for the same reason there was so much over-hype around Big Data:  Leadership and HR hear so much about it that they're afraid not to also have it, even though they don't really understand what "it" is. And in the case of data scientists, they have no clue what they're getting, what role on the analytic team they should really fulfill or what value they should expect.  

So, they're grossly overpaying for a fairly fictitious role to "do analytics for the sake of analytics" and uncover some "interesting insights" that don't align to organizational goals or value. This will eventually right itself, but in the meantime, it's a heyday for those who loosely don the title and claim they are pragmatic analysts.

Comment by Vassilios Rendoumis on July 26, 2015 at 2:18am

The post title seems a reasonably good question. But why so many fake Data Science job ads?

Comment by Kevin Wang on July 25, 2015 at 2:02am

Bernard, I've read many of your interesting articles, but I feel this one is not really accurate and grossly underestimates what a statistician does. A good applied statistician would have most if not all of the skills you defined for a "data scientist". In fact, a statistician neither "merely crunches the numbers" nor "only presents the data". As a statistician who worked in the management consulting sector for a few years, I certainly didn't just "present the data" to my clients. I had to be "business savvy" and "a storyteller", and being "multidisciplinary" is also expected of a statistician.

Now, whether that also makes me a data scientist is a different story, but I'd like to think that I'm proud to be a statistician who does much more than just present the data!

PS -- I actually tried to post this on LinkedIn but had trouble commenting on your link on LinkedIn.

Comment by Orlando Vivin Tucker on July 22, 2015 at 6:14am

This was a very good post!  I have noticed much of this myself.  I especially like the part that says that a true data scientist can not only number crunch like a mere statistician, but can tell a story from that data that the company can benefit from.  I also think that a good data scientist can run away with prescriptive and predictive analytics.  We need more data scientists out there, but at the same time, we need them to be competent in their profession.

Comment by Life Skipper on July 7, 2015 at 9:48pm

Very interesting and timely post.

My 2 cents as a new data scientist (not so new in life, though), from my admittedly limited professional experience so far:

I started studying statistics as a hobby, and after finishing a few courses I realised that I lacked the math skills to go forward.

I had to go back and relearn calculus before I really started understanding the underlying process in what I was doing in practice.

Fact is, I've seen so many "analyses" from "data scientists" and "statisticians" that, if anyone bothered to research a little, would reveal a lack of any understanding of the subject matter.

I've seen reports from "official" bodies that could have been made by a high school student, and not a good one at that.

In all, what I see is that people who learn a few ways to manage data, make a table and throw some numbers into a machine to get some results don't really know what the details are and how they come into play in every case and scenario.

Analysis of political positions and beliefs is the first example that comes to mind.

Living in Greece, I was subjected to a vast amount of lies, of propaganda targeting the "average" citizen.

They report for citizens they consider illiterate in statistics, and they tried and tried to cover up truths and distort reality. And all because, in my view, they lack the skills and experience to understand what these predictions do to society. Not only is it irresponsible and maybe even illegal, but it shows a lack of any real understanding of anything beyond some numbers on paper.

I started in statistics pushed by the need to confront these lies, to try to balance the public dialogue with informed opinions based on real facts, and to produce analyses that take into consideration all aspects that influence a situation, without leaving anything out with the intention of producing a false image or making a prediction with the sole purpose of directing public opinion. These are a danger not only to statistics and science, but to society in general.

You wouldn't want a friend who would sell you as a slave in exchange for a few dollars, would you?

Why would you accept their analysis and advice, then? Why would you consider their predictions at all, if you know they are already biased?

They are lies. Plain lies for a purpose.

So this is what I believe is a vital aspect of this conversation: to put humanity back into science.

To not become data nazis (maybe I am overusing the term "nazi" to describe the situation).

To learn to put the human factor at the center of the analysis.

People who believe they know, or pretend they know, are contributing to an ideology of a new kind of "racism": this one against the poor.

And yes, these self-declared data scientists might be a problem for those of us who are after a job, but in my mind this is not as important as human dignity, self-respect and humanism, qualities that every scientist should have on top of his "scientific mind".

And yes, I am a "leftie", as you'd say in the western world. Not a communist, though, not by a mile. I just want an equal, shared, just world, as most people do (I hope that's still the case).

Sorry if I took you a bit off course, but I thought it would be interesting to bring in the human factor in a different way and talk about the responsibility of data scientists (and all other kinds of scientists) towards our fellow humans, those we supposedly want to live our lives with in peace and prosperity.

Maybe I got it all wrong. Maybe we should all be trying to eat each other. If that's the case, please eat me first. :)

Best regards to all,

George

Halfway along the data science path

PS: It is noticeable how easily some people in teaching positions at universities around the world (usually not the major ones) have each created their own measures, corrections and scales, and their own ways of going about what we'd call analysis. The age-old argument between the Bayesian and frequentist approaches is just one example of what I'm talking about. So anyone who wants to call him/herself a scientist should, in my view, have a sound understanding of all approaches and take them all into consideration in order to act scientifically. :)

If there is an artistic element in data science, in my view it is the ability to perform analysis and prediction that is actually relevant and seeks answers that better the lives of as many people as possible without hurting anyone, not make them less humane.

For me, science has no other purpose. (To explain for the sake of explaining is nothing if the explanation does not better human life.) :)

Sorry for taking up so much space! :)

Again, all the best!

G

© 2016 Data Science Central