Why So Many ‘Fake’ Data Scientists?

Have you noticed how many people are suddenly calling themselves data scientists? Your neighbour, that gal you met at a cocktail party — even your accountant has had his business cards changed!

 

Many people now call themselves ‘data scientists’ because it is the latest fad. The Harvard Business Review even called it the sexiest job of the 21st century! But many of those calling themselves data scientists lack the full skill set I would expect if I were in charge of hiring one.

I see many business analysts who have no understanding of big data technology or programming languages calling themselves data scientists. Then there are programmers from the IT function who understand programming but lack the business skills, analytics skills or creativity needed to be a true data scientist.

Part of the problem here is simple supply and demand economics: There simply aren’t enough true data scientists out there to fill the need, and so less qualified (or not qualified at all!) candidates make it into the ranks.

The second problem is that the role of a data scientist is often ill-defined within the field and even within a single company. People throw the term around to mean everything from a data engineer (the person responsible for creating the software “plumbing” that collects and stores the data) to statisticians who merely crunch the numbers.

A true data scientist is so much more. In my experience, a data scientist is:

  • multidisciplinary. I have seen many companies try to narrow their recruiting by searching only for candidates who have a PhD in mathematics, but in truth, a good data scientist could come from a variety of backgrounds, and may not necessarily have an advanced degree in any of them.
  • business savvy.  If a candidate does not have much business experience, the company must compensate by pairing him or her with someone who does.
  • analytical. A good data scientist must be naturally analytical and have a strong ability to spot patterns.
  • good at visual communications. Anyone can make a chart or graph; it takes someone who understands visual communications to create a representation of data that tells the story the audience needs to hear.
  • versed in computer science. Professionals who are familiar with Hadoop, Java, Python, etc. are in high demand. If your candidate is not expert in these tools, he or she should be paired with a data engineer who is.
  • creative. Creativity is vital for a data scientist, who needs to be able to look beyond a particular set of numbers, beyond even the company’s data sets to discover answers to questions — and perhaps even pose new questions.
  • able to add significant value to data. If someone only presents the data, he or she is a statistician, not a data scientist. Data scientists add value beyond the data itself through insights and analysis.
  • a storyteller. In the end, data is useless without context. It is the data scientist’s job to provide that context, to tell a story with the data that provides value to the company.

If you can find a candidate with all of these traits — or most of them with the ability and desire to grow — then you’ve found someone who can deliver incredible value to your company, your systems, and your field.

But skimp on any of these traits, and you run the risk of hiring an imposter, someone just hoping to ride the data science bubble until it bursts.

What would you add to this list? I’d love to hear your thoughts in the comments below.

About the author: Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.

His new book is Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance. You can read a free sample chapter here.

Comment

Comment by Eric A. King on July 2, 2015 at 8:12am

This article was off to a great start - love the title!  But the ending is not realistic.  Here's my experience:

AGREE

- There's no consensus on what a 'data scientist' really is.

- Because the role is so hyped, everyone and their cousin is loosely donning the title... greatly diluting its value.

DISAGREE

- What you listed as the traits of a great candidate is not a person... it's a unicorn.  People who read this should not expect to hire that 'person.'  VERY few exist.  It's not realistic.  An analytic team has several people, each of whom has some of the traits in your list; together they make up the full function (and then some).  And many who are required on the analytic team are already in-house.

- If a candidate is going to live up to the true "scientist" part, then they're likely an analytic practitioner.  The practitioner fulfills an important function on the analytic team, but is not a project leader (different role).  

- You included traits of an analytic project team lead -- which should not be expected from the analytic practitioner (who I think of as the data scientist) -- and true scientists are not typically interested in being leaders.  Moreover, most scientists and practitioners don't naturally have the soft skills and business acumen to fulfill several of the strategic bullets you included.

In the end, one should understand what is required to establish a full analytic TEAM -- and what the requirements for each role on the team are -- and the limited tactical role of the 'data scientist.'  On some teams, multiple people will serve a common role -- and multiple roles are sometimes filled by one person.  But there are some functions that absolutely should not be fulfilled by the same person (or it will lead to bias in an otherwise objective model, etc).

I find that the industry overall is terribly under-educated in how to establish a functional analytic practice.  They feel that they'll be more effective in analytics by hiring more 'data scientists.'  By doing so, they may build a better rocket... but they'll still have no mission plan or mission control.  They'll continue to do analytics for the sake of analytics.

However, starting to question the role, function and qualifications of the over-glorified 'data scientist' is at least a great place to start!

Comment by Peter Elliot on July 1, 2015 at 11:59pm

It's a great list - if a data scientist had all these attributes he/she would be worthy of the title. A combination of programming and statistical skills means a modern statistician - or a data scientist! Vlad - love the causation comment - it's my pet peeve too. I like these new terms: they help bring understanding of new roles and technologies to non-IT people and acknowledge the changing IT landscape. Having managed data scientist teams, I used to call myself an Analytics manager, but now I would be an Insights manager - and I think that better describes my function.

Comment by Max Galka on July 1, 2015 at 6:18pm

It is a good post. But I would argue that data scientist does not even have a clear definition.

Some people would disagree with your list of skills and say that data scientists are basically statisticians.

Personally, I think anyone whose expertise is turning data into actionable information is a data scientist.

Comment by Milijan Mudrinic on July 1, 2015 at 1:09pm

I would add "fast learner" as a key skill. 

Comment by Chang Hsiung on July 1, 2015 at 9:49am

I am thinking of putting a new title, "hopefully useful models builder", on my business card :)

Comment by Sione Palu on July 1, 2015 at 9:30am

Good article. I've been commenting here at DS Central on this issue for a while now: the term data-scientist is the latest fad.

My job title is one (ie, a data-scientist), but I tell people that I meet (if they ask) that I'm a computer programmer, because that's what I've been doing for a long time. Also, "computer programmer" is more familiar to people than the term data scientist. Some will ask a bit more, "what sort of programming?", and I reply "numerical computing", ie, everything from numerical simulation and statistical/algebraic computation to algorithm implementation (mostly based on the latest publications in the literature). As I mentioned above, I think that the term data scientist is a fad & people jump on the bandwagon because having the word "scientist" attached to their title makes them sound or look awesome.

I think we have to understand that big data analytics has been around for decades, long before the internet era in which a job title can mean anything. Scientists and physicists working at Fermilab, SLAC, Los Alamos, Sandia National Lab, Lawrence Livermore National Laboratory or CERN started big data computation some 50 years ago, long before anyone had even heard of the term big data. Their mission was (and is) to confirm the postulates of theoretical physics or to develop military hardware as R&D for US defense, which is different from what is required of today's so-called data scientist.

Comment by Chang Hsiung on July 1, 2015 at 8:33am

I just decided NOT to use the term "data scientist" on my new business card, even though I have been playing that kind of role for the past 15 years.

Comment by Ali Bakhtiari on July 1, 2015 at 8:09am

Couldn't agree more .... although, as you partly mentioned, I think part of the problem goes back to the fact that to be a true data scientist you need to acquire a wide range of skills, which are not necessarily easy to learn. But as such people are in short supply, companies mostly go with data analysts who present themselves as data scientists... The other issue is the depth of knowledge in each area. If I say I know Hadoop or Java or whatever just because I read a book (or, even worse, read a few slides and watched a few YouTube videos), that doesn't mean I can use them effectively. That's why I don't think even a fresh PhD graduate can call himself/herself a data scientist, as some of those skills need to be learned by working in the field for at least a few years. Anyhow, the sad part for me is that even managers and companies don't have a clear idea of what a data scientist can do for them, so they ask them to do what a data analyst should do... that's a waste.

Comment by Roy W. Haas on July 1, 2015 at 7:46am

"statisticians who 'merely' crunch the numbers" demonstrates a profound ignorance of what statisticians actually do.

Comment by Benjamin Bertincourt on July 1, 2015 at 7:23am

It is interesting how the system works, though: a lot of companies are looking to fill positions they label "data scientist", while the recruiters will acknowledge one-on-one that what they really need is a big data engineer or a data analyst. So much so that if you are looking to get a data scientist job while still building up to it, I would advise you to bluntly call yourself a data scientist even if you feel you are lacking in an area. Just don't hide your shortcomings from recruiters.

Personally, coming from academic research, I have frankly no business knowledge. At best maybe a couple transferable skills to that area. I am more than willing to learn though I feel this is something you learn on the job. And to get that job ... Anyway, I tried labeling myself a data analyst, a research engineer, etc ... at the end of the day, data scientist on my resume and profiles gives me at least the opportunity to talk with recruiters and keep moving forward.