Have you noticed how many people are suddenly calling themselves data scientists? Your neighbour, that gal you met at a cocktail party — even your accountant has had his business cards changed!
Many people now call themselves ‘data scientists’ because it is the latest fad. The Harvard Business Review even called it the sexiest job of the 21st century! But in fact, many who use the title lack the full skill set I would expect if I were in charge of hiring a data scientist.
What I see is many business analysts who have no understanding of big data technology or programming languages calling themselves data scientists. Then there are programmers from the IT function who understand programming but lack the business skills, analytics skills or creativity needed to be a true data scientist.
Part of the problem here is simple supply and demand economics: There simply aren’t enough true data scientists out there to fill the need, and so less qualified (or not qualified at all!) candidates make it into the ranks.
The second part of the problem is that the role of a data scientist is often ill-defined within the field and even within a single company. People throw the term around to mean everything from a data engineer (the person responsible for creating the software “plumbing” that collects and stores the data) to a statistician who merely crunches the numbers.
A true data scientist is so much more. In my experience, a data scientist is:
If you can find a candidate with all of these traits — or most of them with the ability and desire to grow — then you’ve found someone who can deliver incredible value to your company, your systems, and your field.
But skimp on any of these traits, and you run the risk of hiring an imposter, someone just hoping to ride the data science bubble until it bursts.
What would you add to this list? I’d love to hear your thoughts in the comments below.
About: Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.
His new book is Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance. You can read a free sample chapter here.
Comment
A data scientist has to be able to conceptualize a business process from the data collected and find the missing steps that are essential to that process.
Why So Many ‘Fake’ Data Scientists?
On a data science website, this question should be answered with analysis. My follow-up questions before we can achieve any insight:
Is there any pattern to the unqualified applicants? A school, ethnicity, etc.?
What is the motivation? (other than $$)
Maybe the job requirements need to be more stringent, requiring some certifications.
This is very true. You cannot learn everything the industry demands overnight. Until now, people were struggling to find the data and its relationships, and to map it to business needs. Now they are struggling to decide which analysis method to apply. There is too much material to digest.
I agree with Kevin Wang and Jamie Lawson. I'm a statistician and computer scientist (by work and degrees). I teach an undergraduate statistics course titled "Intermediate Statistical Packages" in a statistics department, but its main purpose is to learn to answer the big question "So what?" for a targeted audience. More precisely, to make decisions and come up with an action plan. So what if a company finds out that its policies on bonus awards are inconsistent? So what if data analysis disproved the tobacco companies' statement that nicotine level depends on leaf size? Why do we have to wear seat belts or have air bags?
As a CS person, I will always choose code that runs in O(N log(N)) over code that runs in O(N^2). Furthermore, as Jamie implied, understanding the problem helps one build a well-designed and fast tool. It is all in the process.
Big data is a huge volume of random data. Can you say which data, and whose, you are analyzing? Which pain point are you solving for people? Open data, bad and broken data? How many email IDs do you use?
Great article! Interesting. I just analysed how popular specific skills are among Data Scientists using UK public LinkedIn profiles. - http://bit.ly/1SAiuQU
Good article. A data scientist in business is someone who has the skills above (I agree) but who can also separate what would have happened ANYWAY from what happened because the business did things. This is the difference between the vast majority of BI reporting, which hasn't really earned the "I", and data science, which must isolate drivers from outcomes and signal from noise. Increasingly, because testing everything is death by a thousand cuts, the ability to design and wrestle the data to the ground to get at the nuggets that are actually driving the business is really what it is all about. The second piece is being able to tell the story in a compelling way. The narrative is very important.
What you call "computer science" here is really integration. Computer science is the business of understanding the deep nature of mathematical problems and solutions, particularly discrete problems such as those involving graphs. The essence of computer science is to examine a problem, find a solution, and prove that the solution is valid and computationally efficient. Perhaps the best computer scientist I have known is Prof. Sara Baase, who was not a particularly gifted programmer, and was never up on all of the fad tools and languages, but could prove the properties of algorithms in a most lucid way. The utility of these skills in data science is profound. The well-reasoned solution might work fine on an average computer, while a less well-reasoned one might require fan-out over a hundred processors and the infrastructure to support that. One wonders how many of the heavyweight solutions we live with today are heavy just because someone didn't do good computer science. I know that I inherited a project a couple of years ago that was waist deep in MapReduce and other difficult things, and it took overnight to deliver results. We rewrote it without deference to any particular tools and it ran in less than a minute on a laptop. All of the tools that were at some point employed to speed things up simply bulked up the solution.
Vassilios asked: "But why so many fake Data Science Job adds?"
There's an overabundance of job ads for the same reason there was so much over-hype around Big Data: Leadership and HR hear so much about it that they're afraid not to also have it, even though they don't really understand what "it" is. And in the case of data scientists, they have no clue what they're getting, what role on the analytic team they should really fulfill or what value they should expect.
So, they're grossly overpaying for a fairly fictitious role to "do analytics for the sake of analytics" and uncover some "interesting insights" that don't align to organizational goals or value. This will eventually right itself, but in the meantime, it's a heyday for those who loosely don the title and claim they are pragmatic analysts.
© 2021 TechTarget, Inc.