We are now at 9 categories after a few updates. Just as there are several categories of statisticians (biostatisticians, statisticians, econometricians, operations research specialists, actuaries) or business analysts (marketing-oriented, product-oriented, finance-oriented, etc.), there are different categories of data scientists. First, many data scientists have a job title other than data scientist; mine, for instance, is co-founder. Check the "related articles" section below to discover 400 potential job titles for data scientists.

Categories of data scientists

  • Those strong in statistics: they sometimes develop new statistical theories for big data that even traditional statisticians are not aware of. They are experts in statistical modeling, experimental design, sampling, clustering, data reduction, confidence intervals, testing, predictive modeling and other related techniques.
  • Those strong in mathematics: NSA (National Security Agency) or defense/military people working on big data, astronomers, and operations research people doing analytic business optimization (inventory management and forecasting, pricing optimization, supply chain, quality control, yield optimization) as they collect, analyze and extract value out of data.
  • Those strong in data engineering, Hadoop, database/memory/file systems optimization and architecture, APIs, Analytics as a Service, optimization of data flows, data plumbing.
  • Those strong in machine learning / computer science (algorithms, computational complexity)
  • Those strong in business, ROI optimization, decision sciences, involved in some of the tasks traditionally performed by business analysts in bigger companies (dashboard design, metric mix selection and metric definitions, high-level database design)
  • Those strong in production code development, software engineering (they know a few programming languages)
  • Those strong in visualization
  • Those strong in GIS, spatial data, data modeled by graphs, graph databases
  • Those strong in a few of the above. After 20 years of experience across many industries, in big and small companies (and lots of training), I'm strong in statistics, machine learning, business and mathematics, and more than just familiar with visualization and data engineering. This could happen to you as well over time, as you build experience. I mention this because so many people still think that it is not possible to develop a strong knowledge base across multiple domains that are traditionally perceived as separate (the silo mentality). Indeed, that's the very reason why data science was created.

Most of them are familiar with, or expert in, big data.

There are other ways to categorize data scientists; see, for instance, our article on Taxonomy of data scientists. A different categorization would be creative versus mundane. The "creative" category has a better future, as mundane work can be outsourced (anything published in textbooks or on the web can be automated or outsourced; job security depends on how much you know that no one else knows or can easily learn). Along the same lines, we have science users (those using science, that is, practitioners; often they do not have a PhD), innovators (those creating new science, called researchers), and hybrids. Most data scientists, like geologists helping predict earthquakes or chemists designing new molecules for big pharma, are scientists, and they belong to the user category.

Implications for other IT professionals

You (engineer, business analyst) probably already do a bit of data science work, and already know some of what data scientists do. It might be easier than you think to become a data scientist. Check out our book (listed below in "related articles") to find out what you already know and what you need to learn to broaden your career prospects.

Are data scientists a threat to your job or career? Again, check our book (listed below) to find out what data scientists do, whether the risk for you is serious (you = the business analyst, data engineer or statistician; risk = being replaced by a data scientist who does everything), and how to mitigate that risk (learn some of the data scientist skills from our book, if you perceive data scientists as competitors).

Related articles

Comment

Comment by Dermot Cochran on December 23, 2015 at 7:04am

What do you call the last category e.g. a data science generalist or a true data scientist?

My career has been a mixture of those 6 different sub-fields, but only recently has it been called 'Analytics'.

Comment by Pradyumna S. Upadrashta on June 2, 2015 at 6:35am

Depends largely on the area of application.  However, I imagine that familiarity with math-stats as applied to statistical learning is common.

Comment by Hassan Ashraf on June 2, 2015 at 5:24am

What are the typical subject areas in Mathematics that Data Scientists tend to use more? Thank you !

Comment by Stephen Boesch on February 17, 2015 at 10:48pm

The above categories are too broad and too many. The core members of the Data Scientist designation would mostly possess an advanced degree in Math/Stats/AI or Machine Learning-focused CS or possibly some social science field. There will be some persons coming from at least a Masters in STEM and several years of research-oriented work in machine learning conducting data-related experiments. These would be needed to qualify as data "Scientists". Otherwise the "scientist" portion of the term will have been watered down. (Note: those restrictions exclude *me*. I am qualified to work alongside DS's but would not for a second name myself as one.)

Comment by Ihe Onwuka on February 8, 2015 at 1:32pm

There is a snag with the reasoning in the last paragraph and I can only speak to data engineering, software engineering and machine learning because that is my background. 

People who are genuinely good in these areas are actually quite rare. There is a lot more to being a really good programmer than knowing a few programming languages and knowing a few programming languages doesn't really count if they are all of the same type because  Java and C++ for example do not entail a different mindset and/or approach to problem solving. The practising programmer who actually learned how to program properly is the exception not the rule. 

When I took Andrew Ng's Coursera Machine Learning course he would repeatedly say from (I think) as  early as after he taught logistic regression, that people on the course already knew more than many machine learning practitioners in Silicon Valley.

Since I am not a mathematician or statistician I can't speak to what happens to (below) average performers in those disciplines, but in my field people can get by and produce something that works (or gives the appearance of doing so) despite being, let's say, distinctly average (I'm being kind). Primarily this is because once something works or looks like it works, the tendency is not to pay too much attention to the technical debt accumulated as a result of skills deficiencies in the workmanship.

Comment by Pradyumna S. Upadrashta on January 5, 2015 at 7:05am

Daren Scot Wilson: Electronics would carry over well if you can leverage that as a specialist in M2M ("machine to machine") connectivity, specifically as it relates to the IoT ("Internet of Things"). It's also useful to be able to model the "Things" in the IoT as an engineer might view them. This would support Data Science within the IoT space, even if it is not strictly Data Science. There are many lucrative opportunities supporting Data Science that for some reason have been ignored by the media.

Comment by Alex Esterkin on October 31, 2014 at 6:04pm

There is an overarching litmus test, and only those who can pass this litmus test may be called Data Scientists.

Data Scientists use the Scientific Method

This is a fundamental distinction.  I know that many members of this forum may disagree, but in my opinion, this is basic common sense.

Comment by Daren Scot Wilson on January 17, 2014 at 5:25pm

Very interesting.  Coming from physics, electronics, space science, graphics and visual arts, and wanting to move my career into Data Science, clearly data visualization is a good strength.  That was obvious before, though. What this helps me see more clearly is what other strengths based on my past experiences I could rely on for the near future. 

Although, I don't see how much of electronics would carry over.  Data Science isn't much about designing gadgets, is it?  OTOH, I keep seeing good overlap with astronomy and high energy physics - where I first learned about statistics beyond the basics.

Comment by Vincent Granville on January 17, 2014 at 3:09pm

A visual way to show how the different components interact would be to represent them in a graph (as in graph theory or graph databases) where each node is a domain (computer science, statistics) or subdomain, and each edge represents a relationship (mother, daughter, sister) between nodes, with a weight indicating the strength of the association. Multiple mothers are allowed for each node, so it's definitely more complicated than a tree structure.
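The graph described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `DomainGraph`, the example domains, and every weight are hypothetical, not taken from the article. An edge from a mother domain to a daughter domain carries a weight, a node may have several mothers, and sisters are nodes that share at least one mother.

```python
from collections import defaultdict


class DomainGraph:
    """Directed, weighted graph; an edge mother -> daughter has a strength."""

    def __init__(self):
        self.children = defaultdict(dict)  # mother -> {daughter: weight}
        self.parents = defaultdict(dict)   # daughter -> {mother: weight}

    def add_edge(self, mother, daughter, weight):
        # Store the edge in both directions so lookups are cheap either way.
        self.children[mother][daughter] = weight
        self.parents[daughter][mother] = weight

    def mothers(self, node):
        # A node may have several mothers, so this is not a tree.
        return sorted(self.parents.get(node, {}))

    def sisters(self, node):
        # Sisters share at least one mother with the given node.
        found = set()
        for mother in self.parents.get(node, {}):
            found.update(self.children[mother])
        found.discard(node)
        return sorted(found)


# Hypothetical domains and weights, chosen only for illustration.
g = DomainGraph()
g.add_edge("statistics", "machine learning", 0.8)
g.add_edge("computer science", "machine learning", 0.9)
g.add_edge("computer science", "data engineering", 0.7)
g.add_edge("statistics", "experimental design", 0.6)

print(g.mothers("machine learning"))
print(g.sisters("machine learning"))
```

Keeping both a mother-to-daughter and a daughter-to-mother index is one simple way to answer "who are my mothers" and "who are my sisters" without scanning every edge; a graph database would offer the same queries natively.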
