
This question was recently posted by Larry Wasserman on the Normal Deviate blog (see extract below). Larry is a statistics and machine learning professor at Carnegie Mellon University.

Here is my answer:

Data science is more than statistics: it also encompasses computer science and business concepts, and it is far more than a set of techniques and principles. I can imagine a data scientist without a degree; that is not possible for a statistician. But the core of the issue, in my opinion, is explained below.

  • I am one of the people who contributed to the adoption of the keyword data science. Ironically, I'm a pure statistician (Ph.D. in statistics, 1993, computational statistics), although I have changed a lot since 1993 and am now an entrepreneur. The reason I tried hard to move away from being called a statistician to being called something (anything) else is the American Statistical Association: it killed the keyword statistician and limited career prospects for future statisticians by making the term almost exclusively associated with the pharmaceutical industry and small data (where most of its revenue comes from). It missed the boat (on purpose, I believe) on the new statistical revolution that came along with big data over the last 15 years.
  • Statisticians should be very familiar with computer science, big data and software: 10 billion rows with 10,000 variables should not scare a true statistician. On the cloud (or even on my laptop, as streaming data), it gets processed very fast. The first step is data reduction, but even if you must keep all observations and variables, it is still feasible. And good computer scientists also produce confidence intervals: you don't need to be a statistician for that, just use the First AnalyticBridge Theorem (if you are curious, check out the Second AnalyticBridge Theorem). The distinction between computer scientist and statistician is getting thinner and fuzzier over the years. The things you did not learn at school (in statistics classes), you can still learn online.
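To make the streaming point concrete, here is a minimal sketch of how a mean and a confidence interval can be computed in a single pass, without holding the data in memory. It is not the First AnalyticBridge Theorem mentioned above: it assumes the textbook normal-approximation interval and uses Welford's online algorithm for the variance. The function name `streaming_mean_ci` and the default `z = 1.96` (roughly a 95% interval) are illustrative choices, not something from the original post.

```python
import math
from typing import Iterable, Tuple


def streaming_mean_ci(values: Iterable[float], z: float = 1.96) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) from one pass over `values`.

    Hypothetical helper: uses Welford's online update, so memory use
    stays constant no matter how many rows are streamed through.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n < 2:
        raise ValueError("need at least two observations")
    variance = m2 / (n - 1)                    # unbiased sample variance
    half_width = z * math.sqrt(variance / n)   # normal-approximation half width
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Any iterable works, including a generator reading rows from a file.
    stream = (float(i % 7) for i in range(1_000_000))
    print(streaming_mean_ci(stream))
```

Because the function consumes any iterable, the same code applies whether the rows come from a list, a file read line by line, or a network stream. The interval itself is only as good as the normal approximation and the assumption of independent observations.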

This diagram misses a few key concepts - including business and domain knowledge

Here's the article:

As I see newspapers and blogs filled with talk of “Data Science” and “Big Data” I find myself filled with a mixture of optimism and dread. Optimism, because it means statistics is finally a sexy field. Dread, because statistics is being left on the sidelines.

The very fact that people can talk about data science without even realizing there is a field already devoted to the analysis of data — a field called statistics — is alarming. I like what Karl Broman says:

When physicists do mathematics, they don’t say they’re doing “number science”. They’re doing math.

If you’re analyzing data, you’re doing statistics. You can call it data science or informatics or analytics or whatever, but it’s still statistics.

Well put.

Maybe I am just pessimistic and am just imagining that statistics is getting left out. Perhaps, but I don’t think so. It’s my impression that the attention and resources are going mainly to Computer Science. Not that I have anything against CS of course, but it is a tragedy if Statistics gets left out of this data revolution.

Two questions come to mind:

1. Why do statisticians find themselves left out?

2. What can we do about it?

Read full article.


Comment


Comment by John Larimore on April 24, 2013 at 2:47pm

I am at the beginning of my career, and am definitely facing this ambiguity. I am getting my Bachelor of Science in less than a month, and am in an absolute panic because my computing skills are limited to a single SAS programming class and using R and Minitab in assignments. I am taking a MOOC in machine learning, got a JavaScript internship, and am generally scrapping to put together a strong enough computational skill set to compete. All of that said, if I could do it over I would have done more computing in college, but would not have changed majors. I am still glad I know the difference between probability distributions, know some of the theoretical connections between probability and statistics, etc.

Comment by Andres Rincon on April 24, 2013 at 9:56am

Dear Vincent, I believe it is not the end of statistics; it is the beginning of something larger. All the pieces that you depict in the diagram make sense as a move toward a much more complete point of view. However, it is quite important to involve the statistician in all of them in order to get useful results.

If we do not realize this, maybe someone else will, but it is up to statisticians to lead others toward a better understanding of the information.
