Data science is the result of a paradigm shift taking place in IT. The question was raised recently, and here I explain how and why data science is part of this new paradigm rather than recycled material.
New arsenal of techniques and metrics
Many data science techniques are very different from, if not the opposite of, old techniques that were designed to be implemented on an abacus rather than on computers. These new tools are often model-free.
For instance, new tools include
Indeed, old techniques such as logistic regression and classification trees don't even belong to data science; more stable techniques are used instead. You can find many of them published as open intellectual property in our data science research lab.
The way (big) data is processed has also dramatically changed: it requires optimizing complex Hadoop-like architectures, and computational complexity is no longer an issue in many cases (as long as you use efficient algorithms). It is the time that it takes for data to flow back and forth in data pipeline systems that is now the bottleneck.
A truly new paradigm
Saying that data science is not creating a paradigm shift is like saying that if we claim Earth rotates around the sun rather than the other way around, there is no change in paradigm because, after all, we are still dealing with two celestial bodies and one rotation: nothing changed. By the same logic, using an abacus or a computer means no change in paradigm: we are still dealing with automated computations to obtain more value faster.
The change in paradigm that I am referring to consists of moving away from models to focus on data. It is the data-to-algorithm approach (bottom-up) rather than model-to-data (top-down), and in the process many old tools are becoming obsolete. It also involves working with messy, unstructured data.
Also, big data has caused an explosion in spurious correlations and in wrong analyses and conclusions among people still using the old paradigm. The new paradigm allows you to (just to name a few)
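As a side note on the explosion in spurious correlations mentioned above, here is a minimal sketch of why the problem grows with the number of variables. It is my own illustration, not a technique from the research lab: the function names, the 20 observations per variable, and the 0.5 correlation threshold are all arbitrary assumptions. Every variable is pure random noise, so any pair that looks strongly correlated is correlated by chance alone.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious_pairs(n_variables, n_observations=20, threshold=0.5, seed=0):
    """Count pairs of independent noise variables with |r| above threshold.

    The defaults (20 observations, threshold 0.5) are illustrative
    assumptions, not values taken from the article.
    """
    rng = random.Random(seed)
    # Each column is independent Gaussian noise, so there is no real
    # relationship between any two columns.
    columns = [
        [rng.gauss(0.0, 1.0) for _ in range(n_observations)]
        for _ in range(n_variables)
    ]
    return sum(
        1
        for x, y in itertools.combinations(columns, 2)
        if abs(pearson(x, y)) > threshold
    )


if __name__ == "__main__":
    for k in (10, 50, 200):
        n_pairs = k * (k - 1) // 2
        hits = count_spurious_pairs(k)
        print(f"{k:>4} variables, {n_pairs:>6} pairs, {hits:>5} with |r| > 0.5")
```

The number of pairs grows roughly with the square of the number of variables, so even a small chance of a false hit per pair produces many "significant" correlations once the data set is wide enough.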
Why don't some people see the unfolding data revolution?
They might see it coming but are afraid: it means automating data analyses at a fraction of the current cost, replacing employees with robots, yet producing better insights based on approximate solutions. It is a threat to would-be data scientists.
In addition, data science unifies domains that were previously considered independent silos, adds its own research core, and delivers knowledge (e.g. open-source intellectual property) outside traditional academia. Data science is also a cross-disciplinary field: it is a horizontal, not a vertical, domain. It might appear that nothing new is being created if you follow academic research closely, but that is because innovation is now done outside academia.