In the beginning, there was Statistics, and for a time, it was good: it allowed Fisher to measure farm productivity. Then came Computer Science, and for a time, it was good: combined with Stats, it allowed the Allies to save lives and fuel in WW2. Then came the IT barbarians, with their tool-centric religions, fads and next-coolest-framework cult. Analytics went downhill from there.
Sorry for the inflammatory intro, but despite the hyperbole, the fact remains: IT has not helped Analytics to be perceived as a real discipline with holistic approaches to problem-solving; rather, it has blurred the lines and caused it to mutate into that most despicable of terms: BigData.
Nowadays, we can even see BigData ads for IT tools in airplane mags! AIRPLANE MAGS! Right beside one of those perfume ads no one understands! It's so pervasive and full of hype that we can now safely speak of a BigDataBubble, and because I like you people a lot, I submit for your consideration my 2-cent-needle, with which I intend to help burst it.
BigData, apart from being a buzzword, is just the technology to store and move around large amounts of data. It includes the data model (no model is also a model), the data pipelines (whose update frequency must match our business's decision-making frequency), and the underlying infrastructure.
Now, what we do with that data to glean knowledge is called 'Analytics'. Depending on the analytical maturity of your organization, analytics can be a simple moving average, or an ensemble-learning algorithm that fuses a random forest and a Kalman filter.
The former is the product of relatively recent technology developments. The latter has been around for over 20 years. The former is a product of IT (itself the practical application of Computer Science), while the latter is a holistic combination of Statistics (Bayesian or frequentist), Linear Algebra, Heuristic Programming and Computer Science. With the former we only have a large store sitting in our data center doing nothing. With the latter we have knowledge.
So, for the rest of this post (and of your lives), I implore you to speak of 'Analytics' when talking about the Big D. For the sake of our future children.
The natural path for both sides of this equation is to work together, but alas, along with the influence of IT came the culture of those who practice it. And so, we now have preposterous headlines like 'R vs Python: BigData language showdown' (use both, or neither if they don't suit you) and Java developers calling themselves data scientists (in the middle of the danger zone).
Don't get me wrong. I believe data literacy is key to becoming better citizens, better consumers, better professionals, but such a crucial skill must not be taken lightly, and any would-be analyst must make sure to have the proper knowledge, the right mindset, and the correct CULTURE. Adopting a tool as if it were a religion, favoring code elegance over its contribution to the business, and disregarding the business's operational needs altogether is not the correct culture, and, sadly, this is what I've experienced with people from IT who apply for a position on my team.
Analytics (whether big or small) must be used to 3 ends and 3 ends only: 1) developing new products, 2) achieving operational efficiencies, and 3) supporting decision-making for the C-suite. At no time have I mentioned that a new product should be developed with Spring Boot, or that an algorithm that saves us £30m/year in fuel must be written in Python.
We must not fall into the house of mirrors that is a distorted culture. We are analysts: we drive business decisions, we drive policy and even life-changing medical treatments. The least we can do is avoid missing the forest for the trees.
Analytics is a lot more about improving the business and much less about the technology we use to that end.
Sorry, IT vendors.
Read more here: http://www.slideshare.net/xuxoramos/big-data-big-disappointment