Big Data Realization: We are in a transition phase


I recently found a post by Ali Syed titled “Europeans unconvinced by big data in general”.
The study behind it was conducted for the Vodafone Institute for Society and Communications by Kantar-owned market researchers TNS Infratest. It analyses over 8,000 individuals across eight European countries and offers valuable insight into people’s perceptions of big data and analytics.

Key findings of the Vodafone survey:


There are still people and groups who do not believe that we are living in the age of Big Data. Put another way, they think Big Data is just hype created by technology people.

To me, the concept of Big Data is not new. We are only gradually becoming able to measure and analyze it. This is just the beginning, so people are only gradually becoming convinced about Big Data. I remember attending a conference on Big Data in March 2013 where, during the high tea session, a professor of statistics argued that there is no such thing as Big Data, because to apply any computation or statistical learning to it you first need to reduce it.

Encountering Big Data is inevitable. One possible reason people remain unconvinced about Big Data is that they cannot access it; sometimes seeing is believing. Big Data companies are releasing their previously internal tools and massive datasets to push the world a step forward and let people experience Big Data for themselves. In November 2015 Google released TensorFlow, an open-source machine learning library widely used for deep learning on large datasets, and in January 2016 Yahoo released what it called the largest-ever machine learning dataset for researchers: 13.5 TB of uncompressed data.
According to the director of research at Yahoo Labs:

“Many academic researchers and data scientists don’t have access to truly large-scale datasets because it is traditionally a privilege reserved for large companies. We are releasing this dataset for independent researchers because we value open and collaborative relationships with our academic colleagues, and are always looking to advance the state-of-the-art in machine learning and recommender systems.”

We are in a transition period, and there is no such thing as absolute Big Data; it is a relative term. For some, a few gigabytes is Big Data; for others, a petabyte is normal. It all depends on storage capacity and on the nature and capability of the computation available.
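To make the "relative term" point a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the helper name `classify_for_machine`, the three categories, and the laptop and server sizes are hypothetical. Only the 13.5 TB figure comes from the Yahoo release mentioned earlier in this post.

```python
# Illustrative only: the thresholds and machine specs below are assumptions,
# not a standard definition of "Big Data".
GB = 10**9
TB = 10**12


def classify_for_machine(dataset_bytes: int, memory_bytes: int, disk_bytes: int) -> str:
    """Describe how a dataset relates to one machine's resources."""
    if dataset_bytes <= memory_bytes:
        return "fits in memory"
    if dataset_bytes <= disk_bytes:
        return "fits on disk, not in memory"
    return "needs distributed storage"


# 13.5 TB is the uncompressed size quoted for the Yahoo dataset.
yahoo_dataset = int(13.5 * TB)

# Hypothetical machines: a laptop (16 GB RAM, 1 TB disk)
# and a large server (512 GB RAM, 100 TB disk).
laptop = classify_for_machine(yahoo_dataset, memory_bytes=16 * GB, disk_bytes=1 * TB)
server = classify_for_machine(yahoo_dataset, memory_bytes=512 * GB, disk_bytes=100 * TB)

print("laptop:", laptop)
print("server:", server)
```

The same 13.5 TB lands in a different category depending on the machine it is measured against, which is all "relative" means here.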

This journey toward ever-growing datasets continues, but one can ask:
Can we think of the largest possible useful dataset that we could ever store? My answer is yes, because such a dataset was imagined by the great scientist Laplace.


Laplace’s demon states:
“We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.”— Pierre Simon Laplace, A Philosophical Essay on Probabilities


The same concept is rephrased by Stephen Hawking as:

“In effect what he said was, that if at one time, we knew the positions and speeds of all the particles in the universe, then we could calculate their behaviour at any other time, in the past or future.”

The big challenge to Laplace’s hypothesis is Heisenberg’s Uncertainty Principle, but even so, as Stephen Hawking says, “However, it was still possible to predict one combination of position and speed.”

If one day we are able to create such a dataset, then I’m sure no one will remain unconvinced 😉

There are a number of possible candidates for the largest useful dataset we could store, and I’d love for you to share your ideas in the comments.

In conclusion, we are in a transition period: people are becoming aware that datasets keep growing and that traditional data processing tools are not sufficient to deal with these emerging massive datasets.