
Here is my top 7 list of daft things that some people say about Big Data.

I think that Big Data does play a role in some businesses. I also think that some of the basic distributed file store and text search technologies can be usefully employed, in non-traditional indexing, counting and correlation. However, there is an awful lot of nonsense said about Big Data.

So, onwards and upwards.

Big Data is like currency

If Big Data is currency, and for most of us it isn't, then it's more like the hyperinflationary money of the Weimar Republic than something you would take to the bank or try to buy the weekly groceries with.

Big Data might have value, and no doubt some of it does – it can't all be dross, can it? But that doesn't make it a solid financial asset class – that's just dopey. The value of data providers, such as those companies supplying financial market and instruments data, is in the service of providing accurate, appropriate and timely data. Data has no significantly greater intrinsic financial value or exchange liquidity than pints of beer or glasses of wine. It can have time and place utility, of course, as a product or as a service, but it is not like a central-bank backed currency – not even close.

Big Data contains gold

Call me an old-fashioned cynic, if you must, but I don't believe for one moment that we have now learned how to turn lead into gold, or for that matter, Big Data into golden nuggets.

You see, if Big Data contains gold, and it doesn't, then it would be more like FeS2 than anything else. Or, to use the vernacular, it would be more like fool's gold than Welsh gold. Which, considering the quantity of hype surrounding Big Data, is the most apt analogy.

Big Data is for everybody

If you want to see what is important for everyone, then start with Maslow's hierarchy of needs, a pyramid of essential human motivators that curiously hasn't been expanded to include Big Data. Maybe that's because it's not an essential human need.

I would also like to mention that quite a number of Big Data projects have nothing to do with Big Data or Hadoop at all.

Big Data is solving fundamental world problems

Doing professional taxi drivers out of a living income is not solving world problems, no matter how many times one chants Big Data Uber alles.

We know what the fundamental world problems are, and we know, more or less, how to solve them. In this respect we don't need Big Data to tell us which way is up. Consider this:

Maybe Big Data can inform us that...

  • 1 child dies every 4 seconds
  • 14 children die every minute
  • A 2011 Libya conflict-scale death toll every day
  • A 2010 Haiti earthquake occurring every 10 days
  • A 2004 Asian Tsunami occurring every 11 days
  • An Iraq-scale death toll every 19–46 days
  • Just under 7.6 million children dying every year
  • Some 92 million children dying between 2000 and 2010

"The silent killers are poverty, hunger, easily preventable diseases and illnesses, and other related causes. Despite the scale of this daily/ongoing catastrophe, it rarely manages to achieve, much less sustain, prime-time, headline coverage." Source: Global Issues

Maybe if we knew all of this (and we really do) then we could do something about tackling the problems.

Maybe we should also be less eager to instrumentalise real suffering in order to flog aging technology and price-gouging services, and actually do something about problems we know of, using solutions that are available to us, to the benefit of those less fortunate than us.

Big Data is new

Big Data is characterised by its volumes, its velocities of generation and communication and its varieties.

The thing is, data has always been growing in volume, and there has never been a time when it has actually decreased. Data has also been generated at an ever faster rate, a trend that doesn't look like stopping anytime soon. Lastly, the variety in format and content of data objects has been growing since the early eighties (that much I can vouch for) and, to some significant extent, before then as well.

So nothing of significance is new. Not that this matters. But data has always been about volumes, types and velocity, and those factors didn't suddenly become relevant at the turn of the millennium. Neither is the technology new: most of the technology labelled as Big Data technology is a collection and configuration of technologies that are decades old.

Big Data will replace the Data Warehouse

To be precise, this is about variations on the theme of distributed file systems and text search and count (e.g. Hadoop) replacing Enterprise Data Warehousing.

There are two axioms that can be applied when considering Hadoop and Data Warehousing:

  1. If you can replace your Data Warehouse technology with Hadoop then your 'Data Warehouse' is not a Data Warehouse in any sort of Inmon (or even Kimball) way.
  2. If you structure your EDW data in 3NF and your data mart data using dimensional modelling, then Big Data technology will not be able to support the EDW in the ways that relational products from Oracle, IBM, Microsoft, Teradata and EXASolutions can.

In short, using Hadoop as part of the tech stack for an Analytics Data Store makes absolute sense. Trying to shoehorn a mature strategic and tactical decision support platform into Hadoop is just daft.

Whilst we are on the subject, even staging data with Hadoop is overkill, and simply adds another point of failure in the Data Warehouse process.

As for doing ETL with Hadoop and Java? Please….

Big Data is a universal panacea

There are legitimate and ethical proponents of Big Data and indeed some applications seem, at least superficially, to make some good sense. However, there are an awful lot of unscrupulous or wilfully uninformed Big Data punters around, who are in my view giving the impression that there are no limits to what can be achieved with Big Data. As if Big Data were the ultimate universal panacea.

Of course, to put it simply, that's not true. It never was, nor is it likely that it ever will be.

That's all folks

It has been proven that Big Data technology is not only useful for some companies, but it's absolutely essential for them. However, not everyone will have the same or similar business models and drivers as Google, Twitter, Facebook and Amazon. In fact, most other businesses are not really like these internet advertising companies in any significant way when it comes to the data they collect and the ways that they can apply it. That is, very few other companies are basing their business models on advertising revenue streams, data brokerage and search.

That stated, efficient distributed file stores, text search and word counting will have their place; it's just that this place isn't for everyone.

Many thanks for reading

As always, please share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps even send me a LinkedIn invite. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.

#Hadoop #BigData #BigDataAnalytics #Decency #Ethics

Comment by Randy Bartlett on June 16, 2015 at 9:17am

Agreed, many businesses are not quite analytically mature enough to harvest Big Data.  

I get it, 'Big Data uber alles!' 

Comment by Vladimir Sevastyanov on June 15, 2015 at 1:26pm

Martyn, I understand your point - it is based on current condition of the big data science, which makes very first steps, and brings more questions than answers. We are trying to use algorithms of Applied Statistics to large datasets, and have very questionable results. But the algorithms have been designed for analysis of small datasets of highly reproducible physical experiment, and do not work well with extremely noisy large datasets. This is just one of many challenges typical for Big Data, and preventing us from "turning leads into gold."

I believe that large volume of data creates previously unknown opportunities to develop new technologies, which would allow us to consider any business as an object of optimization. This would allow improving any business based on its data, which in turn would make Big Data work like a currency.

© 2017 Data Science Central
