It speaks volumes about the world we live in today that headlines such as “The world’s most valuable resource is no longer oil, but data” and “Why Data May Be More Valuable Than Dollars” are commonplace. With the explosion of IoT, and an estimated 2.5 quintillion bytes of data being created per day, the underlying power of this data comes as no surprise.
Unlike gold, however, data is ubiquitous and is being created at an exponential rate. So where’s the value in something that is everywhere? Unlike comparable commodities such as gold or money, the value of data does not come from the volume you hold (although this is a good starting point!). Rather, it comes from the insight you can gain from the data that you already have.
So how do you go about gaining insight from your data? This is the important question and the final goal that you need to keep in mind.
Earlier methods of gaining insight from data include ETL (Extract, Transform, Load), which involves copying the data from its source systems, transforming it and loading it into a data warehouse. Data can then be extracted from this “data archive” to gain a business advantage.
While this method has been successful in the past, the need for faster delivery of data insights is outpacing the capabilities of ETL and data warehousing for the following reasons:
The ETL process involves making copies of the data, then physically moving each copy and loading it into the data warehouse. This process is very expensive, as is storing the data in the warehouse.
Making copies of the data is problematic because you end up with a number of versions scattered across the enterprise, which can quickly become unmanageable. Furthermore, if various teams are working independently of each other and making changes to these copies, it becomes nigh on impossible to tell the original from the latest versions.
Extracting, cleaning and loading data into the data warehouse not only takes a long time, it also requires many hands on deck.
The data warehouse enforces uniform data formats even though much of the data comes from different sources. This homogenization of the data can result in the loss of important value.
It takes a lot of time and energy to build a data warehouse, so any reorganization of business processes or change to source systems further down the line will disrupt the end result. As you might imagine, realigning the data warehouse to match up with new business processes is no walk in the park.
The insight gleaned from data taken from a data warehouse is only valid at a certain point in time. This is due to the time delay caused by the ETL processes.
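To make the copy-and-load pattern and its time delay concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a real warehouse setup: the two in-memory SQLite databases, the table names `orders` and `orders_fact`, and the function `run_etl` are all hypothetical stand-ins for a source system and a data warehouse.

```python
import sqlite3

# Hypothetical source system holding order amounts in cents.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1250), (2, 990)])
source.commit()

# Hypothetical data warehouse: a separate store that only holds copies.
warehouse = sqlite3.connect(":memory:")


def run_etl(src: sqlite3.Connection, dst: sqlite3.Connection) -> None:
    """Copy orders from the source into the warehouse (full reload)."""
    # Extract: read every row out of the source system.
    rows = src.execute("SELECT id, amount_cents FROM orders").fetchall()
    # Transform: convert cents into a decimal amount.
    transformed = [(order_id, cents / 100) for order_id, cents in rows]
    # Load: replace the warehouse copy with the transformed rows.
    dst.execute(
        "CREATE TABLE IF NOT EXISTS orders_fact (id INTEGER PRIMARY KEY, amount REAL)"
    )
    dst.execute("DELETE FROM orders_fact")
    dst.executemany("INSERT INTO orders_fact VALUES (?, ?)", transformed)
    dst.commit()


run_etl(source, warehouse)

# A new order arrives in the source after the ETL run has finished.
source.execute("INSERT INTO orders VALUES (3, 500)")
source.commit()

# The warehouse still reflects the moment of the last load.
warehouse_total = warehouse.execute("SELECT SUM(amount) FROM orders_fact").fetchone()[0]
source_total = source.execute("SELECT SUM(amount_cents) FROM orders").fetchone()[0] / 100
warehouse_rows = warehouse.execute("SELECT COUNT(*) FROM orders_fact").fetchone()[0]
source_rows = source.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After the new order arrives, the source holds three rows while the warehouse still holds two, so any total computed from the warehouse stays out of date until `run_etl` is run again.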
Nowadays, with data flowing in from every facet of an organization, there has been a shift in the way we approach this data. The business advantage no longer lies in the amount of data that you have, but in how quickly you can use it to inform your important decisions.
The need for faster delivery of data insights requires a technology that is advanced enough to integrate and gain value from heterogeneous sources; agile enough to accommodate changes to business processes without affecting the architecture; and fast enough to provide solutions in real time.
Data virtualization is one such technology. It connects all disparate data sources in real time and creates an independent abstraction layer which sits between the sources and the consuming applications. Anyone within the enterprise who needs to access particular data can do so through the consuming applications, without being exposed to the data complexities (like the source, format or type of data). In this way the data becomes democratized: the insights are made available to everyone enterprise-wide.
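The idea of an abstraction layer can be sketched in a few lines of Python. This is a simplified illustration, not how any particular data virtualization product works: the `VirtualLayer` class, the view names and the two example sources (an in-memory SQLite database and a CSV export) are all hypothetical. The point it tries to show is that consumers ask for data by a logical name, and the layer reads the source at request time instead of keeping a copy.

```python
import csv
import io
import sqlite3
from typing import Callable, Dict, List

Row = Dict[str, object]


class VirtualLayer:
    """Hypothetical abstraction layer: logical view names mapped to live readers."""

    def __init__(self) -> None:
        self._views: Dict[str, Callable[[], List[Row]]] = {}

    def register(self, name: str, reader: Callable[[], List[Row]]) -> None:
        self._views[name] = reader

    def query(self, name: str, **filters: object) -> List[Row]:
        # The reader runs on every call, so no copy of the data is stored here.
        rows = self._views[name]()
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


# Source 1 (assumed): a relational CRM database.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT, region TEXT)")
crm.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", "EU"), (2, "Globex", "US")],
)

# Source 2 (assumed): a CSV export from a billing system.
billing_csv = {"text": "customer_id,balance\n1,120.50\n2,80.00\n"}


def read_customers() -> List[Row]:
    cur = crm.execute("SELECT id, name, region FROM customers")
    cols = [c[0] for c in cur.description]
    return [dict(zip(cols, row)) for row in cur]


def read_balances() -> List[Row]:
    reader = csv.DictReader(io.StringIO(billing_csv["text"]))
    return [
        {"customer_id": int(r["customer_id"]), "balance": float(r["balance"])}
        for r in reader
    ]


layer = VirtualLayer()
layer.register("customers", read_customers)
layer.register("balances", read_balances)

# Consumers see plain rows and never touch SQL or CSV parsing themselves.
eu_before = layer.query("customers", region="EU")
acme_balance = layer.query("balances", customer_id=1)

# A change at the source is visible on the next query, with no reload step.
crm.execute("INSERT INTO customers VALUES (3, 'Initech', 'EU')")
eu_after = layer.query("customers", region="EU")
```

In this sketch the consuming code only knows the names `customers` and `balances`. Swapping the CSV export for another source would mean registering a different reader, while the consumers stay unchanged.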