No Holy Grail: Moving Beyond the Data Warehouse

Cloud data, cubes, relational databases, flat files, unstructured data... the list goes on and on. Ten years ago, many would have said that the future of data and business intelligence lay in unified, cohesive data warehouses where organizations could store all of their data sources (after ETL, of course). The future is here, but many organizations are finding that their databases are maturing and growing faster than they can be integrated into the data warehouse.

This creates a serious problem. Businesses must deliver new data into their business intelligence and analytics tools, but the data warehouse is often the organization's defined standard. Standards can be exceedingly difficult to change, not just technically but politically as well.

There's no reason to kill the data warehouse
The solution may surprise you. It is not "the death of the data warehouse", but rather the acceptance of a multi-database ecosystem for analytics and reporting.

If an organization has relied on a data warehouse for transactional data, why change that? Most likely the system serves its function well and has already been paid for. The change comes when the organization adds a new data source. Let's use Salesforce as an example. Five years ago, the likely method for analyzing and using Salesforce data would have been to develop a process to bring Salesforce into the data warehouse and integrate it with all of the other data. Now with the advent of tools that can connect directly to Salesforce in the cloud, it makes much more sense to simply connect to Salesforce directly with your analytics or business intelligence tool. No ETL, no intermediary database, no extra time or expense.

Letting the new and the old join forces
Once the analytics tool is connected to both the data warehouse and Salesforce, the trick is being able to use them in conjunction with one another. One of the most interesting innovations in the most recent generation of business intelligence tools has been data blending: combining two or more data sources by defining a common key within the analytics tool. In other words, you skip the ETL process and the intermediary database, connect directly to each data source, and combine them within the analytics or BI tool.
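To make the idea of blending on a common key concrete, here is a minimal sketch in Python. It is not how any particular BI tool implements blending. It assumes an in-memory SQLite table standing in for the data warehouse, a hard-coded list standing in for records pulled from a cloud CRM, and a shared `account_id` field; the function names and sample values are all hypothetical.

```python
import sqlite3


def load_warehouse_revenue():
    """Stand-in for the data warehouse: total revenue per account from SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (account_id TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("A1", 120.0), ("A1", 80.0), ("A2", 50.0)],  # hypothetical sample rows
    )
    rows = conn.execute(
        "SELECT account_id, SUM(amount) FROM orders GROUP BY account_id"
    ).fetchall()
    conn.close()
    return dict(rows)


def load_crm_opportunities():
    """Stand-in for records fetched directly from a cloud CRM (hypothetical data)."""
    return [
        {"account_id": "A1", "open_opportunities": 3},
        {"account_id": "A3", "open_opportunities": 1},
    ]


def blend(primary, secondary_rows, key):
    """Keep every key from the primary source and attach secondary fields on a match."""
    secondary = {row[key]: row for row in secondary_rows}
    blended = []
    for k, revenue in sorted(primary.items()):
        match = secondary.get(k, {})
        blended.append(
            {
                key: k,
                "revenue": revenue,
                # None when the secondary source has no row for this key
                "open_opportunities": match.get("open_opportunities"),
            }
        )
    return blended


result = blend(load_warehouse_revenue(), load_crm_opportunities(), "account_id")
```

Neither source is copied into the other here: each is read where it lives, and the combination happens only at analysis time. Accounts missing from the secondary source keep their warehouse figures, with the blended field left empty.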

The benefit of this technique is that it significantly simplifies the process of adding data and databases. No longer will there be arguments about adding new data because of the challenge of integrating it with the data warehouse. As database technologies come and go, they can enter and leave the analytical process (via the analytics software itself) with little or no negative impact. The best part is that business users finally have the flexibility to easily add the small datasets that give context to databases (like Excel files and CSVs). Data blending enables a fluidity of data that is nowhere near possible with a traditional "make it all fit" data warehouse.
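As a rough illustration of adding a small context dataset, the sketch below attaches a region label from a CSV to rows that could have come from a larger database. The CSV contents, field names, and helper functions are hypothetical, and the CSV is embedded as a string only so the example is self-contained.

```python
import csv
import io

# Hypothetical contents of a small CSV file maintained by a business user.
REGION_CSV = """account_id,region
A1,West
A2,East
"""


def read_lookup(csv_text, key, value):
    """Parse CSV text into a {key: value} lookup table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row[value] for row in reader}


def add_context(rows, lookup, key, new_field):
    """Return copies of rows with new_field filled from the lookup (None if absent)."""
    return [{**row, new_field: lookup.get(row[key])} for row in rows]


# Hypothetical rows standing in for a query result from a larger database.
revenue_rows = [
    {"account_id": "A1", "revenue": 200.0},
    {"account_id": "A2", "revenue": 50.0},
    {"account_id": "A3", "revenue": 75.0},
]

regions = read_lookup(REGION_CSV, "account_id", "region")
with_region = add_context(revenue_rows, regions, "account_id", "region")
```

Dropping the CSV later only means removing the lookup step; the database rows themselves are never modified.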

Diversity is an asset, not a problem

To use an analogy, yes, it would be easier if everyone in the world simply spoke one language (or all the data lived in one database). But just as the complexity, history and culture of each language add deeper meaning to the world, the differences and benefits of each data source can add insight to your organization.

Full Disclosure
Ross Perez works at Tableau Software, a maker of Business Intelligence and data visualization software. 
