There was a time when developing a data warehouse was sufficient to quench most business users' thirst for data, reporting, and analytics. Not anymore. Organizations have discovered that data can be a valuable business asset. It has taken some time, but they have finally realized they can do more with all the available data than produce simple reports. With the right data they can distinguish themselves from the competition, reduce costs by optimizing business processes, and create new business opportunities.
Data science, investigative analytics, self-service BI, embedded BI, and streaming analytics are just a few of the many new ways in which data can be used and exploited. To support these new forms of data usage, organizations are currently developing new systems, such as data lakes, data marketplaces, and data streaming systems. Unfortunately, most of these new systems are built as stand-alone solutions with almost no relationship to the existing data warehouse system.
In other words, organizations are developing multiple systems that all deliver data to business users, and developing all these data delivery systems independently has severe drawbacks.
It’s crucial that organizations somehow bring these data delivery systems together to create one all-encompassing architecture. This unified architecture is responsible for delivering data in any form to any business user.
This unified data delivery platform is probably not an extension of the well-known data warehouse system. It’s an architecture in which the data warehouse operates as one module within an umbrella architecture that deploys other technologies and systems, such as a streaming system and a data lake, to deliver data. This data delivery platform unifies the concepts of the data warehouse, data lake, data marketplace, streaming data, and any other data delivery system.
The foundation of this new data delivery platform must be abstraction. It must hide from business users how and where data is stored, how it is copied, which technologies are used, whether data is integrated on demand or in batch, and so on. At the same time, it must be transparent enough that business users can determine how source data has been manipulated. A data delivery platform must support a wide range of business users, from those requiring governable and auditable reports, to those demanding a highly agile marketplace, to data scientists who analyze raw data.
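As a minimal sketch of what such abstraction could look like in practice (all class and dataset names here are hypothetical, invented for illustration), the platform can be thought of as a routing layer: every backing system exposes the same interface, and consumers request a dataset by name without knowing whether it lives in the warehouse, the lake, or elsewhere.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch: each backing system implements one common
# interface, so consumers never see where or how data is stored.
class DataSource(ABC):
    @abstractmethod
    def fetch(self, dataset: str) -> list[dict]:
        ...

class WarehouseSource(DataSource):
    def fetch(self, dataset):
        # In a real system this would run SQL against the warehouse;
        # here it returns a stub row tagged with its origin.
        return [{"dataset": dataset, "origin": "warehouse"}]

class LakeSource(DataSource):
    def fetch(self, dataset):
        # In a real system this would read raw files from the lake.
        return [{"dataset": dataset, "origin": "lake"}]

class DataDeliveryPlatform:
    """Routes each request to whichever system holds the dataset,
    hiding storage location and technology from the consumer."""

    def __init__(self):
        self._routes: dict[str, DataSource] = {}

    def register(self, dataset: str, source: DataSource):
        self._routes[dataset] = source

    def fetch(self, dataset: str) -> list[dict]:
        # The consumer-facing call: dataset name in, rows out.
        return self._routes[dataset].fetch(dataset)

platform = DataDeliveryPlatform()
platform.register("sales", WarehouseSource())
platform.register("clickstream", LakeSource())

# Consumers ask for a dataset by name only; the platform decides
# which underlying system serves it.
rows = platform.fetch("clickstream")
```

Swapping a dataset from warehouse to lake then only changes the registration, not any consumer code, which is the essence of the abstraction argued for above.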
For the coming years, architecting an integrated data delivery platform will be the challenge for many organizations. If they fail to do so, their multitude of data delivery systems can grow into a labyrinth that won’t allow them to get the most out of their data assets. Not everyone’s data thirst will be quenched.
Blog written by Rick Van der Lans, originally published here.