A great restaurant, but not good for data
Exploring the Problem
Granted, we're talking here in high generalities as to why business managers get frustrated by IT. Assuming warehouses and BI front-ends already exist, and given that warehouse "projects that tried to use Inmon’s [warehouse design] approach have failed and none reported to succeed" and that "more than 50% of today’s data warehouse projects are anticipated to fail", quality warehouses are probably not the norm, and just having projects delivered within budget isn't the main sticking point.
The problems I see stem from infinitely complicated infrastructure: design gone astray. For instance, a simple request to track down data errors at a top Fortune 50 company I am aware of may literally involve:
- weeks and weeks of work, with conversations and emails at a project level, mostly because the whole process has to be reverse engineered from the BI front-end (managed by one offshore team),
- major work through coordinated research of ETL processes (managed by another team in multiple time zones),
- coordination with data stewards and SQL troubleshooters who must diagnose questionable business rules scattered in layer after layer of incremental macro ETL code
- ....
It goes on and on, while the business manager just wanted either a data correction or perhaps a slight logic change.
"Where did we go wrong?" "Why is it always so hard?"
In this kind of example there is no promised delivery date, and progress updates tend to be heavily laden with technical talk that sounds more like excuses than reasons.
The whole problem stems from an over-complicated warehouse with overly dispersed controls. But nobody can quite point that out because that would then be admitting that the whole system is a failure. And nobody wants to do that.
Thus frustrations are the norm, and managers tend to create their own mini-mart warehouses instead, while IT mounts even further layers of "governance" and "required approvals" for changes to try to keep the monster from mutating further.
It doesn't work, and the warehouse stagnates into minimal usability over time.
(Quotes above from Nagesh.com.)
Restructure your Data Warehouse Before It Turns to Mush
Road to Solution
To summarize better (ironic, huh?): the DW is the horse that pulls the BI cart. IT should keep the horse healthy, flexible and fast. Business should build their own carts and be responsible for maintaining and running them. If the DW is well designed and maintained, BI will be supported, and true "business intelligence" that adapts to quick changes can emerge.
A healthy data warehouse is the horse that pulls the BI cart
Exactly HOW you do that may involve major re-engineering and is material for a larger article.
But I'll give an outline here. It involves:
- Requalifying or creating a solid MDM infrastructure
- Allocating true human-by-name ownership for each major data-set - easily contactable
- FULLY CONFORMED or at least conformable dimensions AND data-sets
- Fact tables kept to the minimum in DW, expansions allowable outside DW in BI controlled data marts
- All DW fully accessible for all business use - no elitist control and all data visible
What this does is free up IT to do what it should: build a stable and usable DW without mixing in all the BI tables and downstream extracted data. Those should belong to business reporting teams with SQL and ETL skills of their own.
Essentially you have to separate the horse from the cart. And get the horse healthy. Then you can pull any cart you want loaded with WHAT you want.
Single Version of the Truth is a Myth - Concentrate on the Base Data
Holy Grail of single truth data
"Single version of the truth" has always been the holy grail of reporting. I think it's time we rethink this. Truth is, I've never seen it achieved, at least not for very long.
Truthfully, I think we can have single version DATA instead. That gets holed up and maintained in MDM-type single tables for every data set and dimension. Downstream fact tables and reporting are only "truth" in the context of the slices and dices used, adjusted by whatever business rules are applied. Different business sectors always prefer their "sliced with my business rules" version. It's when these various versions (that superficially appear the same) get compared that the executives cry out "give me one truth".
What this means is that executives have to work again to find out who they trust, and to determine exactly what data (with what slicing and business rules applied) they want. They can even let reporting compete to allow the cream to rise to the top, just like Google does with the Web. The web is a giant free-for-all mess of data sliced by logic versions. Yet nobody finds it difficult to get the data they want.
As an example, we can have MDM tables:
- Customers
- Business Units (divisions, sectors, etc.)
- Product Parts
- Warranty Types Sold
These are all distinct and finite at any given point in time. Now suppose we want Parts Dispatch Reporting BI. This will always be subject to the particular manager's preferred slice by warranty or customer type for his business unit. Nor is it a matter of a simple OLAP drillable data architecture, because multiple business rules get embedded BETWEEN the layers of the concrete MDM tables and the "sliced as preferred" BI reporting layer. It will always be so. Certain managers won't consider certain parts dispatched under certain warranties as legit. Other managers don't consider certain customers coupled with certain warranty types. Truth at this level lies in however the data is spun.
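To make the point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: the `Dispatch` rows, the warranty and customer labels, and the two manager rules are assumptions of mine, not data from any real warehouse. It shows the same base data giving different "truths" depending on which business rule gets applied between the MDM layer and the report.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Dispatch:
    """One parts-dispatch row joined to hypothetical MDM keys."""
    part: str
    customer_type: str
    warranty_type: str
    business_unit: str


# Hypothetical base data: the single, agreed-upon version of the DATA.
DISPATCHES = [
    Dispatch("fan", "consumer", "standard", "north"),
    Dispatch("fan", "enterprise", "extended", "north"),
    Dispatch("battery", "consumer", "goodwill", "north"),
    Dispatch("battery", "enterprise", "standard", "south"),
]

# A business rule decides whether a dispatch counts as legitimate.
Rule = Callable[[Dispatch], bool]


def count_dispatches(rows: Iterable[Dispatch], rule: Rule) -> int:
    """Count the rows that a given business rule accepts."""
    return sum(1 for row in rows if rule(row))


def manager_a_rule(d: Dispatch) -> bool:
    # Hypothetical: manager A does not count goodwill warranty dispatches.
    return d.warranty_type != "goodwill"


def manager_b_rule(d: Dispatch) -> bool:
    # Hypothetical: manager B reports only on the north unit and ignores
    # enterprise customers on extended warranties.
    in_unit = d.business_unit == "north"
    excluded = d.customer_type == "enterprise" and d.warranty_type == "extended"
    return in_unit and not excluded


if __name__ == "__main__":
    print("base rows:", len(DISPATCHES))
    print("manager A:", count_dispatches(DISPATCHES, manager_a_rule))
    print("manager B:", count_dispatches(DISPATCHES, manager_b_rule))
```

Both managers read the same four rows, yet their dispatch counts differ. Neither number is wrong; each is only true in the context of the rule behind it.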
The goal for the DW, then, is to guarantee reliable, concrete MDM-type data, and to allow business to spin it however they want, with whatever tools they want.
Data is the realm of IT. Truth is another matter - best left for "the Business" to decide.