When it comes to data governance, I am reminded of Mark Twain's phrase “Lies, damned lies, and statistics”. On one side, many business leaders are still exploring what to do and how to do data governance: which data to consider and which tools are available. On the other side, complex regulatory requirements such as HIPAA, SOX and Basel II hang over them like a sword.
Data governance (DG), especially in Big Data, is occasionally perceived as a lie by a few and a damned lie by a few others. BUT when done properly, it not only solves the governance problem but also improves data quality. The simple goal of DG is to govern how data can be accessed and used via business initiatives, as well as how it is defined and managed via the data management infrastructure.
So what have we built?
We have built a multi-tenant Healthcare Analytics Platform on the Hortonworks Big Data Stack. The platform receives messages from multiple devices across multiple tenants (in this case, the hospitals). Typical messages come from devices attached to patients in high- or low-acuity areas, from ventilators, and from laboratories, along with ADT (admit, discharge, transfer) messages. Our flagship product ‘LogiCrunch’ processes these messages, predicts the patient’s condition, and publishes it in real time to the clinicians of the respective tenant.
Listed below are the key governance-based activities, organized by phase:
Measure and Monitor:
Hortonworks Big Data Governance Stack
OpenLDAP (trust established between tenants and the platform) for authentication, integrated with Knox
PostgreSQL (Authorization at the Web layer)
Elasticsearch (Web access logs)
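To make the split between authentication, web-layer authorization, and access logging more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the platform's actual code: the tenant names, roles, permission strings, and the functions is_authorized and access_log_entry are all hypothetical, the in-memory GRANTS dictionary merely stands in for authorization tables that would live in PostgreSQL, and the returned dictionary stands in for a document that would be indexed as a web access log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical grants, standing in for rows in the authorization tables.
# Key: (tenant, role) -> set of permissions allowed for that role.
GRANTS = {
    ("hospital_a", "clinician"): {"patient_status:read"},
    ("hospital_a", "admin"): {"patient_status:read", "users:manage"},
    ("hospital_b", "clinician"): {"patient_status:read"},
}


@dataclass(frozen=True)
class User:
    """A user who has already been authenticated (assumed to happen upstream)."""
    username: str
    tenant: str
    role: str


def is_authorized(user: User, resource_tenant: str, permission: str) -> bool:
    """Allow access only within the user's own tenant and granted permissions."""
    if user.tenant != resource_tenant:
        # Tenant isolation: never serve one hospital's data to another.
        return False
    return permission in GRANTS.get((user.tenant, user.role), set())


def access_log_entry(user: User, resource_tenant: str,
                     permission: str, allowed: bool) -> dict:
    """Build the audit record that would be shipped to the access-log store."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user.username,
        "user_tenant": user.tenant,
        "resource_tenant": resource_tenant,
        "permission": permission,
        "allowed": allowed,
    }


if __name__ == "__main__":
    clinician = User("dr_lee", "hospital_a", "clinician")
    for tenant in ("hospital_a", "hospital_b"):
        allowed = is_authorized(clinician, tenant, "patient_status:read")
        print(access_log_entry(clinician, tenant, "patient_status:read", allowed))
```

Logging denied requests as well as allowed ones is a deliberate choice in this sketch: for measuring and monitoring, the attempts that were refused are often the more interesting ones.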