The Forrester report “Predictions 2018: A year of reckoning” predicted that 80% of firms affected by the GDPR will not be able to comply with the regulation by the time it comes into force on May 25. Of those noncompliant firms, 50% will intentionally not comply.
Compliance does not have to be difficult. Imagine if it were possible to facilitate GDPR compliance with a mature technology that facilitates the significant reduction in costs associated with traditional replication methods of data integration. This is possible! Data virtualization is a mature, cost-effective technology that enables the three pillars of GDPR compliance.
GDPR compliance requires the right combination of people, processes, and technology. From a technology perspective, no single tool can implement compliance; Companies need multiple technologies, such as data governance, data quality, and data integration solutions. Data virtualization can play an important role in this compliance ecosystem, supporting the three most important pillars of GDPR compliance:
1. A Single View of Data Entities — The ability to create a complete view of a data subject in an agile manner, which enables companies to present the most accurate data about an individual
2. A Self-Service Data Catalog — enabling companies to find the data assets that are in scope for GDPR compliance, and then categorize, annotate, and document them
3. Privacy by Design — ensuring that privacy is considered during the design process so that data delivery and access mechanisms are compliant from the outset, rather than added in retrospect.
Companies may have existing master data management (MDM) systems that consolidate and master entities, but these lack the context of the transactional information that may exist in other systems. Data virtualization is ideally suited to combine the sources of master data with transactional information distributed around the enterprise to provide a full 360-degree view of up-to-date customer information.
Data virtualization greatly assists with providing single views of entities like customers and is typically used in MDM initiatives in three ways:
Often, self-service initiatives designed to provide easier access to data for consumers fail to deliver. Exposing more data sources to consumers ends up in more requests to IT, more complexity and backlogs, and more data replication, resulting in increased governance and compliance issues.
Data virtualization provides a solution to this: exposing curated information in a business-friendly form that is easy to access. However, unlike traditional approaches, which physically consolidate and transform the data, approaches that are also slow and expensive to maintain, data virtualization provides a more agile means of virtually consolidating the data.
With data virtualization, data sources can be quickly exposed through a curated logical layer via a canonical model that is more easily consumed and understood by the business. Business analysts and developers are then able to build their own views of the data based on this model to suit their needs, with less reliance on IT.
To support this, data virtualization platforms provide a data catalog that exposes the metadata of the curated views defined in the virtual layer. This allows users to discover the data that is available in the virtualization layer through a web browser, promoting re-use of the data and self-service for business users.
From a compliance perspective, users can use the catalog to categorize and annotate information published via the views provided in virtual layer, by adding business terms and descriptions, business categories, and tags. These can be searched, together with view metadata and lineage information, enabling companies to understand where their sensitive information resides and how it is published to consumers. Combined with the operational audit information provided by data virtualization platforms, which identify which views were created or are being queried and by whom in the virtual layer, this greatly facilitates compliance and data governance processes.
Data catalogs should be tightly linked with the data delivery infrastructure so that data consumers can easily access the most up-to-date data once they have discovered it. With a data virtualization solution, the data catalog provides a direct view on the data delivery layer itself (i.e. the data virtualization layer), so once stakeholders have located relevant data objects, they can immediately inspect the data without switching to a different tool.
The GDPR requires companies to ensure that regulatory compliance is included in the design of any data processing initiative.
Businesses need access to complete, pre-integrated information. The challenge is that data is becoming more and more distributed across the enterprise in different formats and technologies. Companies resort to physical replication and point-to point interfaces to provide the integrated information, but this creates unnecessary duplicates.
With all this distribution and data duplication, compliance with GDPR principles such as ensuring the data is processed in a manner that ensures appropriate data security, or that the data is kept in a form that enables the subject to be identifiable no longer than is necessary, or that the data you are viewing is accurate and up-to-date, can be difficult.
In addition, companies face the added technical challenges of integrating highly distributed data across on-premises and cloud sources, proving that data access has been audited effectively, providing access to the most up-to-date data, anonymizing the data, and providing consistent data security, not to mention the challenge of managing multiple data copies.
Data virtualization simplifies this integration landscape and enables data sources to be brought quickly under control by publishing them through a virtual abstraction layer. Data virtualization offers a single logical point of access, avoiding point-to-point connections from consuming applications to data sources. As a single point of access for applications, it is the ideal place to enforce access and security restrictions that can be defined in terms of the published logical model, with a fine level of granularity.
Data virtualization platforms include user and role-based authentication and authorization mechanisms that can limit access at a schema level and implement data-specific permissions. User access can, for example, be easily limited to specific data views or specific rows or columns in a virtual view. Data can even be selectively masked for specific attributes.
To comply with the GDPR, companies need to be able to ensure that data is used in a lawful way. So, not only is it required to restrict access to appropriate consumers, but companies also need to keep track of who is accessing what information. Enabling data access through a virtual layer enables organizations to audit who is using their data, from a single point of control. The Denodo Platform, for example, provides an audit trail of all the queries and other actions performed on the virtual layer. At any time, companies can check who is accessing the data, what queries they executed, and what changes they made to the virtual model.
In summary, using an abstraction layer provided by data virtualization technology facilitates GDPR compliance. Data can be quickly brought under control and consumed through a single virtual access layer to facilitate GDPR initiatives, helping to bring context to master data, ensuring that data is easily discoverable for new initiatives and governance processes, all while reducing the governance headaches associated with data replication.
This blog was originally published here.