Data Science and EU Privacy Regulations: A Storm on the Horizon - Part 2

In Part 1, we introduced a pending EU privacy and data protection regulation (the GDPR) that will carry fines for violations of up to 5% of global annual turnover (1 million Euros for smaller companies). We discussed how this regulation will present particular challenges for the collection, storage, and use of data within EU and global organizations. The impact will be felt by data scientists in particular, but also across the IT organization. In this post, we focus on the impact on data science and analytic applications and suggest steps to take in the near future to prevent fines and/or crippling data blackouts.


Direct Impact on Data Scientists

The GDPR emphasizes the individual’s rights to understand and control how their data are used. The impact of the GDPR for data scientists includes:

1. Ability to collect data. The principles of Privacy by Design/Privacy by Default will increasingly be written into law. These principles minimize the baseline level of data collected through systems and processes (think, for example, of browser default settings). Individuals will need to give express consent for what data are collected and will need to be informed as to why the data are being collected.

2. Ability to use data. It will become necessary to get express consent for each application of personal data. (Details here are still under debate, and there will likely be certain exceptions.) This could severely impact the ability of data scientists to find new applications for existing data, as those applications will not have been listed in the original consent forms. It is important to note that there will likely be a grandfathering of current consent. Thus, it is extremely important to ensure that proper consent is in place now.

3. Ability to transfer data to and from third parties. Stiff regulatory fines will certainly produce an environment where corporations are very reluctant to buy, sell or share data that may be personal. In addition, right to privacy/erasure regulation may have strong implications for data sharing (details are still under discussion in the EU parliament). As a result, expect a drying up of certain data sources.

4. Customer profiling will be specifically affected by the new regulations. In particular, customers must be informed when and how data will be used to profile them with material impact (e.g. credit scoring, fraud detection, etc.). In addition, they must have the right to opt out of automatic profiling algorithms (which will introduce additional bias that must be dealt with in the model calibration). Finally, and significantly, companies can be held in violation if their profiling algorithms are not sufficiently robust.

5. Requirements in storing data. There are some significant issues here.

  • Individuals will be guaranteed the right to be forgotten/right to erasure. Thus, companies will need to know the location of all copies and destinations of any data that may be tied to an individual.
  • The GDPR will require not only compliance but also accountability, meaning that corporations must be prepared to demonstrate to the supervisors that they are compliant. This will require extensive preparation, undoubtedly including an extensive up-front data audit, documenting the location, type and accessibility of the three types of customer data (volunteered, observed, and inferred by data science techniques).
  • Data scientists will need to be aware of the implications of passing personal data through service providers, including cloud storage, cloud-based BI and analytics tools, and web services.

6. Much heavier emphasis on privacy in your company. A few factors will be at play here.

  • The June draft proposed a fine of up to 2% of global annual turnover for violations of the regulation. This is already massive. The EU parliament subsequently proposed that this fine be increased to 5%. We are waiting to see the final figure, but, regardless, we can be sure that companies must and will make compliance with the GDPR a top priority.
  • Larger companies in the EU will need to appoint a Data Protection Officer.  Expect to get to know this person quite well over the coming years.
  • As mentioned above, the GDPR will impose accountability, not just compliance. This means that substantial effort will need to go into producing documentation to present to the supervisor on demand, demonstrating that your company is in full control of personal data and is in compliance with the GDPR. 
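
Points 2 and 5 above lend themselves to a small illustration. The Python sketch below is a minimal, in-memory example and not a compliance tool: it records consent per purpose and tracks which systems hold copies of an individual's data, so that an erasure request can be routed to each of them. The class name PersonalDataRegistry, its methods, and the subject, purpose and system names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PersonalDataRegistry:
    """Hypothetical in-memory registry; a real system would persist this."""

    # subject_id -> purposes the individual has expressly consented to
    consents: Dict[str, Set[str]] = field(default_factory=dict)
    # subject_id -> systems known to hold a copy of the individual's data
    locations: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, subject_id: str, purpose: str) -> None:
        self.consents.setdefault(subject_id, set()).add(purpose)

    def may_use(self, subject_id: str, purpose: str) -> bool:
        # Consent is checked per purpose, not per data set.
        return purpose in self.consents.get(subject_id, set())

    def record_copy(self, subject_id: str, system: str) -> None:
        self.locations.setdefault(subject_id, set()).add(system)

    def erase(self, subject_id: str) -> List[str]:
        """Forget the subject; return the systems that must delete their copies."""
        self.consents.pop(subject_id, None)
        return sorted(self.locations.pop(subject_id, set()))


registry = PersonalDataRegistry()
registry.grant("user-42", "order_fulfilment")
registry.record_copy("user-42", "crm")
registry.record_copy("user-42", "web_logs")

print(registry.may_use("user-42", "order_fulfilment"))  # True
print(registry.may_use("user-42", "churn_model"))       # False: not in the consent
print(registry.erase("user-42"))                        # ['crm', 'web_logs']
```

The point of the design is that a new analytic application (here, the hypothetical "churn_model") is refused by default until consent for that specific purpose has been recorded.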


Start Preparing Now

The GDPR is so significant that corporations are already beginning to prepare for its implementation. Compliance involves steps that cannot be taken overnight, and the accountability clause will require a documented awareness of data assets and systems, most likely including some type of data audit and risk assessment.

I recommend beginning now with the following steps:

1. Audit your entire data ecosystem now, and determine how it may expose you to privacy violations. Start with the structured data in your BI systems. Look at the dark data in your operational systems. Look at your Big Data, including the web log data and any sensor data. Document what is there, where it is replicated, who has access, and what controls are in place. Document what data are personal and what may be made personal through various data science techniques. You’ll most likely need to do this audit within the next year or two anyway, so it’s best to do it now and start introducing the necessary changes to product roadmaps.

2. Ensure that user consent is properly implemented before the GDPR takes effect. This is so important because the current draft of the GDPR allows user consent to be grandfathered in. Your ability to use any data that you have collected may be severely limited under the GDPR if you do not have proper user consent.

3. Ensure that all product roadmaps comply with the principles of Privacy by Design. If you aren’t already familiar with the concepts of privacy by design/privacy by default, then become familiar. Communicate with product owners so that products developed in the future maintain full functionality while still complying with the restrictions on data collection required by the GDPR. Design these products so that business-critical data can be collected in a way that honors privacy laws while still enabling the business to be data driven to the fullest possible extent.

4. Initiate dialogue with your corporate privacy officer or an external expert. The stakes have become quite high, and the subject matter is complex. There will need to be strong two-way communication between legal and technical experts, and that communication should start very soon.
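
The audit in step 1 could start as nothing more than a structured inventory. The Python sketch below is one possible shape for it, under stated assumptions: the AuditEntry fields mirror the items the step asks you to document (what is there, where it is replicated, who has access, what controls are in place, and whether the data are or could become personal), while the field names, the needs_review rule and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AuditEntry:
    dataset: str
    system: str
    category: str  # "volunteered", "observed", or "inferred"
    is_personal: bool
    # True if the data could be tied to an individual via data science techniques
    can_become_personal: bool = False
    replicated_to: List[str] = field(default_factory=list)
    accessible_by: List[str] = field(default_factory=list)
    controls: List[str] = field(default_factory=list)


def needs_review(entries: List[AuditEntry]) -> List[AuditEntry]:
    """Illustrative rule: (potentially) personal data with no documented controls."""
    return [
        e for e in entries
        if (e.is_personal or e.can_become_personal) and not e.controls
    ]


# Hypothetical sample inventory.
inventory = [
    AuditEntry("customer_profiles", "crm", "volunteered", True,
               accessible_by=["sales"], controls=["role-based access"]),
    AuditEntry("web_logs", "hadoop", "observed", False,
               can_become_personal=True, replicated_to=["analytics_sandbox"]),
    AuditEntry("churn_scores", "bi_warehouse", "inferred", True),
    AuditEntry("sensor_readings", "iot_store", "observed", False),
]

print([e.dataset for e in needs_review(inventory)])  # ['web_logs', 'churn_scores']
```

Even a list this simple gives legal and technical experts a shared artifact to discuss, and it can later be extended into the documentation the accountability requirement calls for.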
