
Why Protecting Data Privacy Matters, and When

(A Wake-Up Call to Data Geeks Who Doubt)

by Anne Russell

Anne is the Managing Partner and Founder of World Data Insights, a data consulting company helping customers transform data into information that matters. Anne has spent the last decade immersed in multiple aspects of government-funded research and implementation of data driven approaches. To learn more about Anne, click here. To get in touch with Anne, contact her at [email protected]

It’s official. Public concerns over the privacy of data used in digital approaches have come to a head. Worried about the safety of digital networks, consumers want to regain control over what they increasingly sense as a loss of power over how their data is used. It’s not hard to see why. Look at the extent of coverage of the U.S. Government data breach last month and the sheer growth in the number of attacks against government and industry. Then there is the increasing coverage of the security flaws inherent in the internet, through which most of our data flows. The costs of data breaches to individuals, industries, and government are adding up. And users are taking note.

Regulation is Coming

In response, policy makers are actively meeting with industry lobbyists and privacy advocates to develop data governance policies and regulations that align existing information privacy and security laws with the new reality of data-driven technologies. As negotiations continue, we will undoubtedly hear more about points of contention and debate from the differing sides. Industry lobbyists will defend the rights of developers to innovate and the need for free, unregulated space in which to do so. And privacy advocates will defend the rights of users to be legally protected from the unauthorized and unwanted incursions that unregulated approaches can make into a user’s personal, private space.

At this juncture, whether one or the other of these sides is “right” is irrelevant. What matters is that consumers are not happy and policy makers are listening. While existing laws may not (yet) specifically address every technology-based, data-driven approach currently in existence, the “Wild West” atmosphere in which the majority of data approaches have been developed to date is ending. Data-driven approaches will be regulated, likely sooner rather than later.

In this new reality, developers no longer have the luxury of treating data privacy or security as secondary issues to worry about tomorrow, after they’ve worked out their core technology and approach. Now, they must consider whether their plans include data that can trigger privacy or security concerns and modify their approach accordingly. And they must prioritize privacy and build in measures to secure data at every stage of their process. This is true whether they are building approaches for themselves, for research purposes, or in the hopes of becoming the next entrepreneurial unicorn.

Where Data Privacy and Security Matter

If you’re not sure whether the data fueling your approach will raise privacy and security flags, consider the following. When it comes to data privacy and security, not all data is going to be of equal concern. Much depends on the level of detail in data content, data type, data structure, volume, and velocity, and indeed how the data itself will be used and released.

First there is the data for which security and privacy have always mattered and for which there is already a well-established body of law in place. Foremost among these is classified or national security data, where data usage is highly regulated and enforced. Other data for which there exists a considerable body of international and national law regulating usage includes:

  • Proprietary Data – specifically the data that makes up the intellectual capital of individual businesses and gives them their competitive economic advantage over others, including data protected under copyright, patent, or trade secret laws and the sensitive, protected data that companies collect on behalf of their customers;
  • Infrastructure Data - data from the physical facilities and systems – such as roads, electrical systems, communications services, etc. – that enable local, regional, national, and international economic activity; and
  • Controlled Technical Data - technical, biological, chemical, and military-related data and research that could be considered of national interest and be under foreign export restrictions.


It may be possible to work with publicly released, anonymized, and cleansed data within these areas without a problem, but the majority of granular data from which significant insight can be gleaned is protected. In most instances, scientists, researchers, and other authorized developers take years to acquire the expertise, build the personal relationships, and construct the technical, procedural, and legal infrastructure needed to work with granular data before implementing any approach. Even using publicly released datasets within these areas can be restricted, requiring registration, recognition by or affiliation with an appropriate data-governing body, background checks, or all three before authorization is granted.

The second group of data that raises privacy and security concerns is personal data. Commonly referred to as Personally Identifiable Information (PII), it is any data that distinguishes individuals from each other. It is also the data that an increasing number of digital approaches rely on, and the data whose use tends to raise the most public ire. Personal data could include but is not limited to an individual’s:

  • Government issued record data (social security numbers, national or state identity numbers, passport records, vehicle data, voting records, etc.);
  • Law enforcement data (criminal records, legal proceedings, etc.);
  • Personal financial, employment, medical, and education data;
  • Communication records (phone numbers, texts data, message records, content of conversations, time and location, etc.);
  • Travel data (when and where traveling, carriers used, etc.);
  • Networks and memberships (family, friends, interests, group affiliations, etc.);
  • Location data (where a person is and when);
  • Basic contact information (name, address, e-mail, telephone, fax, twitter handles, etc.);
  • Internet data (search histories, website visits, click rates, likes, site forwards, comments, etc.);
  • Media data (which shows you’re watching, music you’re listening to, books or magazines you’re reading, etc.);
  • Transaction data (what you’re buying or selling, who you’re doing business with, where, etc.); and
  • Bio and activity data (from personal mobile and wearable devices).
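As a rough illustration (not from the article), a developer auditing a dataset against categories like those above might first flag candidate PII fields by name before deciding how to handle them. The column names and keyword list here are hypothetical, and name-matching alone is only a first pass, not a substitute for a real data audit:

```python
# Hypothetical sketch: flag columns whose names suggest PII,
# loosely based on the categories of personal data listed above.
# The keyword list is illustrative, not exhaustive.

PII_KEYWORDS = {
    "name", "address", "email", "phone", "ssn", "passport",
    "location", "dob", "account",
}

def flag_pii_fields(columns):
    """Return the subset of column names that look like PII."""
    flagged = []
    for col in columns:
        lowered = col.lower()
        if any(keyword in lowered for keyword in PII_KEYWORDS):
            flagged.append(col)
    return flagged

columns = ["user_name", "purchase_total", "email_address", "home_location"]
print(flag_pii_fields(columns))  # candidate PII columns
```

A check like this only catches obviously named fields; free-text columns and quasi-identifiers (e.g. a combination of zip code, birth date, and gender) need deeper review.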


In industries where handling highly detailed personal data is the established business norm – such as the education, medical, and financial fields – there are already government regulations, business practices, and data privacy and security laws that protect data from unauthorized usage, including across new digital platforms. But in many other industries, particularly data-driven industries where personal data has been treated as proprietary data and become the foundation of business models, there is currently little to no regulation.

Privacy protection was already an issue before the digitization of data, when the largely manual process of collecting data through industry databases, surveys, phone books, and public records resulted in junk mail, telemarketing, and annoying robocalls. But now that businesses can automatically collect passive and active personal data via consumer behaviors and multiple physical sensors, consumers have become significantly more alarmed at how customization in services and products can lead to the loss of personal privacy. Take, for example, the growth in voices raised over the use of data from wearables or from sensors connected to the “Internet of Things”. In the new normal, the more a data approach depends on data actively or passively collected on individuals, the more likely consumers are to speak up and demand privacy protection, even if they previously gave some form of tacit approval to use their data.

Despite this new landscape, there are many ways to use personal data, some of which may not trigger significant privacy or security concerns. This is particularly true in cases where individuals willingly provide their data or where data cannot be attributed to an individual. Whether individuals remain neutral toward data approaches tends to be related to the level of control they feel they have over how their personal data is used. Some organizations that collect personal data extensively, such as Facebook and Google, increasingly work to provide their users with means to control their own data. But for others, the lack of due diligence on data privacy in their approaches has already had its effect.

A third category of data needing privacy consideration is data related to good people working in difficult or dangerous places. Activists, journalists, politicians, whistle-blowers, business owners, and others working in contentious areas and conflict zones need secure means to communicate and share data without fear of retribution and personal harm. That there are parts of the world where individuals can be in mortal danger for speaking out is one of the reasons that Tor (The Onion Router) has received substantial funding from multiple government and philanthropic groups, even at the high risk of enabling anonymized criminal behavior. Indeed, in the absence of alternate secure networks on which to pass data, many would be in grave danger, including the organizers of the Arab Spring in 2010 as well as dissidents in Syria and elsewhere.

Yet it is easy for well-meaning, idealistic developers to implement worldwide services that do not take data privacy and security into account, simply because there was no perceived need for them and no laws, regulations, or procedures to guide the process. Take, for instance, the experiences of the teams of volunteers seeking to build on successful efforts to crowdsource and geo-locate humanitarian crises around the globe. Crowdmapping initiatives have been effective in humanitarian contexts, like relief efforts in Nepal, where everyone is willing to share data for the greater good. But when it comes to geo-locating data on conflict events, alarm over the veracity of the data and the safety of those who provide it rises sharply.

The Best Defense is a Good Offense

Bottom line: if you are at all uneasy about internet insecurity, attacks from criminals, customer security and privacy concerns, the steady rise of regulation, staying within the law, or protecting the welfare of others, you need to be proactive about data privacy and security in your approach. The best plan is to address these issues from the beginning. But even if your approach is already in development, consider taking the following steps to minimize risk to the data that fuels it:

  • Do your research and find out whether others who have implemented similar data driven approaches have encountered any regulatory or security issues;
  • Check with subject matter experts to understand whether there are standard operating procedures for safeguarding similar data in non-technical business processes;
  • Consult with legal and policy experts to check the correlating laws, policies, and regulations that apply at the local, national, or international levels where your approach will be implemented;
  • Consult with data experts who understand the different technical and business aspects of data and who can help you design and implement appropriate data privacy and security policies;
  • Engage with your user base to assure that there is clarity, transparency and appropriate authorization for the way you collect and use their data, protect it from misuse, and release outputs for public consumption;
  • Develop data governance policies and enforce procedures that clarify how data should be stored, maintained and kept secure from ingest to output;
  • Implement and maintain data security measures to assure that any attacks against your system can be rapidly identified and appropriately managed; and
  • Clean your data of all PII and confidential data before it’s released. And then clean it again.
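The last step above can be sketched in code. The example below is a minimal, hypothetical illustration of one common technique – replacing direct identifiers with salted hashes before release; the field names and salt handling are assumptions, not the author's method:

```python
import hashlib

# Hypothetical sketch of the final step above: pseudonymize direct
# identifiers before release by replacing them with salted hashes.
# The identifier list and salt handling are illustrative only;
# in practice the salt must be kept secret and managed carefully.

SALT = "replace-with-a-secret-salt"
DIRECT_IDENTIFIERS = {"name", "email", "phone"}

def pseudonymize(record):
    """Return a copy of record with direct identifiers hashed."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            digest = hashlib.sha256((SALT + str(value)).encode()).hexdigest()
            cleaned[field] = digest[:12]  # truncated pseudonym
        else:
            cleaned[field] = value
    return cleaned

record = {"name": "Jane Doe", "email": "jane@example.com", "purchase": 42.50}
print(pseudonymize(record))
```

Note that salted hashing is pseudonymization, not full anonymization: the remaining quasi-identifiers (location, timestamps, transaction patterns) can still re-identify individuals, which is why the article's advice is to clean the data, "and then clean it again."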


Taking these steps will not guarantee that you are safe from either breaches or criticism. However, being proactive about privacy will help ensure you are recognized for implementing responsible data management practices that are clear and transparent, and for doing your utmost to protect the data you govern.
