A cartoon making its way around social media asks the provocative question “Who wants clean data?” (Everyone raises their hands) and then asks, “Who wants to CLEAN the data?” (Nobody raises their hands). I took the cartoon one step further (apologies for my artistic skills) and asked, “Who wants to PAY for clean data?”, which sends everyone running for the exits (Figure 1).
Figure 1: Today’s Data Management Reality
Why does everyone run for the exits when asked to pay for data quality, data governance, and data management? Because we do a poor job of connecting high-quality, complete, enriched, granular, low-latency data to the sources of business and operational value creation.
Data is considered the world’s most valuable resource, and it is delivering compelling financial results to organizations focused on exploiting the economics of data and analytics (Figure 2).
Figure 2: Industry-leading Data Monetization Organizations
Yet, most business executives are still reluctant to embrace the fundamental necessity of Data Management and fund it accordingly. If data is the catalyst for the economic growth of the 21st century, then it’s time we reframe how we view data management. It’s time to talk about Data Management 2.0.
Data Management 2.0
The Data Management Association (DAMA) has long been the data management champion. DAMA defines data management as “the planning, oversight, and control over the management and use of data and data-related sources”. DAMA has been instrumental in driving the development of data management procedures, practices, policies, and architecture (Figure 3).
Figure 3: DAMA Data Management Framework visualized by Denise Harders
The DAMA Data Management Framework is great for organizations seeking to understand how to manage their data. However, if data is “the world’s most valuable resource”, then we must re-invent data management into a business strategy. We must help organizations understand how best to monetize or derive value from the application of data to their business (Figure 4).
Figure 4: Transforming Data Management
Before exploring the Laws of Data Management 2.0, let me define “Data Monetization”:
Data Monetization is the application of data to the business to drive quantifiable financial value.
While some organizations can sell their data, for the majority of organizations, data monetization (or insights monetization) is about applying data to the organization’s top use cases to drive quantifiable financial value. Or as Doug Laney, author of the seminal book “Infonomics: How to Monetize, Manage, and Measure Information as an …”, stated:
“If you are not quantifying the financial value that your organization derives from the use of data, then you are not doing data monetization”
Laws of Data Management 2.0
Law #1: Data is of no value in and of itself
Data possesses potential value but, in and of itself, provides zero realized value. As I discussed in “Introducing the 4 Stages of Data Monetization”, data in Stage 1 is a cost to be minimized. Data in Stage 1 is burdened with the increasing costs associated with the storage, management, protection, and governance of the data, as well as potential regulatory and compliance costs, liabilities, and fines associated with not properly managing or protecting one’s data (Figure 5).
Figure 5: 4 Stages of Data Monetization
Data Management 2.0 provides a more holistic methodology that doesn’t just stop at managing data but enables the application of data to the organization’s most important use cases to drive quantifiable financial value.
Law #2: Not all data is of equal value
Many data management organizations waste precious resources (and business stakeholder street cred) by treating all the data the same way. Fact: some data is more important than other data in helping to predict and optimize customer engagement, product performance, and business operations.
To determine which data elements are most important, Data Scientists can apply analytic techniques like Principal Component Analysis (PCA) and Random Forest to quantify the importance of a particular data element (or feature, something that I’ll discuss in my next blog) in optimizing the organization’s key use cases such as customer attrition, predictive product maintenance, unplanned operational downtime, improved healthcare results, or surviving the sinking of the Titanic (Figure 6).
Figure 6: Factors Predicting Titanic Sinking Survival
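To make the idea of quantifying data element importance a bit more concrete, here is a minimal sketch of the Random Forest approach using scikit-learn. Everything in it is an assumption for illustration: the data is synthetic (it is NOT the real Titanic dataset), the column names `is_female`, `passenger_class`, and `ticket_digit` are made up, the rule that generates the outcome is hypothetical, and `rank_features` is a helper name I invented for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rank_features(X, y, feature_names, seed=0):
    """Fit a random forest and return (name, importance) pairs, highest first."""
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X, y)
    # feature_importances_ holds impurity-based importances, normalized to sum to 1.0
    pairs = zip(feature_names, model.feature_importances_)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Synthetic, made-up data for illustration only (not the real Titanic dataset).
rng = np.random.default_rng(0)
n = 1000
is_female = rng.integers(0, 2, n)        # hypothetical 0/1 flag
passenger_class = rng.integers(1, 4, n)  # hypothetical class 1, 2, or 3
ticket_digit = rng.integers(0, 10, n)    # pure noise, unrelated to the outcome

# Hypothetical outcome rule: driven mostly by is_female, somewhat by class.
score = 2.0 * is_female - 0.5 * passenger_class + rng.normal(0, 0.5, n)
survived = (score > np.median(score)).astype(int)

X = np.column_stack([is_female, passenger_class, ticket_digit])
names = ["is_female", "passenger_class", "ticket_digit"]

ranking = rank_features(X, survived, names)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

One caution on reading the output: impurity-based importances can overstate features with many distinct values, so a noise column can still pick up a non-zero score. Permutation importance (`sklearn.inspection.permutation_importance`) on held-out data is a common cross-check before deciding which data elements deserve the most data management investment.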
Data Management 2.0 operationalizes business stakeholder collaboration to identify, validate, value, and prioritize the use cases that deliver organizational value, and to identify and triage the KPIs and metrics against which value delivery will be measured.
Law #3: One cannot ascertain the value of one’s data in isolation from the business
To identify which data variables are most important to the business, data management must start by understanding how the organization creates and measures value. This conversation starts with an organization’s business and operational intent; that is, what is the organization trying to accomplish from a business and operational perspective over the next 12 to 18 months, and what are the measures or KPIs against which progress and success will be measured?
Data Management 2.0 reframes how organizations approach the application of data to the business by first understanding how organizations create value (and where and how data can help create value) instead of starting with data (and hoping that data finds its way to value). For more on how to do that, check out my book “The Art of Thinking Like a Data Scientist”, which provides an 8-step, collaborative process for engaging the business stakeholders in identifying, validating, valuing, and prioritizing the organization’s most important business and operational use cases (Figure 7).
Figure 7: The Art of Thinking Like a Data Scientist
Law #4: Turning everyone into Data Engineers is not practical and not scalable
Finally, asking business stakeholders to manage their own data sources is impractical and dangerous. It opens the door to random, orphaned data management processes that may address tactical data and analytics needs, but at the expense of the strategic, economic value of data and analytics.
Data Management 2.0 empowers the entire organization with the capabilities for building, sharing, and refining the organization’s data and analytics assets, which enables organizations to unleash the business or economic value of their data.
Reframing the Data Management Conversation Summary
If we believe that data is the new oil, that data will be the catalyst for economic growth in the 21st century, then we need to invest less time and money in trying to manage data and dramatically more in monetizing it. That will require organizations to expand their data management capabilities to support the sharing, re-use, and continuous refinement of the data and analytics assets to derive and drive new sources of customer, product, and operational value.
Damn it feels good to be a data gangsta!