
The Data Supply Chain and Master Data Management

The recent TDWI keynote by Evan Levy focused on traditional versus new strategies for managing data. The traditional strategy is based on an organization creating, storing, analyzing and distributing data internally. Most modern (built within the last 10 years) data warehouse / business intelligence platforms are designed around this model and can manage and track where data is created and consumed.


The new strategy for managing data includes both internal and external data; structured and unstructured data; external analytical applications; external data providers; and both internal and external data scientists. As a result, an organization needs fast and easy access to large volumes of data and must understand how that data moves, transforms and migrates, both inside and outside the organization. Modern data warehouse / business intelligence platforms are unable to do this job.

The past 10 to 15 years have seen a shift from custom-built to packaged applications to automate knowledge and business processes. The design flaw is that custom code and middleware are required to move all this data between packaged systems. The effort, money and time spent on data migration solutions, in addition to the human capital needed to clean the data, is huge and wasteful. Current ETL tools are primitive; while they save time and reduce custom coding, they are not a long-term solution. Moreover, this design will not work with the new volume, variety and velocity of large internal and external data sets.

Levy offers a partial solution: the data supply chain. The data supply chain concept, pioneered by Walmart years ago, seeks to broaden the traditional corporate information life cycle to include the numerous data sourcing, provisioning and logistical activities required to manage data. Walmart understood the design flaw in having a separate custom distribution system; the solution was a standard distribution system where standardization occurs at the source.

Simply put, the data supply chain is all about standardization of data: focus on designing and building one standardized data supply chain instead of custom distribution systems for each business application, and eliminate middleware, ETL and the massive amounts of custom code written to standardize, clean and integrate data. A small sketch of what that looks like in practice follows.
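To make the idea concrete, here is a minimal Python sketch of standardization at the source: each producing application maps its records into one agreed canonical schema before publishing them, so downstream consumers never need application-specific ETL. The schema, field names and the web-store adapter below are hypothetical illustrations, not a prescribed design.

from dataclasses import dataclass
from datetime import date

@dataclass
class CanonicalCustomer:
    # the one schema every application publishes to the data supply chain
    customer_id: str
    full_name: str
    country_code: str   # a single agreed standard, e.g. ISO 3166-1 alpha-2
    created_on: date

def publish_from_web_store(raw: dict) -> CanonicalCustomer:
    # each source system owns one small adapter like this, instead of
    # middleware translating between every pair of applications
    return CanonicalCustomer(
        customer_id=str(raw["id"]),
        full_name=raw["name"].strip().title(),
        country_code=raw["country"].strip().upper(),
        created_on=date.fromisoformat(raw["signup_date"]),
    )

The point of the sketch is the direction of responsibility: the source system does the standardizing once, rather than every consumer cleaning the same data again.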

Yet standardizing data at the source is only part of the solution. The other part is Master Data Management (MDM).


MDM standardizes data, enabling better data governance to capture and enforce clean, reliable data for optimal data science and business analytics. Standardized values and definitions allow a uniform understanding of data stored across an organization's data warehouses, so users can find and access the data they need easily and quickly. A toy example of standardized values appears below.
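As a rough illustration (the mapping table and values are made up), standardizing values can be as simple as mapping the variants that different systems use onto one agreed code before the data lands in a warehouse:

COUNTRY_STANDARD = {
    "usa": "US", "united states": "US", "u.s.": "US",
    "uk": "GB", "united kingdom": "GB",
}

def standardize_country(value: str) -> str:
    # map known variants to the agreed code; otherwise pass the value
    # through upper-cased so the inconsistency stays visible
    key = value.strip().lower()
    return COUNTRY_STANDARD.get(key, value.strip().upper())

assert standardize_country(" United States ") == "US"
assert standardize_country("uk") == "GB"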
MDM comprises a set of processes and tools that define and manage an organization's master data. The quality of that data shapes decision making, and MDM helps an organization leverage trusted information to make better decisions, increase profitability and reduce risk.

Master data is reference data about people (customers, employees, suppliers), things (products, assets, ledgers) and places (countries, cities, locations). The applications and technologies used to create and maintain master data are part of an MDM system. Virtual master data management (virtual MDM) uses data virtualization and a persistent metadata server to implement a multi-level, automated MDM hierarchy.
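A rough sketch of those three domains as simple Python records (the entity names and fields below are illustrative only, not a prescribed MDM model):

from dataclasses import dataclass

@dataclass
class Party:                 # people: customers, employees, suppliers
    party_id: str
    name: str
    role: str                # e.g. "customer" or "supplier"

@dataclass
class Product:               # things: products, assets, ledgers
    product_id: str
    description: str

@dataclass
class Location:              # places: countries, cities, locations
    location_id: str
    country_code: str
    city: str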
Benefits include:
  • Improving business agility
  • Providing a single trusted view of people, processes and applications
  • Allowing strategic decision making
  • Enhancing customer relationships
  • Reducing operational costs
  • Increasing compliance with regulatory requirements


MDM helps organizations handle four key issues:
  • Data redundancy
  • Data inconsistency
  • Business inefficiency
  • Supporting business change


One of the main objectives of MDM is to publish an integrated, accurate and consistent set of master data for use by other applications and users. This integrated set of master data is called the master data system of record (SOR). The SOR is the gold copy of any given piece of master data and the single place in an organization where master data is guaranteed to be accurate and up to date.
Although the MDM system publishes the master data SOR for use by the rest of the IT environment, it is not necessarily the system where master data is created and maintained. The system responsible for maintaining any given piece of master data is called the system of entry (SOE). In most organizations today, master data is maintained by multiple SOEs.
Customer data is an example. A company may have customer master data that is maintained by multiple web storefronts, by the retail organization and by the shipping and billing systems. Creating a single SOR for customer data in such an environment is challenging, as the sketch below suggests.
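As a toy illustration of the problem (the matching rule, fields and records below are made up, and real MDM matching and survivorship rules are far richer), consider merging the same customer's records from two systems of entry into one golden record, letting the most recently updated non-empty value win:

def golden_record(records: list[dict]) -> dict:
    # merge candidate records for one customer, preferring the value
    # from the most recently updated source system
    merged: dict = {}
    for rec in sorted(records, key=lambda r: r["updated_at"]):
        for field, value in rec.items():
            if value not in (None, ""):
                merged[field] = value      # later (newer) records win
    return merged

web_store = {"email": "a.lee@example.com", "name": "A. Lee", "phone": "", "updated_at": "2018-01-05"}
billing   = {"email": "a.lee@example.com", "name": "Alice Lee", "phone": "555-0100", "updated_at": "2019-03-12"}
print(golden_record([web_store, billing]))
# {'email': 'a.lee@example.com', 'name': 'Alice Lee', 'updated_at': '2019-03-12', 'phone': '555-0100'}

Even this toy version exposes the hard questions an MDM program has to answer: which records refer to the same customer, and which source wins when they disagree.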
The long-term goal of MDM is to solve this problem by creating an MDM system that is not only the SOR for any given type of master data but also the SOE. In other words, standardize data at the source.
MDM then can be defined as a set of policies, procedures, applications and technologies for harmonizing and managing the system of record and systems of entry for the data and metadata associated with the key business entities of an organization. 
