Information catalogs and business glossaries are popular solutions in the data management toolbox. What is the purpose of each and how do they work to effectively manage an organization’s data? Which one should you choose as part of your data management strategy?
What’s in a name?
In the world of data management, business glossaries and information catalogs are sometimes discussed as similar entities and considered interchangeable. However, they are distinct tools with unique purposes and relevant functionalities across the enterprise.
A business glossary is a cornerstone of a successful data governance program and the foundation for building trust in the data. The goal of a business glossary is to provide a single source of truth for data usage, with one agreed definition for each data element.
The glossary decouples the data from its source system or application so that attention stays on how the data is used. It identifies the acceptable data values, the party responsible for each data component, and the transformation and remediation procedures applied to the data. This ensures that the data community uses correct and reliable data for its analytical needs and draws on the data that best aligns with the business problem at hand.
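As a rough illustration, a glossary entry carrying the attributes described above could be modeled as a small record. This is a minimal sketch, not the schema of any particular glossary product; the class name, field names and the "Customer Status" entry are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """One business glossary entry, independent of any source system (hypothetical schema)."""
    name: str
    definition: str
    owner: str                     # party responsible for the data element
    acceptable_values: set[str]    # values the business considers valid
    transformations: list[str] = field(default_factory=list)  # documented procedures

    def is_acceptable(self, value: str) -> bool:
        """Check a raw value against the term's acceptable values."""
        return value in self.acceptable_values


# Hypothetical example entry, for illustration only.
customer_status = GlossaryTerm(
    name="Customer Status",
    definition="Current standing of a customer relationship.",
    owner="Customer Data Steward",
    acceptable_values={"ACTIVE", "INACTIVE", "PROSPECT"},
    transformations=["Trim whitespace", "Convert to upper case"],
)

print(customer_status.is_acceptable("ACTIVE"))   # True
print(customer_status.is_acceptable("unknown"))  # False
```

Because the entry names no source system, the same definition and value check can be applied to the data wherever it is stored.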
On the other hand, the information catalog manages information assets inside the organization, enabling quick and easy identification of the relevant information needed for a business process and maximizing the value of the organization’s decisioning assets. The goal of the information catalog is to align data with analytics and decisioning functions in a single place for understanding the decisioning lifecycle.
A key component within the information catalog is the data catalog, which curates lists of useful and significant data sources and provides the ability to automatically scan, profile and classify data. The catalog enables exploration of the data through search, discovery and browse mechanisms. It also supports collaboration, giving the data community the ability to rank, like, subscribe to and follow users and datasets.
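To make "profile and classify" concrete, the sketch below computes a few basic statistics for one column and applies a single pattern-based classification rule. It is a simplified assumption of what such a scan might do, not how any specific catalog works; the function names, the email pattern and the 0.8 threshold are all illustrative.

```python
import re

# Deliberately loose pattern, used only for illustration.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def non_empty(values):
    """Drop None and empty-string entries."""
    return [v for v in values if v is not None and v != ""]


def profile_column(values):
    """Return basic profile statistics for a list of column values."""
    present = non_empty(values)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "distinct_count": len(set(present)),
    }


def classify_column(values, threshold=0.8):
    """Label a column 'email' when most non-empty values look like emails."""
    present = non_empty(values)
    if not present:
        return "unknown"
    matches = sum(1 for v in present if EMAIL_PATTERN.match(str(v)))
    return "email" if matches / len(present) >= threshold else "unclassified"


sample = ["ann@example.com", "bob@example.com", None, "ann@example.com"]
print(profile_column(sample))   # {'row_count': 4, 'null_count': 1, 'distinct_count': 2}
print(classify_column(sample))  # email
```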
The information catalog also includes other critical information assets such as reports, visualizations, models, decisioning applications and data pipelines.
The power of choice
Which data discipline is more imperative for an organization to deploy as part of its data management strategy - the information catalog or the business glossary?
The answer is both the information catalog and the business glossary together. Simply put, the information catalog describes the decisioning process and the business glossary explains the data being used in that process. Combined, these two approaches provide immense value by offering insight into the data and how the organization uses it.
The problem is that most organizations don’t traditionally use the two disciplines together and often overlook the business glossary. For organizational success, conversations about how to use each solution to improve functionality need to happen at the same time and across the organization.
The marriage of the information catalog & business glossary
Once the strategic decision is made to incorporate both the information catalog and business glossary in your data management strategy, how do you use the two disciplines together to create a comprehensive view of the information assets available? Essentially, the convergence of the information catalog and business glossary is an integration issue.
Organizations have very diverse, disparate and complex data environments. Information resides on-premises, in the cloud or in a combination of both. This data is structured, unstructured and semi-structured. Each database or application owner, or IT, is responsible for their own data, and most owners overlook the alignment of data across systems, databases or applications.
Because each system has its own unique data elements, how do you get a digestible understanding of the data that spans the complex data fabric? And how can the organization effectively use, manage and align the various data components generated by the expansive data environment?
Enter the business glossary, which focuses on the definitions around the data so there is a common data language being used and shared across the organization. It provides a holistic lexicon around data elements regardless of where the data lives across the data landscape. The glossary propels the linguistic understanding of data and promotes data literacy inside of the organization, offering assurance the data community is using the right data for their analytical and decisioning processes.
By understanding the definitions around the data and aligning those definitions with the actual condition of the data, data consumers are better equipped to choose the right data for decisioning functions.
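One way to picture this common data language is a mapping from system-specific field names to shared glossary terms. The sketch below is illustrative only; the system names, field names and terms are hypothetical.

```python
# Hypothetical mapping from (system, field) pairs to shared glossary terms.
FIELD_TO_TERM = {
    ("crm", "cust_stat"): "Customer Status",
    ("billing", "acct_status_cd"): "Customer Status",
    ("crm", "cust_email"): "Customer Email",
}


def term_for(system, field_name):
    """Look up the glossary term for a system-specific field, if one is mapped."""
    return FIELD_TO_TERM.get((system, field_name))


def fields_for(term):
    """List every (system, field) pair that carries the given glossary term."""
    return sorted(key for key, value in FIELD_TO_TERM.items() if value == term)


print(term_for("billing", "acct_status_cd"))
# Customer Status
print(fields_for("Customer Status"))
# [('billing', 'acct_status_cd'), ('crm', 'cust_stat')]
```

Two differently named fields in two systems resolve to the same term, so a data consumer can search by business meaning without knowing where the data is stored.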
The tie that binds
To enable the convergence of an information catalog and business glossary in your organization, the tie that binds is lineage. Lineage weaves together a comprehensive understanding of where data lives across the organization’s data fabric and how it is moved, transformed and used, and it ensures the data adheres to governance policies, guidelines and procedures.
Lineage illustrates the relationships between objects, from business terms in the business glossary to data elements to information and decisioning assets, and on down to the people and business processes that leverage these data and information assets, all in a single location.
It also supports data governance programs with transparency into the condition, reconditioning and usage of data across the enterprise. Lineage allows data users to understand the decisioning lifecycle, the data condition/collections they want to use for analytics and how these insights are being used organization-wide.
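The relationships lineage traces, from a business term down to the processes that rely on it, can be sketched as a directed graph. The example below is a minimal illustration under assumed data: the node names and edges are hypothetical, and a breadth-first walk stands in for whatever traversal a real lineage tool performs.

```python
from collections import deque

# Hypothetical lineage edges: each key points to the objects that depend on it.
LINEAGE = {
    "term:Customer Status": ["column:crm.customers.status"],
    "column:crm.customers.status": ["pipeline:load_customers"],
    "pipeline:load_customers": ["report:Churn Dashboard", "model:churn_score"],
    "model:churn_score": ["process:Retention Campaign"],
}


def downstream(start, edges):
    """Return every object reachable from start, in breadth-first order."""
    found = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


for item in downstream("term:Customer Status", LINEAGE):
    print(item)
```

Starting from the glossary term, the walk lists the column, pipeline, report, model and business process that depend on it, which is the kind of impact view lineage is meant to provide.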
In summary, the common language from business glossaries, spurred by a sustainable data governance program and supported by information catalogs and lineage, delivers transparency for trusted and reliable data for analytics, which in turn drives solid and informed decisions for your organizational success.