A few days ago I was reading the proposed Florida Data Privacy law (since greatly amended) and was somewhat surprised to see it had two separate definitions of “Personal Information”. It would be expected that different data privacy laws might well have different definitions of “Personal Information”, but for a single data privacy law to contain two distinct definitions seems odd.
In the case of the Florida law, one definition of “Personal Information” applied to data breaches, and the second definition applied to data privacy in general.
This is just one example of one term having many different definitions. Yet it matters because the general expectation among data professionals seems to be that one term should always have one and only one definition. Failure to understand why this issue arises, and to deal with it, negatively impacts how Business Glossaries are implemented, reducing their usefulness to the business.
It’s Just Academic and Not Important
An immediate response to this issue is that it is not important. This may in part be due to a view that there is nothing to automate here, and it is “just a people thing”. It is also difficult to shake off the prejudice of “one term, one definition”. However, reality is otherwise.
Metrics are a very important example where a Business Term provides a label for a piece of information that is used in key business decisions. Yet the many stakeholders involved with the metric, both those who gather the information and those who use it, often have divergent views on what it means. That is, there is one Business Term, but many definitions in practice. For example, “Cost of Goods Sold” can be calculated in different ways in different parts of the enterprise, all of which get aggregated into a meaningless figure. With metrics a lot of things matter: the meaning; the methodology by which data is collected; timing concerns; estimated values; and calculations. All these should be documented clearly in a Business Glossary entry for the metric. But very often stakeholders argue at length about metric definitions. This is especially true for metrics tied to performance-based compensation, such as those used to calculate commissions. Metrics are among the most important Business Terms to get right in a Business Glossary, and they are particularly prone to the problem of many definitions, which has a direct impact on the business.
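To make the list of things that matter for a metric concrete, here is a minimal sketch of what a glossary entry for a metric might record. The class name `MetricEntry`, its field names, and the `undocumented_facets` helper are all assumptions made for illustration, not features of any particular glossary tool, and the example text is placeholder content.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class MetricEntry:
    """Hypothetical glossary entry for a metric; one field per facet named above."""
    term: str
    meaning: str = ""
    collection_methodology: str = ""
    timing: str = ""
    estimated_values: str = ""
    calculation: str = ""

    def undocumented_facets(self) -> list[str]:
        # Report every facet that has been left blank for this entry.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Placeholder content only; a real entry would carry the business's own wording.
cogs = MetricEntry(
    term="Cost of Goods Sold",
    meaning="Placeholder: what the figure is meant to represent",
    calculation="Placeholder: how the figure is computed",
)
missing = cogs.undocumented_facets()
```

A check like this does not settle arguments between stakeholders, but it does show which parts of a metric definition have not yet been written down.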
The Issue of Context
It is inevitable that we are going to have important Business Terms that have more than one definition. If we conceive of a Business Glossary as a great big list of Business Terms in alphabetical order we are therefore going to get duplicates, or will have to find a way to include several definitions within a single entry for a Business Term.
This great big alphabetical list is probably the most horrible way to implement a Business Glossary that could be thought of. It emulates printed dictionaries that serve the purpose of helping someone find what a word actually means, constrained by the need to reduce printing costs. Printed dictionaries have short definitions that give just enough information to understand a word in the shortest possible space. A true Business Glossary should have all facts of business significance in an entry for a Business Term, and does not need to worry about printing costs.
Most importantly, a Business Glossary should deal with contexts. “Context” is a poor word for what is needed here, as its original definition involves understanding how a term is used in a text by looking at the text that surrounds the term. We are not trying to interpret terms in individual documents, but we do want to understand the concepts that lie behind them in each of the cultures and subcultures of the business. Each of these cultures and subcultures has ontological models, meaning sets of business views of pure information (not data), that the business terms used in the culture or subculture have to fit into. Our Business Glossaries should likewise consist of sets of these ontological models. They should not be flat lists of Business Terms arranged in alphabetical order.
Will It Ever Happen?
At the moment, software engineers producing data catalogs seem to have very little interest in Business Glossaries. Perhaps this is because they think there is very little to be automated, unlike, say, automatically harvesting data lineage. Or perhaps it is because the needs of the Business Glossary seem to be more connected to people and methodologies than to anything that needs complex software functionality. With Data Catalogs very much driving the data governance profession at the moment, this is unfortunate.
What we should have is a collection of ontological models within each of which Business Terms are shown in their relationships. Each model would be specific enough that every Business Term in it would have only one definition, and each model would be tied to the part of the business value chain that it supports.
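As a rough illustration of that structure, the sketch below models a glossary as a collection of context-specific models, each of which allows a term only one definition. The names `ContextModel`, `BusinessGlossary`, and `value_chain_stage` are assumptions made for this example, and the definition text is placeholder content rather than wording from any law.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ContextModel:
    """Hypothetical stand-in for one ontological model of the business."""
    name: str
    value_chain_stage: str  # assumed label for the part of the value chain supported
    definitions: dict[str, str] = field(default_factory=dict)

    def define(self, term: str, definition: str) -> None:
        # Within a single model, a term may have only one definition.
        if term in self.definitions:
            raise ValueError(f"{term!r} is already defined in {self.name!r}")
        self.definitions[term] = definition


@dataclass
class BusinessGlossary:
    models: dict[str, ContextModel] = field(default_factory=dict)

    def add_model(self, model: ContextModel) -> None:
        self.models[model.name] = model

    def lookup(self, term: str) -> dict[str, str]:
        # Return every definition of the term, keyed by the model it belongs to.
        return {
            name: model.definitions[term]
            for name, model in self.models.items()
            if term in model.definitions
        }


breach = ContextModel("Data Breach", "Incident response")
breach.define("Personal Information", "Placeholder: definition used for breaches")

privacy = ContextModel("Data Privacy", "Customer data handling")
privacy.define("Personal Information", "Placeholder: definition used for privacy")

glossary = BusinessGlossary()
glossary.add_model(breach)
glossary.add_model(privacy)

found = glossary.lookup("Personal Information")
```

Here one term resolves to two definitions across the glossary, yet each model still holds exactly one, which is the property described above. Relationships between terms inside a model are left out of this sketch.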
Unfortunately, we seem to be a long way off from this.
Malcolm Chisholm
Data Management and Data Governance Consultant, Speaker, Advisor, Writer, Beekeeper and Coffee Farmer

Business Glossaries are increasingly becoming the Cinderella of metadata tools. In this article I deal with an often overlooked but vitally important part of the methodology needed for the successful operation of a Business Glossary: one term, many definitions. Do you think this is important, or is your organization a committed “one term, one definition” shop?