Global Technology Strategist of Unilever. Unlocking the Power of Data and Analytics via Cloud enabled Smart Digitization of business process
Many have emphasized how much artificial intelligence (AI) and machine learning (ML) algorithms depend on data, and metaphors from “data is the new oil” to “data is the new sun” further underscore the need for better data. However, one aspect that is rarely mentioned explicitly in these discussions is the role of master data and how fundamentally it affects the quality of the data driving ML algorithms.
In the spirit of paying tribute to management guru Peter Drucker, who’s credited with the saying, “culture eats strategy for breakfast,” this article explores:
• Why there is a lack of awareness in enterprises about master data.
• How trusted and clean master data is fundamental for AI/ML algorithms.
• What enterprises should be doing about it to get the most out of their AI/ML investments.
What is master data?
According to The DAMA Guide to the Data Management Body of Knowledge, master data represents “data about the business entities that provide context for business transactions.”
Simply put, for any enterprise, it is the customers whom they sell to, the brands they market, the products they sell, the consumers who use their products, the materials used to make the products, the plants that manufacture their products, the suppliers that supply the materials, the employees who build the products directly or indirectly, and the list goes on.
Why is there a lack of awareness in enterprises about master data?
Traditionally, in most large organizations, master data is embedded within the core systems that run the company, such as enterprise resource planning (ERP) systems. These systems dictate which data is needed to run the business. Management and maintenance of master data are often delegated to operations teams with complex processes that never keep pace with changing business models. Most business decision-makers don’t even know that “master data” exists in their organization until a major business risk materializes or customer complaints rise, and the company takes on significant losses or incurs considerable costs to recover.
Here are a few examples:
• A large multinational customer going out of business, and the enterprise not even knowing the list of legal entity names of the customer around the world to recover any outstanding dues.
• A product incident arising from a claim on the package, and the enterprise being unable to trace the claim back to its source to provide transparency.
• A payment going to the wrong bank account for a supplier or, worse, to a supplier in an embargoed country.
• A subscription service having expired credit card details and being unable to renew subscriptions.
• An airline failing to auto-populate customers’ miles accounts, so that customers miss out on their miles credit.
• In the event of a data breach, a company being unable to notify customers of the extent of the breach in a timely manner.
• Most commonly, invoices sent to incorrect email addresses (or, in the past, printed invoices sent to the wrong postal addresses) and items shipped to the wrong locations.
These examples lead to significant revenue loss, an increase in operational expenses, or both. When any of them happens, it becomes an inflection point: awareness of the importance of master data rises, and CFOs are pushed to make the right investments so that the proper processes, operating models, operational organization and technology are in place.
Recognizing that trusted and clean master data is fundamental for ML algorithms.
Master data plays a significant role in linking various operational data in any enterprise. The narrative above in the definition section highlights how each of the master data entities is interlinked. If those connections and interlinks break due to missing information or, even worse, wrong information, catastrophes happen. Throughout the past 30 years of enterprise automation, humans have still played a significant role in managing and maintaining master data. They can limit the damage caused by bad master data with the right intervention at the right time.
However, ML requires precision in data, and its training sets need to have the right connections and links to ensure higher-precision prediction. Missing connections and wrong links will lower the accuracy of the predictions. You can imagine the damage that fully automated, ML-driven operations can cause when they act on wrong connections and missing links, with no human in the loop to intervene.
Not addressing master data issues ahead of any heavy investment in AI/ML can force significant rewiring of the data in the ML training sets later and render any previous insights useless.
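To make the idea of missing connections and wrong links concrete, here is a minimal sketch in Python of a check that could run before transactional data is used as an ML training set. The customer and order records, the field names and the split_by_link_quality helper are all hypothetical and exist only for illustration.

```python
# Hypothetical master data: customer IDs mapped to customer records.
customers = {
    "C001": {"name": "Acme Retail", "country": "GB"},
    "C002": {"name": "Globex Stores", "country": "NL"},
}

# Hypothetical transactions that would feed an ML training set.
orders = [
    {"order_id": "O1", "customer_id": "C001", "amount": 120.0},
    {"order_id": "O2", "customer_id": "C003", "amount": 75.5},  # wrong link: unknown customer
    {"order_id": "O3", "customer_id": None, "amount": 42.0},    # missing link
    {"order_id": "O4", "customer_id": "C002", "amount": 310.0},
]


def split_by_link_quality(orders, customers):
    """Separate orders whose customer link resolves from those where it does not."""
    linked, broken = [], []
    for order in orders:
        if order.get("customer_id") in customers:
            linked.append(order)
        else:
            broken.append(order)
    return linked, broken


linked, broken = split_by_link_quality(orders, customers)
broken_ids = [o["order_id"] for o in broken]
link_rate = len(linked) / len(orders)
print(f"usable rows: {len(linked)}, broken links: {broken_ids}, link rate: {link_rate:.0%}")
```

In this toy data, half of the rows cannot be tied back to a known customer. A model trained on them would either lose those rows or learn from the wrong context.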
What should enterprises be doing to get the most out of their ML investments?
In parallel with any major investment in the data and analytics space, enterprises must audit how clean their master data is, focusing on the most critical entities such as customers, products, materials, suppliers and employees.
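As one possible starting point for such an audit, the sketch below measures how complete a few critical fields are and flags likely duplicate records. The supplier records, the choice of required fields and the audit helper are assumptions made for illustration. A real audit would use the enterprise's own definitions of critical fields and far more robust matching.

```python
from collections import Counter

# Hypothetical supplier master records with placeholder values.
suppliers = [
    {"id": "S1", "name": "Nordic Pulp AB", "country": "SE", "bank_account": "ACCT-001"},
    {"id": "S2", "name": "nordic pulp ab ", "country": "SE", "bank_account": ""},
    {"id": "S3", "name": "Delta Packaging", "country": None, "bank_account": "ACCT-003"},
]

# Assumed set of critical fields; each enterprise would define its own.
REQUIRED_FIELDS = ("name", "country", "bank_account")


def audit(records, required_fields):
    """Return per-field completeness and names that appear more than once.

    Assumes a non-empty list of records.
    """
    total = len(records)
    completeness = {}
    for field in required_fields:
        filled = sum(1 for r in records if (r.get(field) or "").strip())
        completeness[field] = filled / total
    # Naive duplicate check: identical names after trimming and lowercasing.
    name_counts = Counter((r.get("name") or "").strip().lower() for r in records)
    duplicates = sorted(name for name, n in name_counts.items() if n > 1 and name)
    return completeness, duplicates


completeness, duplicates = audit(suppliers, REQUIRED_FIELDS)
for field, share in completeness.items():
    print(f"{field}: {share:.0%} complete")
print("possible duplicates:", duplicates)
```

Even a simple report like this gives decision-makers a number to track per entity, rather than a general sense that the data "has issues."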
Most often, enterprises treat this as an IT issue and bring in new master data management technologies. However, new technology alone is not going to solve the problem. Enterprises should revamp their processes and controls, for example around onboarding new customers and suppliers. They should ensure that operations and operating models are properly integrated, and that the connections and links between entities like customers, suppliers and products are maintained by automation and technology with little human intervention. Wherever possible, bring in authoritative external sources to validate the data entering the ecosystem. Move toward a self-service model with the right controls in place so that customers and suppliers can manage their own data, such as their contact details and bank details.
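A control of this kind at the point of onboarding could look roughly like the sketch below. The company registry and restricted-country list are in-memory placeholders standing in for an authoritative external source, and the validate_new_supplier helper and all field names are hypothetical.

```python
# Placeholder reference data standing in for an external company registry and
# a restricted-country list. Neither reflects any real source.
COMPANY_REGISTRY = {"REG-1001": "Delta Packaging", "REG-1002": "Nordic Pulp AB"}
RESTRICTED_COUNTRIES = {"XX"}


def validate_new_supplier(record):
    """Return a list of issues; an empty list means the record passed the checks."""
    issues = []
    registered_name = COMPANY_REGISTRY.get(record.get("registration_id"))
    if registered_name is None:
        issues.append("registration_id not found in registry")
    elif registered_name.lower() != (record.get("name") or "").strip().lower():
        issues.append("name does not match registry")
    if record.get("country") in RESTRICTED_COUNTRIES:
        issues.append("country is restricted")
    return issues


ok = validate_new_supplier(
    {"name": "Delta Packaging", "registration_id": "REG-1001", "country": "NL"}
)
bad = validate_new_supplier(
    {"name": "Delta Pkg", "registration_id": "REG-1001", "country": "XX"}
)
print("ok:", ok)
print("bad:", bad)
```

The point of the design is that a record is checked before it enters the ecosystem, so problems are caught once at the source instead of being repaired in every downstream system.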
As you invest in cloud-scale data lakes and begin leveraging them to implement predictive algorithms that drive business operations and optimizations, think about the master data that gives that data its business context. Think about how clean it is and how efficiently it is sourced, managed and maintained. As you discover how poor the master data is, whether because of its processes, legacy technology or inefficient operations, ensure you invest equally in bringing master data quality and processes to the same state as (if not better than) your operational data. Otherwise, your ML predictions are only as good as their weakest link: your master data.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.