

In recent years I’ve added a new item (most certainly not unique to me) to that humorous old list that begins “You know you’re getting old when…” It’s when you see a concept or discipline you’ve been working with for decades re-named and repackaged as something new for the nth time (where n must be greater than 2). So it is with Master Data Management.
For those of us who have spent 20 years or more worrying about enterprise data models, single sources of truth, the propagation of data from authoritative sources to the platforms that need it, and the like, the core principles of MDM are nothing new. What has changed in the intervening years, of course, is that fast network connections employing standard protocols (the Internet and its intranet cousins) have become pervasive, and non-proprietary, standard methods of packaging and moving data between systems (XML and HTTP) have emerged. In my mind, latter-day MDM and SOA are intricately linked. It is exciting for those of us who started out in the asynchronous and proprietary world to see MDM take off by leveraging these newer capabilities.
That said, MDM seems daunting to many organizations. Complicating the perception, vendors are busily hawking their competing technologies, often data management tools re-branded as MDM solutions, which promise a panacea and carry a stiff price tag for both licensing and implementation. Some data management practitioners feel compelled to develop an overarching MDM strategy and embark on a formal MDM “Program.” The problem is that MDM quickly takes on the feel of a Really Big Project which, like all RBPs, becomes immediately encircled by a penumbra of imminent failure.
I would advocate an incremental approach using tools and disciplines already available to us. After all, the core principles, as I’ve already said, are not new. In fact, I would even advocate starting to inculcate MDM methods without calling them that, lest IT leadership get a whiff of that “RBP” coming on. First demonstrate the power of the concept in a concrete application with business value; then you will have a foundation on which to pull more data management initiatives under the MDM umbrella. Call it the stealth implementation of MDM. A couple of real-life examples will illustrate the premise.
At the analytics-intensive company where I work we had lots of data analysts in the business who kept reference code sets of various sorts to support their reports and analyses – for example, to aggregate granular source-system codes into more useful hierarchical groupings. The tables were maintained in Excel spreadsheets, SAS files, desktop Access databases, and all the other usual suspects. As we were developing an Enterprise Data Warehouse we were also beginning to implement a Master Data strategy (but not calling it that!), and needed a central, controlled source for all of these end user-maintained reference sets. So we established the Corporate Reference Center schema on our operational database platform, and built a neat little utility program through which end users who are the appointed data stewards can maintain their reference data. The app also validates the data against dependent tables and business rules (cf. Biderman and McLean, “A Semantic Driven Application for Master Data Management” DAMA International + Wilshire Meta Data Conference, San Diego, March 2008). This established the place and mechanism for enterprise reference data to be housed.
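To make the validation step concrete, here is a minimal sketch of the kind of check such a maintenance utility might run before saving a steward's changes. It is not the actual application: the names `CodeMapping` and `validate_mappings`, and the two rules shown (every grouping must exist in a dependent table, and each source code may be mapped only once), are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class CodeMapping:
    """One row of a hypothetical reference set."""
    source_code: str  # granular source-system code
    group_code: str   # hierarchical grouping it rolls up to


def validate_mappings(
    mappings: Iterable[CodeMapping],
    valid_groups: Set[str],
    steward: str,
    appointed_stewards: Set[str],
) -> List[str]:
    """Return a list of problems; an empty list means the update may be saved."""
    problems: List[str] = []

    # Only the appointed data stewards may maintain the reference set.
    if steward not in appointed_stewards:
        problems.append(f"{steward} is not an appointed data steward")

    seen: Set[str] = set()
    for m in mappings:
        # Dependent-table check: the grouping must already exist.
        if m.group_code not in valid_groups:
            problems.append(f"{m.source_code}: unknown group {m.group_code}")
        # Assumed business rule: a source code rolls up to exactly one group.
        if m.source_code in seen:
            problems.append(f"{m.source_code}: mapped more than once")
        seen.add(m.source_code)

    return problems
```

Returning the full list of problems, rather than stopping at the first one, lets a steward correct a whole batch in one pass.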
The initial instruction to the IT development community was that all applications that needed data from these tables should query them in situ or move them by an ETL process as required to where they needed to be. In the next phase we started to build services around some of these data stores to create a platform-neutral abstraction layer through which they could be accessed. Here is a simple but compelling example:
The Massachusetts Health Care Reform Act of 2006 required health insurance carriers in the state to issue a document to their subscribers informing them whether their policy meets the Commonwealth’s standards for “minimum creditable coverage” (in industry speak). So we crafted a reference table in which the codes that represent health plan components get tagged with values indicating compliance and, for those that aren’t compliant, link to explanations of why that is so. It turned out the data was needed in three different places: on a mainframe from which the annual documents would be generated; on a SQL Server-based system used to mail fulfillment materials to subscribers; and in our Oracle-based CRM for generating correct correspondence to employers. Solution: The data is maintained in the Corporate Reference Center, and the platforms requiring the data acquire it via our service broker. As it happens, the fulfillment system preferred to have a replica of the data, so a Publish-Subscribe service worked best there, while the CRM’s use was on a one-off basis so it was happy to interrogate a Request-Reply service.
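The difference between the two access styles can be sketched in a few lines. This is an in-memory illustration only, not our service broker: the class `ReferenceService`, its method names, the plan codes, and the explanation text are all hypothetical. What it shows is one authoritative store serving both a subscriber that keeps a replica (as the fulfillment system did) and a caller that asks for a single row on demand (as the CRM did).

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


class ReferenceService:
    """Hypothetical wrapper around one authoritative reference table."""

    def __init__(self) -> None:
        self._rows: Dict[str, Row] = {}
        self._subscribers: List[Callable[[Row], None]] = []

    def lookup(self, plan_code: str) -> Optional[Row]:
        """Request-Reply: return a copy of the current row, or None."""
        row = self._rows.get(plan_code)
        return dict(row) if row is not None else None

    def subscribe(self, callback: Callable[[Row], None]) -> None:
        """Publish-Subscribe: send existing rows now, then every later change."""
        for row in self._rows.values():
            callback(dict(row))
        self._subscribers.append(callback)

    def upsert(self, plan_code: str, compliant: bool,
               explanation: Optional[str] = None) -> None:
        """Called on behalf of the data steward; publishes each change."""
        row: Row = {"plan_code": plan_code, "compliant": compliant,
                    "explanation": explanation}
        self._rows[plan_code] = row
        for notify in self._subscribers:
            notify(dict(row))


service = ReferenceService()

# Replica-style consumer: keeps its own copy in step with the source.
replica: Dict[str, Row] = {}
service.subscribe(lambda row: replica.__setitem__(str(row["plan_code"]), row))

service.upsert("PLAN-A", True)
service.upsert("PLAN-B", False, "illustrative explanation text")

# One-off consumer: asks only when it needs an answer.
answer = service.lookup("PLAN-B")
```

Either way the data is maintained in exactly one place; the consuming platform chooses only how it receives it.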
One further, important point: these tables, like all tables in the Corporate Reference Center, have a designated Data Steward, who is responsible for ensuring all product codes are accurately attributed. Thus our Data Stewardship strategy is being implemented incrementally along with our MDM strategy.
In sum, by corralling shareable data sets into a single location, then building service wrappers around them to standardize the interface with consuming applications, we have begun implementing Master Data Management rather than just fretting about the sheer size of the challenge. And we are just about at the point where we can pull the shroud off our stealth strategy and reveal it to be MDM without any IT executive passing out… because we’re already doing it.