Some rare common sense

Ad Stam and Mark Allin have written an excellent piece in DM Review this month covering data stewardship and master data management. They correctly point out that, with regard to business intelligence systems, “change will occur, and time to react will decrease”, and they lay out a sensible architecture for dealing with this issue. I very much like the way they emphasise the need for a business unit to deal with data governance as a key building block. In the article they explain the key requirements of such a group and draw an interesting analogy with logistics, which these days is usually handed off to a separate unit or even a separate company. Similarly, they believe that master data should be managed by a non-IT business unit.

The article also correctly distinguishes between the “golden copy” data held in the data warehouse and a master data management repository, which in addition holds master data in all its stages. The master data repository should be linked to the data warehouse, but the two are not the same physical entity, since the master data repository has to handle “unclean” data whereas the data warehouse should store only fully validated data.
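To make that distinction concrete, here is a minimal sketch (in Python, purely for illustration) of the separation the article describes: a repository that keeps master data in every stage of its lifecycle, and a warehouse that will only accept records promoted to golden-copy status. The class names, fields and the single validation rule are my own assumptions, not how Kalido or any other product actually implements this.

```python
# Minimal sketch of the separation described above: a master data repository
# that holds records in every lifecycle stage, and a warehouse that only ever
# receives records promoted to "golden copy". The field names and the single
# validation rule are illustrative assumptions, not any product's actual model.
from dataclasses import dataclass

@dataclass
class MasterRecord:
    key: str
    name: str
    status: str = "unvalidated"   # unvalidated -> golden copy

class MasterDataRepository:
    """Holds master data in all its stages, clean or not."""
    def __init__(self):
        self.records: dict[str, MasterRecord] = {}

    def add(self, record: MasterRecord):
        self.records[record.key] = record

    def validate(self, key: str) -> bool:
        # Stand-in business rule: a record must at least have a non-blank name.
        record = self.records[key]
        if record.name.strip():
            record.status = "golden copy"
            return True
        return False

class DataWarehouse:
    """Accepts only validated, golden-copy master data."""
    def __init__(self):
        self.dimension: dict[str, MasterRecord] = {}

    def load(self, record: MasterRecord):
        if record.status != "golden copy":
            raise ValueError(f"refusing unvalidated record {record.key}")
        self.dimension[record.key] = record

repo = MasterDataRepository()
repo.add(MasterRecord("C001", "Philips"))
repo.add(MasterRecord("C002", ""))           # incomplete; stays in the repository

warehouse = DataWarehouse()
for key in list(repo.records):
    if repo.validate(key):
        warehouse.load(repo.records[key])    # only C001 reaches the warehouse
```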

It is a pleasant change to read such a sensible article on best practice in data management, but that is because Ad and Mark are real practitioners on serious enterprise-wide projects through their work at Atos Origin, for example at customers such as Philips. They are not people who spend their lives giving slick PowerPoint presentations at conferences; they are close to the action in real-world implementations. I worry that there are too many people on the conference circuit who are eloquent speakers but have not actually seen a live project for a long time. I have known Ad Stam for many years and can testify that his team at Atos are an extremely competent and experienced set of practitioners who focus on customer delivery rather than self-publicity. If you have a data warehouse or MDM project then you could do a lot worse than use Ad’s team.

2 thoughts on “Some rare common sense”

  1. You raise an important distinction. You should NEVER lose track of valid data that has simply become out of date: when a reorganisation happens, for example, you want to expire the old relationship and add a new one. Ideally all relationships should be time-stamped in this way, allowing you to recreate past history (this is the way that Kalido does it, for example; there is a sketch of the idea after these comments). There is no reason why this cannot be achieved in custom warehouses also.

    What I meant was data that has not yet been validated as correct, i.e. data that has yet to be loaded into the warehouse. In the Kalido philosophy at least, data should be put into a staging area prior to loading and then validated against the business rules of the warehouse. Genuinely bad data should never be entered into the warehouse, or it will destroy the warehouse’s integrity and its credibility with the business users. If they see a report where the numbers add up wrongly then they will be reluctant to trust the warehouse again. This “incomplete” or “unvalidated” data does have a place in a master data repository though, as improving it to “golden copy” quality is part of the master data management process. Only when master data is validated should it be fed into the warehouse.

  2. Andy, when you talk about the DW only having validated data, how do you get a complete view of the data in the DW if you drop off unclean data? For example, what happens when a key item such as a customer is unclean and keeping it out of the DW loses all of the history, transactions and revenue from that customer?

    I have been on some DW projects where the data had to go through regardless. Incomplete records had default or dummy values set, and missing referential integrity meant augmenting key tables with dummy records (an approach sketched below). The users wanted all the measures to be present in the reports, and that meant loading unclean data to support those measures.
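On the point in the first comment about time-stamping relationships so that past history can be recreated: the sketch below shows one common way of doing this, effective-dating each relationship rather than overwriting it. It is a generic illustration in Python with made-up names; it is not how Kalido itself stores this internally.

```python
# Illustrative sketch of time-stamped ("effective-dated") relationships: instead
# of overwriting a relationship when a reorganisation happens, the old row is
# expired and a new row is added, so history can be recreated for any date.
# The department and business unit names are made up for the example.
from datetime import date

relationships = []  # each entry: which unit a department reports to, and when

def assign(department, business_unit, valid_from):
    """Expire any current relationship for the department and add the new one."""
    for rel in relationships:
        if rel["department"] == department and rel["valid_to"] is None:
            rel["valid_to"] = valid_from              # expire, never delete
    relationships.append({"department": department,
                          "business_unit": business_unit,
                          "valid_from": valid_from,
                          "valid_to": None})          # None = still current

def reports_to(department, as_of):
    """Recreate history: which unit did the department belong to on a given date?"""
    for rel in relationships:
        if (rel["department"] == department
                and rel["valid_from"] <= as_of
                and (rel["valid_to"] is None or as_of < rel["valid_to"])):
            return rel["business_unit"]
    return None

assign("Lighting", "Consumer Products", date(2004, 1, 1))
assign("Lighting", "Healthcare", date(2006, 7, 1))    # reorganisation

print(reports_to("Lighting", date(2005, 6, 30)))      # Consumer Products
print(reports_to("Lighting", date(2006, 7, 1)))       # Healthcare
```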
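And on the second comment’s approach of letting the data through with defaults: a common version of that compromise is to map records with missing or unmatchable keys to a dummy “unknown” member so the measures still add up, while flagging them for later correction in the master data repository. The sketch below is a hypothetical illustration of that idea, not anyone’s specific implementation.

```python
# Sketch of the "load it anyway" compromise: transactions whose customer key
# cannot be matched are attached to a dummy "Unknown customer" record so that
# totals in reports remain complete, and they are flagged so the offending
# master data can be fixed later. All names and values here are illustrative.
UNKNOWN_KEY = "CUST-UNKNOWN"

customers = {"C001": "Philips", "C002": "Acme"}          # clean customer master
customers[UNKNOWN_KEY] = "Unknown customer"              # dummy member added up front

transactions = [
    {"customer": "C001", "revenue": 1000.0},
    {"customer": "C999", "revenue": 250.0},              # no such customer yet
    {"customer": None,   "revenue": 75.0},               # key missing entirely
]

needs_remediation = []
for txn in transactions:
    if txn["customer"] not in customers:
        needs_remediation.append(dict(txn))              # remember the original
        txn["customer"] = UNKNOWN_KEY                    # default to the dummy member

total = sum(t["revenue"] for t in transactions)
unknown_total = sum(t["revenue"] for t in transactions if t["customer"] == UNKNOWN_KEY)

print(f"Total revenue in reports: {total}")              # 1325.0 - nothing is lost
print(f"Revenue awaiting clean master data: {unknown_total}")   # 325.0
print(f"Records to remediate: {len(needs_remediation)}")        # 2
```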
