While researching our six-monthly update of the MDM market I came across something that surprised me. Master data can be complex (bill of materials structures, for example) but is not usually very large in volume, at least compared to transaction data. The exception is the “customer” dimension in a B2C company, where it is easy to see that 50 million or so records may be needed. Luckily “customer” data is usually quite simple compared to, say, product data, which may have hundreds of attributes.
However, I have now come across three cases where the volume of master data being managed is claimed to be around 500 million records. One vendor I spoke to said they had a customer planning a billion-record MDM system. Dealing with hundreds of millions of records rather than tens of millions is a lot more challenging, especially where the data needs to be dealt with in real time. For example, if you are adding a new customer account then you need to check whether that apparently new account is really a duplicate of an existing one, and this should ideally be done straight away.
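To make the real-time duplicate check a little more concrete, here is a minimal Python sketch of one common approach, usually called blocking: rather than comparing a new customer against every existing record, you compare it only against records that share a cheap key. Everything specific here is an assumption for illustration and not taken from any particular MDM product: the field names (`forename`, `surname`, `postcode`), the choice of blocking key, and the 0.85 similarity threshold are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def blocking_key(record):
    """Hypothetical blocking key: normalised postcode plus surname initial."""
    postcode = record["postcode"].replace(" ", "").upper()
    return (postcode, record["surname"][:1].upper())


class CustomerIndex:
    """In-memory index that groups customer records by blocking key."""

    def __init__(self, threshold=0.85):
        # Assumed similarity cut-off; a real system would tune this.
        self.threshold = threshold
        self.blocks = defaultdict(list)

    def add(self, record):
        self.blocks[blocking_key(record)].append(record)

    def find_duplicates(self, record):
        """Return existing records in the same block with a similar full name."""
        name = f'{record["forename"]} {record["surname"]}'.lower()
        matches = []
        # Only the candidate's own block is scanned, not the whole data set.
        for candidate in self.blocks.get(blocking_key(record), []):
            other = f'{candidate["forename"]} {candidate["surname"]}'.lower()
            if SequenceMatcher(None, name, other).ratio() >= self.threshold:
                matches.append(candidate)
        return matches


index = CustomerIndex()
index.add({"forename": "Jon", "surname": "Smith", "postcode": "SW1A 1AA"})

new_account = {"forename": "John", "surname": "Smith", "postcode": "sw1a1aa"}
possible_duplicates = index.find_duplicates(new_account)
```

The point of the design is that the cost of each check depends on the size of one block rather than on the total number of records, which is what makes a check at account-creation time plausible at all. The trade-off is that a true duplicate with a mistyped postcode or surname initial lands in a different block and is missed, so real systems typically use several blocking keys.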
If anyone reading this has come across one of these really large MDM implementations then I’d be interested to hear your experiences.

1 comment so far
Hi Andy,
A lot could be said about your post, but I will try to be short and clear.
An MDM project with more than tens of millions of customers is, as you said, a lot more challenging than a normal MDM project. However, some technical features (caching, etc.) make it possible to manage very large data volumes in MDM without problems. Hardware is important too: plenty of RAM, fast processors and so on let the MDM system find data more quickly, and there are many new hardware solutions to increase performance. Clearly, performance testing will be necessary for activities like customer matching, merging, etc.
Hope this helps.
Regards,
Vito Palasciano