In doing the research for our six-monthly update of the MDM market I came across something that surprised me. Master data can be complex (bill of materials structures, for example) but is not usually very large in volume, at least compared to transaction volumes. The exception is the “customer” dimension in a B2C company, where it is easy to see that 50 million or so records may be needed. Luckily, “customer” data is usually quite simple compared to, say, product data, which may have hundreds of attributes.
However, I have now come across three cases where the volume of master data records being managed is claimed to be around 500 million. One vendor I spoke to said they had a customer planning a billion-record MDM system. Dealing with hundreds of millions of records rather than tens of millions is a lot more challenging, especially where the data need to be dealt with in real time. For example, if you are adding a new customer account then you need to check whether that apparently new account is really a duplicate of an existing one, and this should ideally be done straight away.
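To make the real-time duplicate check concrete, here is a minimal sketch of one common approach, usually called blocking: rather than comparing a new record against every existing one, you compare it only against records that share a cheap key. This is an illustration under stated assumptions, not how any particular MDM product works. The names `blocking_key` and `CustomerIndex`, the choice of surname prefix plus postcode as the key, and the 0.85 similarity threshold are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def blocking_key(name: str, postcode: str) -> str:
    """Hypothetical blocking key: first 3 letters of the last name token plus postcode."""
    parts = name.split()
    surname = parts[-1].lower() if parts else ""
    return surname[:3] + "|" + postcode.replace(" ", "").lower()


class CustomerIndex:
    """In-memory stand-in for a master data store, grouped by blocking key."""

    def __init__(self, threshold: float = 0.85):
        self.threshold = threshold  # assumed cut-off for "probably the same person"
        self.blocks = defaultdict(list)  # key -> list of (customer_id, name)

    def find_duplicate(self, name: str, postcode: str):
        """Return the id of a likely existing match, or None."""
        key = blocking_key(name, postcode)
        # Only records in the same block are compared, so the cost depends on
        # the block size rather than on the total number of records.
        for existing_id, existing_name in self.blocks.get(key, []):
            score = SequenceMatcher(None, name.lower(), existing_name.lower()).ratio()
            if score >= self.threshold:
                return existing_id
        return None

    def add(self, customer_id: str, name: str, postcode: str):
        """Insert the record unless it looks like a duplicate; return the match id if any."""
        duplicate_of = self.find_duplicate(name, postcode)
        if duplicate_of is None:
            self.blocks[blocking_key(name, postcode)].append((customer_id, name))
        return duplicate_of


index = CustomerIndex()
first = index.add("c1", "Jon Smith", "SW1A 1AA")
second = index.add("c2", "John Smith", "sw1a1aa")
third = index.add("c3", "Jane Doe", "SW1A 1AA")
```

At hundreds of millions of records the index would live in a database or search engine rather than a Python dictionary, and the key would need tuning so that genuine duplicates with a misspelt surname or a changed address are not missed, but the principle of narrowing the candidate set before doing any fuzzy comparison is the same.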
If anyone reading this has come across one of these really large MDM implementations then I’d be interested to hear your experiences.