In doing the research for our six-monthly update of the MDM market I came across something which surprised me. Master data can be complex (bill-of-materials structures, for example) but is usually not very large in volume, at least compared to transaction volumes. The exception is the “customer” dimension in a B2C company, where it is easy to see that 50 million or so records may be needed. Luckily, customer data is usually quite simple compared to, say, product data, which may have hundreds of attributes.
However, I have now come across three cases where the volume of master data records being managed is claimed to be around 500 million. One vendor I spoke to said they had a customer planning a billion-record MDM system. Dealing with hundreds of millions of records rather than tens of millions is far more challenging, especially where the data needs to be handled in real time. For example, if you are adding a new customer account, you need to check whether that apparently new account is really a duplicate of an existing one, and ideally this check should happen straight away.
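To make the duplicate-check idea concrete, here is a minimal sketch of how such a real-time check might look. It is an illustration only, not how any particular MDM product works: it assumes hypothetical records with just `name` and `postcode` fields, uses the postcode as a simple blocking key to avoid comparing every pair, and relies on Python's standard-library fuzzy string matcher. A production system at the 500-million-record scale discussed above would need proper indexing, phonetic keys, and far more sophisticated matching rules.

```python
from difflib import SequenceMatcher

def normalise(record):
    """Lower-case and strip the fields we match on (name and postcode here)."""
    return (record["name"].lower().strip(),
            record["postcode"].replace(" ", "").lower())

def likely_duplicate(new_record, existing_records, threshold=0.85):
    """Return the first existing record whose normalised name is similar
    enough to the new record's name, or None if no plausible match exists.
    The postcode acts as a blocking key: only records sharing it are compared."""
    new_name, new_postcode = normalise(new_record)
    for rec in existing_records:
        name, postcode = normalise(rec)
        if postcode != new_postcode:
            continue  # different blocking key: skip the expensive comparison
        if SequenceMatcher(None, new_name, name).ratio() >= threshold:
            return rec
    return None

# Illustrative data only.
existing = [
    {"name": "Jonathan Smith", "postcode": "SW1A 1AA"},
    {"name": "Mary Jones", "postcode": "EC2V 7HN"},
]

# A misspelt re-entry of an existing customer is flagged as a likely duplicate.
dup = likely_duplicate({"name": "Jonathon Smith", "postcode": "SW1A 1AA"}, existing)

# A genuinely new customer at the same postcode is not.
new = likely_duplicate({"name": "Alice Brown", "postcode": "SW1A 1AA"}, existing)
```

Even this toy version shows why real-time matching at scale is hard: without blocking keys, every new account would have to be compared against hundreds of millions of existing ones.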
If anyone reading this has come across one of these really large MDM implementations then I’d be interested to hear your experiences.
According to the Celts, today is the day of the year when the boundaries between the living and the dead dissolve. While the ghosts of the departed such as Lehman Brothers stalk the earth, the living put on masks in order to mimic or placate the evil spirits. I’m a little unclear as to what would be the most suitable mask to don to mimic a deceased investment bank (do Armani make masks?), but software vendors across the globe will be nervously hoping that the spirit of Lehman has been thoroughly placated by the kindly sprites of Hank Paulson, Gordon Brown et al. Fear is stalking the enterprise software market on a scale not seen since the aftermath of the millennium party in 2001.
Ghouls and goblins in the form of financial controllers in large companies are, as we speak, preparing a witches’ brew of budget cuts sufficient to make the bravest software salesman quail. Companies look at each other nervously and sing around the campfires to keep their spirits up, saying that their particular type of software does such an important job that it won’t be affected, and indeed that in times of adversity, perhaps companies will actually spend more money on critical IT projects? After all, data quality/MDM/(insert sector here) is more important than ever now, right? Right?
If you believe that, you probably also believe in fairies, and that derivatives make our financial systems more stable. As sure as night follows day, in times of economic downturn the finance department brings out its trusty red pen and seeks out advertising budgets, travel allowances, training and information technology projects to carve up. At present there is a delayed reaction, as IT projects lumber on like zombies, unaware of the carnage waiting ahead. Perhaps our particular sector or project will be unaffected? Perhaps, but few will escape unscathed.
There will shortly be a lot more tricks than treats out there on offer for the enterprise software salesman as he makes his calls this winter.
I noticed an interesting blog post by Andrew Brooks about a possible side effect of the credit crunch: an increasing interest in MDM and data quality among recruiters. This may sound paradoxical, but it makes sense. Companies struggle to get good management information (for example about levels of counterparty risk for trading organisations such as investment banks) due to inconsistent master data across multiple systems. When times are booming this may be glossed over, but with prestigious companies going to the wall on a daily basis, being certain of the information that you rely on gets a higher priority. The blog resonated with me since a couple of recruitment agencies have called me in the last few days asking what this “MDM thing” is all about, in response to recent client inquiries.
Every major company I talk to struggles with getting reliable enterprise-wide data, and in every project I have been involved with, the data quality in corporate systems is worse than people think it is. If the current tough financial conditions prompt a new focus on fixing these issues, then perhaps there will be a modest silver lining to the credit crunch cloud.
There is an encouraging increase in MDM education, reflecting the growing interest in the subject. As well as Aaron Zornes’ pioneering work at the MDM Institute, we had a dedicated TDWI MDM conference in Savannah earlier in 2008. Now TDWI is running a dedicated MDM track at its next week-long conference, in New Orleans in early November. Just in case you didn’t have enough reason to attend, I will be running a half-day workshop as part of the conference.
For more information or to download a complete copy of the brochure, visit: www.tdwi.org/neworleans2008
Register before October 4 and receive an early registration discount if you use this code when booking.
Priority Code: IN34
By early November hurricane season is nearing its end, and you can take advantage of New Orleans’ famous easy-going atmosphere and fine music and cuisine (at least after you have finished the workshop sessions). You may even learn something.
Hope to see you there!
Increasingly, when I talk to customers about their plans for master data management or data quality, the area of data governance comes up. Data governance is the set of business processes and controls that surround the lifecycle of data, rather than necessarily involving technology. Some MDM vendors provide sophisticated support for data governance, others less so, but I believe it is key to the success of a data initiative. Without thinking about the processes, there is limited use in just putting in a piece of technology, or carrying out a one-off data cleansing exercise without considering what happens next.
Indeed, in talking recently with data quality vendors, a number have commented on how the increasing profile of data governance has enabled them, in turn, to raise their own profile. Previously data quality was seen by many companies as strictly something for IT, or for those direct mail geeks in the marketing department, but now data quality is getting an airing as part of broader initiatives to improve data across the board.
In order to put more flesh on the data governance bone, The Information Difference is conducting a major piece of primary market research into data governance. You can be part of this by taking the survey. Those who complete the survey will receive a full copy of the research, and there is even a prize draw with exciting and valuable prizes (this view may depend a little on your taste) to tempt you.
In a survey of telecoms companies by Yankee Group, it would appear that master data management has made it to the second-highest investment priority amongst the wireless, wireline, cable, and satellite companies. This is interesting since it appears to confirm the generally very bullish predictions for MDM as a market, which I had wondered might be affected by the general economic climate. After all, MDM is not a trivial matter on an enterprise scale, so it could be tempting to defer an MDM project and put it into the “too difficult” pile for a year or two to see how the economy recovered.
Later in the year the Information Difference will conduct some research amongst MDM vendors to see how the predictions from analyst firms such as Forrester and Yankee are stacking up against the reality of a tricky economy.
Microsoft generally likes to acquire software companies when they are quite small, with a dozen or two employees. In this way they can assimilate the development staff into Redmond and into the Microsoft way of doing things. The latest example came last week, when they decided to acquire a data quality vendor. There are literally dozens of data quality vendors out there, most fairly small, so there was plenty of choice. They opted for Zoomix, a small Israeli company which I first encountered in 2006, though it was founded in 1999. Zoomix had some quite clever marketing, claiming “self learning” technology as a way of making data profiling in particular more productive. In this way it could be compared to Exeros, although the technology underpinnings are quite different.
In this case the R&D team will move into Microsoft’s existing technology centre in Israel. This is a logical move by Microsoft, who acquired Stratature in order to give them an MDM capability. That product is currently being retooled under the code-name Bulldog, and a data quality offering to complement it is a natural fit. The timing of Bulldog’s release is unclear at this point, as it is folded into the SQL Server release timeframe.
At the Information Difference we continue to add new research on the MDM vendors. One of the things that is useful to know when drawing up a shortlist is which platforms the various vendors support, e.g. which database and which web server, along with more minor but useful technical details such as whether they have double-byte character support, whether they offer 24x7 helpline support, etc.
Until now there hasn’t been a place where this kind of information is gathered together, but there is one now.
This also has information on the level of SOA support (if any), which non-English languages are supported in the user interface etc.
There are some good points in the DM Review article on MDM by William McKnight. In particular, I like that he clearly highlights some issues that parts of our industry are still in denial about:
(a) that customer and product data are just a small, though important, part of the master data picture for an enterprise.
(b) in reality, most companies will end up with more than one approach to MDM “style” or architecture, due to the sheer complexity of the enterprise application environments out there. I feel that too often IT architects spend time agonising over the most “elegant” approach when in fact the tools are still evolving, and will no doubt go through more than one iteration of architecture over the next decade. Given the savings to be made through improved master data management, waiting for a perfect solution may be a costly missed opportunity.
(c) data quality is a key part of an MDM project. Indeed, as my colleague Dave Waddington said the other day: “if you don’t have any data governance processes, what exactly is the point of doing a data quality project?”. One-off data quality initiatives may have value, but without the processes to fix the disease rather than treat the symptoms, the patient is never going to get truly better.
Although it might seem obvious that “The best place to manage master data is in the operational environment” as it says in the article, I am not sure that in practice this is always right. Some companies with a very centralised approach may have the discipline to really drive home a single master data system across the enterprise (I was speaking to one just the other day) but many do not. For de-centralised companies it may well make sense to consider alternative architectures, as it may simply be impractical or go against the grain of the corporate culture to have a central, harmonised set of operational master data.
Other than that caveat, it seemed like generally sage advice.
Just to let you know that the Information Difference has released its first piece of primary market research, as reported by IT Pro. There are some intriguing snippets in the survey results, as well as some rather more expected results, or the “well, duh” results as Homer Simpson might say.
13% of the (mostly larger) 112 companies surveyed, around 15 of them, had over 100 systems that hold and maintain customer data, which gives some idea of the scale of the problem that MDM is tackling. It is a bit more than “just put in a hub” when you have systems at this level of complexity.
Given the generally flaky level of data quality reported, I found it surprising that nearly a third of the companies in the survey had not purchased an automated data quality tool.
The good thing was that plenty of companies seem to have begun measuring the costs of poor master data, and those costs are high, which should make it easier to justify master data management initiatives.