Frying pans and fires

For some time now I have been impressed with the analysis of Philip Howard of Bloor, and today he came out with another well-written article that is long overdue.

As regular readers of this blog know, I have been arguing for some time that the MDM industry has been digging a pit for itself by ending up with silos of CDI and PIM technology alongside more generic MDM offerings like Kalido. Phil eloquently points out the shortcomings of such an approach. By implementing solutions designed for only one particular type of master data (say, customer), companies are recreating the silos that they have today, but in a different form. At present the issue is that master data is locked up in multiple monolithic ERP systems and assorted other applications, and the point of MDM is to reconcile these multiple master data versions in a master data repository which can keep track of them and manage the process of updating and creating new master data, i.e. improving long-term master data quality. By setting up a CDI hub, then a PIM product, and then another repository for the next type of master data in vogue, we will move from having multiple ERP systems holding duplicated and inconsistent master data to multiple, incompatible MDM repositories doing the same. Indeed, in some ways the situation will be worse, since the ERP systems are not going away and will still hold (and probably create) their own master data in addition. Master data is not just customer and product: you also have to be concerned about assets, people, production facilities, brands etc. In one application at BP, Kalido manages 350 separate master data types.

It is key that a master data repository is designed from the ground up to handle multiple data types. Buying solutions from a large vendor will not help, as Phil points out in his article: Oracle have multiple solutions, SAP’s MDM product is really an acquired catalogue management product which in any case does not work properly right now, while with IBM you get the delight of buying several acquired technologies with different approaches and hiring IBM consultants to figure out the mess. My fear is that the menagerie of tools the industry is delivering will stunt the growth of MDM in general, as customers realise that they have leapt from the frying pan into the fire. Few commentators seem to acknowledge this issue, so well done to Phil for weighing in.

I’d also like to wish all my readers a very happy Christmas! Have a good break.

It was this big

Market sizing is a slippery thing. Just how big is the MDM market, for example? Well, it all depends on what you include and what you exclude, which is why answers like “it is $x” are not that useful in themselves. Does the figure include software only, or also services like associated consulting? Since for any IT project services can be several times the price of the software, this matters. Moreover, within MDM have they counted the pure-play MDM vendors only, or thrown in PIM and CDI solution providers? This is the kind of issue that the new press release for Arc Advisory Group’s estimate of MDM market size omits, rendering the quoted figure of USD 680M in 2006 somewhat meaningless on its own. Whatever the real figure is, they reckon it grew 30% in 2006 over 2005 and will essentially double by 2011. Remember that the term MDM barely existed before 2004.
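As a rough sanity check (my own arithmetic, not Arc's), doubling from the 2006 base by 2011 implies a compound growth rate well below the 30% Arc reports for 2006 itself, i.e. the forecast quietly assumes growth halving from its current rate:

```python
# Arc's figures: USD 680M in 2006, roughly doubling by 2011
base_2006 = 680_000_000  # USD

# Doubling over five years implies this compound annual growth rate (CAGR)
years = 5
implied_cagr = 2 ** (1 / years) - 1  # about 0.149, i.e. roughly 15% a year

# Contrast with the 30% year-on-year growth reported for 2005 -> 2006
reported_2006_growth = 0.30
```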

As a point of reference, IDC put the MDM market at USD 5 billion in 2005, expecting it to grow at 14% annually. There is a big difference between USD 5 billion and USD 680 million, which just shows how careful you need to be before taking these analyst figures at face value. The true size will remain a complete mystery until these kinds of press releases spell out what is included and what is not, and what methodology was used, which would at least allow informed debate.

My personal take is that the pure-play MDM software market is actually pretty small. Even throwing in CDI solutions like Siperian and DWL (now part of IBM) as well as pure-play MDM solutions like Kalido, it is hard to get a really big revenue figure if you aggregate the revenues of these vendors, e.g. DWL’s revenues were less than USD 20M when it was bought by IBM. Throw in some sales for Oracle and SAP, but not that much, since early in 2006 Gartner reckoned Oracle had maybe only 10 customers for its CDI solution. SAP similarly has a small number of customers for its troubled MDM offering. Hence it is hard to see where even the Arc figure comes from. IDC are usually pretty thorough when it comes to their numbers, but they must have included a lot of related things to get to their figure. Even chucking in the data quality vendors still won’t get the figure that high, since even the biggest data quality vendors have revenues of about USD 50M.

My perception is that the interest level in MDM is very high but the deployed dollars in software solutions for true MDM (i.e. ignoring data quality and assorted wannabes in this field) are as yet very small, maybe of the order of USD 100M in 2006. Anyone out there with any insights, feel free to chip in.

The tortoise and the hare

Business intelligence applications typically deal with data that is already stored, often in a depressing number of places, i.e. a number greater than 1. Much BI data is by its nature not real-time, e.g. monthly averages or trends. However, at the other end of the spectrum there are some applications that are truly real-time, and not just in the sense that a marketer uses the term in a brochure.

An interesting start-up in this area is StreamBase, which specialises in genuinely real-time applications such as trading systems, but also inventory monitoring and anti-fraud applications. StreamBase provides StreamSQL, which essentially extends SQL to a real-time environment, along with a run-time engine and a graphical developer environment that allows transformation logic to be written. For example, you might need to compare the current stock price of an equity to a competitor's, and take some action, e.g. "buy", if the price hits some threshold related to the competitor. Such applications would normally run entirely in memory, but StreamBase also provides in-memory hash tables and an embedded database in case you want to persist data (temporarily or permanently, respectively). To continue the trading example, you might want to take an action based on the stock price relative to its average over the last month, for which you would need to store some data temporarily in order to carry out the calculation.
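To make the sliding-window idea concrete, here is a minimal Python sketch of the trading example. It is purely illustrative and bears no relation to StreamBase's actual StreamSQL syntax or APIs; the window size and the 5% threshold are invented for the example:

```python
from collections import deque

class MovingAverageTrigger:
    """Fire a "buy" signal when the latest price falls below a
    discount of the sliding-window average. Illustrative only: a
    real StreamSQL query would express this declaratively over a
    stream rather than in application code."""

    def __init__(self, window_size, discount=0.95):
        self.window = deque(maxlen=window_size)  # recent prices only
        self.discount = discount                 # buy below 95% of average

    def on_tick(self, price):
        self.window.append(price)
        average = sum(self.window) / len(self.window)
        return "buy" if price < average * self.discount else None

# Feed in a small stream of ticks: only the final sharp drop triggers
trigger = MovingAverageTrigger(window_size=3)
signals = [trigger.on_tick(p) for p in [100, 102, 101, 80]]
```

Note that only the window of recent prices is retained, which is exactly the kind of temporary state the monthly-average example above requires.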

Set up in 2003 by database luminary Mike Stonebraker (who founded Ingres and Illustra) the company has now done two venture rounds, including a series B round led by premier league VC Accel. They have over 60 employees and 50 customers, though this includes pilot customers i.e. not all these are fully paying yet. Public customers include Goldman Sachs and Bridgewater, a leading hedge fund. The company is cagey about revenues but assures me that they are growing.

StreamBase's product roadmap includes easier integration of real-time and historical data (i.e. further StreamSQL enhancements), continued performance improvements and better ease of programming. Two OEM agreements are already in place.

Competition comes mostly from in-house coding, though there are some vertical point solutions (Progress Software in trading) and potentially some overlap with EAI tools. For example, StreamBase partners with Tibco (integrating with Tibco Rendezvous), but Tibco also has an event-processing offering that could potentially compete. From a marketing viewpoint the relationship to EAI and middleware tools will need to be carefully positioned as those tools themselves develop. However the company has certainly picked an attractive niche to operate in, and its high-quality VC backers and experienced management make it a credible player.

Santa comes early for HP

In a surprise move HP has snapped up Knightsbridge in a bid to bolster its technology services business. Knightsbridge had carved out a strong reputation for handling large data warehouse and BI projects for US corporations, and had grown to over USD 100M in revenue. It stood alongside IBM as one of the two leading data warehouse consulting organisations. This in itself makes it clear why it was attractive to HP, who do not have anything like such a strong reputation in this area. Knightsbridge was growing strongly in 2006, and while the financial terms of the deal are not public, one would assume HP paid a healthy price for such a strong business. This will no doubt provide a happy retirement for the Knightsbridge founders, but it is less clear how well the Knightsbridge culture, which was quite fiercely vendor-independent, will sit within a behemoth like HP, which has its own technology offerings. It was revealing that Knightsbridge CEO Rod Walker had dismissed service company acquisitions in an interview just a year ago, and for reasons which sounded pretty sensible. No doubt this will present an interesting spin challenge for the Knightsbridge PR staff, but perhaps they will have other things on their minds, such as dusting off resumes.

“If the cultures of the two companies are not a near-perfect match, people will leave, and services is a people business.” I couldn’t have put it better myself, Rod.

Data quality savings gone missing

One thing that continues to surprise me is how underdeveloped the business case for data quality and master data management is. When I watch data quality vendors speaking at conferences I can sit through whole sessions which never mention the amount of actual dollars their clients saved by using the technology. In the case of MDM there is some excuse for this, since MDM as a term only recently became mainstream, and so few vendors have real projects in production with clients. Indeed just 4% of companies have completed an MDM project, according to a recent survey by Ventana (though 37% claim to have initiated one). However in the highly related field of data quality there are no such excuses: tools have been around for years, and yet trying to find examples of well-justified projects with a hard dollar payback is like pulling teeth.

While data quality has remained something of a backwater (the largest data quality vendor does around USD 50M in revenue), it is surely one of the areas where it should be relatively easy to produce a cost-benefit case. After all, the tools will enable you to detect the proportion of bad data in a given application or enterprise, and it should not be beyond the wit of man to assign a cost to poor data quality. Even ignoring tricky things like customer satisfaction, poor data causes very real problems: deliveries going to the wrong places, misplaced inventory, incorrect payments, problems in manufacturing. In certain industries it can be worse: drilling an oil well in the wrong place is an expensive affair, for example. A 2003 AT Kearney study showed that USD 4 was saved for every dollar spent on data cleansing activity.
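A ratio like that makes the arithmetic of a business case trivial; the project figures below are invented purely to illustrate the shape of the calculation:

```python
# Hypothetical data cleansing project, using the 4:1 savings ratio above
project_cost = 500_000   # USD spent on cleansing (invented figure)
savings_ratio = 4        # USD saved per USD spent (AT Kearney, 2003)

savings = project_cost * savings_ratio   # gross benefit
net_benefit = savings - project_cost
roi = net_benefit / project_cost         # return on investment
```

A half-million dollar project returning three times its cost in net savings is exactly the kind of number that should be appearing in vendor conference sessions.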

By going back over completed projects and carrying out cost/benefit analyses, the data quality (and MDM) vendors would be doing themselves a favour: by quantifying the savings these projects bring they can not only make it easier to justify new projects, but also begin to justify the price of their products. Indeed they may be able to command improved pricing if they can demonstrate that their products bring sufficient value to customers. It is a mystery to me why vendors have made such a poor show of doing so.


Truth and myth

Malcolm Chisholm has penned a thoughtful article which argues that there essentially will never be a “single version of the truth” in an organisation of any size. As he rightly points out, beyond a single related group of users, e.g. in accounts or marketing, it is very difficult indeed to come up with a definition of a business term that is unambiguous and yet also satisfies everyone. Which costs actually are counted in “gross margin”? Is a “customer” someone who has signed a contract, has been shipped goods, been invoiced or has paid? These questions become vastly more difficult when considering a global enterprise operating in many countries. If it is hard to get production, marketing and finance to agree on a definition within the same office, what are your chances of getting agreement between 50 countries? A TDWI survey some time ago showed how far away companies are from fixing this, and that survey was of US companies rather than multinational ones.
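The definitional ambiguity is easy to demonstrate: run competing definitions of "customer" against the same records and count. The records and rules below are entirely made up for illustration:

```python
# Made-up order records; the same data supports several "customer" counts
records = [
    {"name": "Acme", "contract": True, "shipped": True,  "paid": True},
    {"name": "Beta", "contract": True, "shipped": True,  "paid": False},
    {"name": "Cogs", "contract": True, "shipped": False, "paid": False},
]

# Three competing definitions of "customer"
definitions = {
    "signed a contract": lambda r: r["contract"],
    "shipped goods":     lambda r: r["shipped"],
    "paid":              lambda r: r["paid"],
}

# Count how many "customers" exist under each definition
counts = {label: sum(1 for r in records if rule(r))
          for label, rule in definitions.items()}
```

Identical data, three different answers, and each department's answer is "correct" by its own definition; which is precisely why a single version of the truth is so elusive.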

This issue is at the heart of master data management, and is why MDM is a lot more than putting in a “customer hub”. Managing the inevitable diversity of business definitions, ensuring that they are synchronised between systems, dealing with changes to them and providing processes to improve the quality of master data is what an MDM project should address. A technical solution like an MDM repository or series of hubs is part of providing a solution, but only a part. Significant organisational resources and processes need to be constructed and applied to this issue. Even when these are in place, it is a journey rather than a destination: data quality will never be perfect, and there will always be business changes that throw up new challenges to maintaining high-quality, synchronised master data. However, the sooner this message gets through, the sooner organisations can really begin to improve their master data situation rather than just plugging in the latest technological silver bullet.



A step in the right direction

As regular readers of this blog are aware, I believe that one of the significant trends in business intelligence will be towards “on demand” offerings. This model avoids the hassle of installation at your own site (with all the complexities of combinations of operating system, database and middleware software versions that implies). Since it is usually accompanied by a rental pricing model, it is easier to try out from the customer’s viewpoint, whilst from the vendor’s viewpoint it often avoids triggering the dreaded procurement review that bogs down so many sales cycles. As well as new vendors entering the space, Business Objects has just made a tentative step in this direction by acquiring NSite, a small vendor essentially providing on-demand offerings complementary to those of Business Objects.

I think that this purchase is less about the application that NSite has than about the experience it has in offering software as a service, i.e. Business Objects is mostly buying a team of software engineers with relevant experience. This is probably a good idea, since for all its sales and marketing clout, R&D has been a consistent weak spot for Business Objects over the years. Judging from the press release it looks like Steve Lucas is behind the acquisition; Steve is a smart guy and understands the need for Business Objects to improve its offerings.