Orchestrating MDM Workflow

France is rarely associated with enterprise software innovation (test: name a French software company other than Business Objects) but in MDM there are two interesting vendors. I have already written about Amalto, but the more established French MDM player is Orchestra Networks. Founded in 2000, this company has been selling its wares in the French market since 2003, and has built up some solid customer references, mainly in the financial services arena but also with global names such as Sanofi Aventis and Kraft.

The great strength of their EBX technology is its elaborate support for complex business process workflow, an area neglected by most MDM vendors. For example, a customer may have an international product code hierarchy and distribute it to several regions. Each regional branch may make local amendments to it, so what happens when a new version of the international hierarchy is produced? EBX can detect differences between versions or branches and merge them, supporting both draft “project” master data and production versions, tracking all changes, and providing the workflow rules needed for the full life-cycle of master data creation and update.
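To make the branching idea concrete, here is a minimal sketch, in Python, of what detecting differences between an international hierarchy and a regional branch might look like. The product codes and the simple code-to-parent model are invented for illustration; EBX's actual data model and merge functionality are far richer.

```python
# Hypothetical sketch of version-diffing for a product hierarchy.
# Each hierarchy is modelled as a mapping of code -> parent code
# (None marks a root). This model is an assumption for illustration.

def diff_hierarchy(international, regional):
    """Classify each code as added, removed, or moved relative to
    the international version of the hierarchy."""
    added = {c: p for c, p in regional.items() if c not in international}
    removed = {c: p for c, p in international.items() if c not in regional}
    moved = {c: (international[c], regional[c])
             for c in international.keys() & regional.keys()
             if international[c] != regional[c]}
    return added, removed, moved

# The new international version of the hierarchy...
v2 = {"BEVERAGES": None, "SOFT": "BEVERAGES", "COLA": "SOFT"}
# ...and a regional branch carrying a local amendment.
region = {"BEVERAGES": None, "SOFT": "BEVERAGES",
          "COLA": "SOFT", "LOCAL-BRAND": "COLA"}

added, removed, moved = diff_hierarchy(v2, region)
print(added)  # → {'LOCAL-BRAND': 'COLA'}
```

A real product must of course also resolve conflicts when the same code moves in both branches, which is where workflow rules and human approval steps come in.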

Typically such functionality is delivered only by PIM vendors (Kalido is an exception), yet EBX is fully multi-domain by design, so it is not restricted to any one class of master data. This gives it an advantage in competitive situations against vendors who have historically designed their technology around one type of master data (customer or product) and are only now realising the need to support multiple domains.

So far Orchestra Networks has confined itself to France, but will soon open its first overseas office, in London. The company has taken the time to build out its technology to a solid level of maturity, and has productive partnerships with Informatica (for data quality and ETL) and Software AG, who OEM EBX and sell it globally at the heart of their own MDM offering.

In my own experience of MDM projects, the handling of the business processes around creating and updating master data is a key issue, yet most hub vendors have virtually ignored it, assuming it is somehow “out of scope”. Hub vendors typically focus on system-to-system communication, e.g. validating a new customer code by checking a repository and perhaps suggesting possible matches if a similar name is found. This is technically demanding, as it happens in near real time. However, human-to-system interaction is also important, especially outside the customer domain, where business processes can be much more complex. By providing sophisticated support for this workflow, Orchestra Networks can venture into situations where CDI vendors cannot easily go, and as I have written previously, there are plenty of real business problems in MDM beyond customer data.
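The repository check described above can be sketched in a few lines using Python's standard-library difflib. The repository contents and the similarity cutoff are invented for illustration; real hubs use far more sophisticated matching engines.

```python
# Minimal sketch of a hub-style check: before creating a new customer
# record, look for existing names that closely resemble the candidate,
# so a likely duplicate can be flagged.
import difflib

# Hypothetical existing customer repository.
repository = ["Acme Corporation", "Globex Industries", "Initech Ltd"]

def suggest_matches(candidate, known, cutoff=0.6):
    """Return up to three existing names similar to the candidate.
    The 0.6 cutoff is an arbitrary illustrative choice."""
    return difflib.get_close_matches(candidate, known, n=3, cutoff=cutoff)

print(suggest_matches("Acme Corp.", repository))  # → ['Acme Corporation']
```

In a production hub this check would run in near real time against millions of records, which is precisely why it is technically demanding.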

It will be interesting to see how Orchestra Networks fares as it ventures outside of France in 2008.

Labours of Hercules

If you want to understand the key issues around an MDM project then I would encourage you to read an excellent case study that has just been published in CIO magazine. It discusses a major project to sort out the master data at Nationwide Insurance (one of the largest US insurers), and illustrates the kind of organisational and business decisions that need to be made in order to succeed with this type of project. It is an unusually detailed write-up of how a company with fourteen general ledgers, seventeen finance data warehouses alone and 300,000 spreadsheets took a root-and-branch approach to radically improving the state of its master data, and by and large seems to have succeeded.

It is also daunting reading in one way, as it shows the level of business commitment that is required to sustain a project of this scale. Anyone thinking that a purchase of an MDM tool and a small project team with a million dollars or so of budget will do the job needs to read this case study. I have seen a few serious MDM projects in my time but this was certainly one of the more ambitious ones.

Finally, a very happy Christmas to you, my readers. Have a lovely holiday.

The Gaul of it

I came across an interesting new MDM vendor recently called Amalto, a start-up from Paris (though they already have a California office). They have been selling their software for less than a year, yet already have a good set of early customers, such as Rio Tinto, Total, SNCF and BNP Paribas. Their Xtentis product offers a generic MDM repository with data movement (EAI-like) functionality, and they make heavy use of standards (Eclipse, Ajax, etc.). Unusually, they use an XML database rather than a relational database as their underlying storage mechanism. Given the relatively low data volumes typical in MDM applications, this approach seems interesting, since XML databases are strong at handling data with complex structures (e.g. variable-depth hierarchies) that one often encounters in master data. In case you think XML databases are unproven, Berkeley DB is probably the most widely deployed DBMS in the world, being embedded in many mobile phones, for example, and most phone users don’t have deep DBA skills. On a parochial note, it is nice to see a European software company emerging for a change (another MDM vendor is Orchestra Networks, also French).
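The variable-depth point is worth illustrating. In an XML document, two product categories nested to different depths sit naturally side by side, with no need for the fixed-level schema a relational design often imposes. The sketch below uses Python's standard-library XML parser; the tag names and data are invented, and this is not Xtentis' actual schema.

```python
# Sketch: a variable-depth category hierarchy stored as XML.
# Traversal works the same way regardless of nesting depth.
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<catalogue>
  <category code="FOOD">
    <category code="DAIRY">
      <category code="CHEESE"/>
    </category>
  </category>
  <category code="SERVICES"/>
</catalogue>
""")

# Collect every category code, however deeply nested, in document order.
codes = [c.get("code") for c in doc.iter("category")]
print(codes)  # → ['FOOD', 'DAIRY', 'CHEESE', 'SERVICES']
```

A relational equivalent would typically need either a self-referencing table with recursive queries or a fixed number of level columns, which is clumsy when branches vary in depth.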

Though an early stage company, Amalto is making good progress in the French market and in 2008 will start to expand to the USA. If they can firm up their positioning (confusingly, they also have a product for B2B exchanges, a quite different market, resold by Ariba) and develop good systems integration partnerships in the US then they should be an interesting addition to the MDM space. Their technology is innovative and their early customer stories sound promising.

Posing questions

The recent spate of acquisitions in the BI world (Cognos by IBM, Business Objects by SAP) might cause you to assume that the area was becoming mature (for which read: nothing much new to do). However there is still innovation going on. A company called Tableau, formed mainly by some ex-Stanford University people (including one who was an early employee at Pixar and who has two Oscars to his name!) has neatly combined BI software with clever use of visualisation technology. I have written before how visualisation has struggled to break out of a small niche, though there are certainly some clever technologies out there (e.g. Fractal Edge). One thing that Tableau has done well is to make a very well thought out demo of their software. Product demos are often dull affairs, but this one is very engaging (if a little frenetic), with some real thought put into the underlying data in order to show off the tool to good effect.

I still firmly believe that only a limited proportion of end users actually need a sophisticated analysis tool of any kind. In my experience of BI projects, end users generally find the leading BI tools a lot less intuitive than the vendors would like to think they are, often resorting to Excel once they have found the data they need. The type of technology that Tableau is developing provides an interesting alternative to the established players and has the potential to engage a certain subset of users more. I will follow their progress with interest.

Never mind the quality, feel the width

Frank Buytendijk (ex-Gartner analyst, now with Oracle) makes an important point about data quality on his blog: it is inherently dull. This in itself causes problems, both for people within organisations who care about data quality (there must be a few of you out there) and for data quality vendors, who struggle to sell their products at a decent price point in sufficient numbers. I have written about this before, pointing out a couple of real-life cases of poor data quality that I have personally encountered, each of which cost many millions of dollars.

The reason that data quality is generally excellent in the area of salary and expense processing is that people care deeply about what they get paid, and you can be pretty sure that any clerical errors get spotted and complained about very quickly. In most other cases, however, data quality problems occur because people are asked to enter or maintain data for which they see no personal, or even obvious company, benefit. Data that is useful to “some other department” is never going to receive the same care and attention that your own expense claims get.

As Frank says, in order to move data quality higher up the enterprise priority list, it needs to widen its perspective and move beyond talking about customer names and addresses. Yes, these are important if you are doing mailshots, and certainly poor customer name and address management can have more serious consequences, but most executives have better things to do than worry about whether their mailshots are being duplicated.

Despite numerous acquisitions over the years (First Logic, Similarity, Vality, …), there are still plenty of small data quality vendors out there, some with very interesting technology. Yet aside from Trillium, few have managed to reach even double figures of millions in revenue. This is not due to an absence of a real problem to address.

Some data quality vendors rightly see master data management as a way of repositioning their offerings in a more fashionable area, but they need to realise that data quality is just a feature of a complete MDM solution. Hence they need to partner with broader-based MDM repository vendors, who themselves often lack proper data quality technology, rather than pretending to be a complete solution themselves. They should also do a better job of highlighting quantified customer dollar benefits achieved from the use of data quality technology. This should not be hard to do, since data quality projects usually have excellent payback. Yet time after time the examples used in data quality collateral are the tired name-and-address cleanup, followed by an esoteric discussion about whether probabilistic or deterministic matching is better (paying customers don’t care; they are interested in the benefits they see). Far too few data quality case studies mention hard-dollar benefits to the customer.
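For readers who have never sat through one of those esoteric discussions, here is a toy sketch of the distinction. Deterministic matching applies exact rules; probabilistic matching scores similarity and applies a threshold. The records, the normalisation rule and the 0.85 threshold are all invented for illustration.

```python
# Toy contrast between deterministic and probabilistic record matching.
from difflib import SequenceMatcher

def deterministic_match(a, b):
    # Exact rule: normalised strings must be identical.
    return a.strip().lower() == b.strip().lower()

def probabilistic_match(a, b, threshold=0.85):
    # Score-based: accept when the similarity score clears a threshold.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

print(deterministic_match("J. Smith", "J Smith"))   # → False
print(probabilistic_match("J. Smith", "J Smith"))   # → True
```

The point stands, though: a paying customer cares about the duplicates eliminated and the dollars saved, not about which scoring mathematics found them.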

Data quality should have much going for it: it is a very real problem, the state of data quality in most large organisations is horrible (and far worse than generally realised), and the costs are significant, causing genuine and in some cases very serious operational problems. Yet the industry as a whole has done a poor job of explaining itself to the people with the cheque books in enterprises.