Flexibility need not imply anarchy

An article by Rick Sherman ponders how data marts are finally being supplanted by enterprise data warehouses, at least according to a new TDWI survey. Yet he muddies the waters by saying that it is quicker to produce an EDW with no data marts than one with data marts, suggesting that sometimes, maybe, standalone data marts are still the way to go.

Certainly a central enterprise warehouse without data marts, or a decent way of producing them, will inevitably do nothing to reduce the plethora of departmental data marts: people will want data that is relevant to them, and they will get it one way or another if the IT department has too big a backlog. But surely this misses the point. Isolated data marts are a major problem – they can never provide the “single view” which many executives need across divisional or departmental boundaries, since without something central there are simply too many combinations to ever allow consistency. However, it should not be an either/or situation. A modern enterprise data warehouse should be perfectly capable of producing useful data marts on an as-needed basis. The data warehouse cannot afford to sit in glorious isolation, or it will fall into disuse and you will come full circle, with numerous disconnected data marts arising to get around its failings.

An important point not made in the article is that in really large organizations you may not be able to get away with just one warehouse. A company with many country operations may want a summary global warehouse as well as one per country, each with its own dependent data marts. The “local” data warehouses can feed summary data up to the global one; indeed, for truly huge companies this may be the only practical solution. Examples of such an architecture can be found at Shell, BP, Unilever and others. A major advantage of such a federated approach is that political control of the local warehouse remains within the subsidiaries, which avoids the “them v us” friction that arises when local operating units resist something imposed on them by central office. In this scenario the local subsidiary gets a warehouse that suits its needs, and the central office gets its summary information as a side-effect. Such an approach is technically more complicated, but if you use a technology capable of being federated (which these are) then the overhead is not great, and you retain the key advantage of global versus local flexibility.
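A toy sketch may make the federation pattern concrete – each local warehouse rolls its own detail up and pushes only a summary to the global warehouse. All names and structures below are purely illustrative, not any particular product's API:

```python
from collections import defaultdict

def summarize(local_rows):
    """Roll detailed local sales rows up to one total per product."""
    totals = defaultdict(float)
    for row in local_rows:
        totals[row["product"]] += row["amount"]
    return dict(totals)

def feed_global(global_wh, country, local_rows):
    """The local warehouse pushes its summary up; the detail stays in-country."""
    global_wh[country] = summarize(local_rows)

# Two country warehouses feed the global one
global_wh = {}
feed_global(global_wh, "UK", [{"product": "lubricant", "amount": 100.0},
                              {"product": "lubricant", "amount": 50.0}])
feed_global(global_wh, "NL", [{"product": "lubricant", "amount": 75.0}])

# Head office sees group totals without ever holding country-level detail
group_total = sum(c.get("lubricant", 0.0) for c in global_wh.values())
print(group_total)  # 225.0
```

The point of the design shows up in the last line: the global warehouse can answer cross-country questions while each subsidiary keeps control of (and responsibility for) its own detail.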

A Christmas Message

I must have nodded off in the armchair after a glass or two of seasonal cheer when I was confronted by an apparition – a ghost of data warehouses past. He was not a pretty sight – dating back to the mid 1990s and with a grumpy attitude. In those days all you had was a database and a compiler, and a customer who wanted to somehow bring together data from multiple, incompatible systems. There were no ETL tools, no data warehouse design books and little in the way of viewing the data once you had managed to wrestle it into the new data warehouse – only reporting tools that just about saved you from hand-coding SQL but confronted users with the raw names of the tables and columns, usually restricted to eight characters. Those were the days: a salutary reminder of how primitive things were.

Following this was a ghost of data warehouses present. This was a much chirpier-looking fellow, who could gain access to data from even devious and recalcitrant source systems via ETL tools like Ascential, and had at least some idea of how to design the warehouse. These designs had suitably festive names: “snowflake schema”, “star schema”. If we had a reindeer schema it would have completed the festive scene. This wraith had a sackful of reporting tools to access the data, count it, mine it and graph it. Yet not all was well – the pale figure seemed troubled. Every time the source systems changed, the warehouse design was impacted, and teams of DBA elves had to scurry around building tables, sometimes even dropping them. Moreover the pretty reporting tools were completely unable to deal with history: when reports were needed from past seasons, there was no recollection of the master data as it stood at the time; everything reflected only the current structures. The only ones really happy here were the DBA elves, who busied themselves constructing ever more tables and indices – they love activity, and they were content.

Last came a ghost of data warehouses future. In this idyllic world, changes in the source systems have no effect on the warehouse schema at all. The warehouse just absorbs the change like a sponge, and can produce reports reflecting any view of structure: present, past or even future. There are fewer DBA elves around, but they have discovered new things to do with their free time – with SAP now running to some 32,000 tables, there is no unemployment in DBA elf land. Best of all, the customers are happy, as they can finally make sense of the data as quickly as their business changes. New master data can be propagated throughout their enterprise like so much fairy dust. I have seen the future of data warehousing, and it is adaptive.

Well, enough of that. Time to talk turkey, or at least eat it. While doing so, it may be relaxing to look back at some of the turkeys that our industry has managed to produce over the years. Human nature being what it is, there is more than one web site devoted to the follies of the technology industry, and even to the worst software bugs. Have fun reading these. I recall being told by a venture capitalist that the dot-com boom of the late 1990s proved that “in a strong enough wind, even turkeys can fly”. These are some that did not.

I would like to wish readers of this blog a very happy Christmas.

Hype or Hope?

In an article today Peggy Bocks of EDS asks whether MDM is all hype. I think it is fascinating how some terms in technology catch on, while others wither away. In 2002 MDM was essentially an unknown term. I recall discussing with analysts our decision to call our new product offering “Kalido MDM” and being greeted with polite derision, since this was “not a market”. Customers seemed to recognize the problem, though, and I discovered that we were not alone when SAP launched their own SAP MDM offering soon afterwards. Though that product is now retired, when a vendor the size of SAP anoints a term you can be sure that the industry will take it more seriously than when a start-up uses it. Three years on there is a much higher level of noise around MDM, with around 60 vendors now claiming to have some sort of MDM offering, at least in PowerPoint form. Further validation has come from Oracle (with its Data Hub), Hyperion (who bought Razza) and IBM, who have bought several technologies related to MDM. IDC reckon the market for MDM will be worth USD 9.7 billion within five years.

However, I do have some sympathy with Ms Bocks’ point – a lot of MDM at present is froth and discussion rather than concrete projects. Certainly Kalido has some very real MDM deployments, Razza has some, Oracle presumably does, and SAP managed about 20 deployments before giving up and starting again, buying A2i and retiring its existing offering. Still, outside the tighter niches of product information management and customer data integration (arguably subsets of the broad MDM market) this hardly constitutes a landslide of software deployments.

A skeptic would argue that the industry has got these things wrong before, getting all excited over technology that fizzles out. Remember how “e-procurement” was going to take over the world? Ask the shareholders of the defunct Commerce One about that trend. I recall IDC projecting a vast market for object databases in a report published in 1992; at the time I wrote a paper at Shell arguing that the ODBMS market would likely never get even close to a billion dollars, and indeed it never did. However object databases always struck me as a solution in search of a problem, whereas the issues around managing master data are very real, and very expensive, for large companies, and they are not well addressed today. There are major costs associated with inconsistent master data, e.g. deliveries going to the wrong place, duplicate stock being held, etc. Shell Lubricants thought they had 20,000 unique pack-product combinations when in fact they had around 5,000, meaning major savings to be had by eliminating duplication in marketing, packaging and manufacturing, for example.

Because it addresses a real business problem, with the potential for significant hard business savings, I believe that the MDM market will in fact catch light and grow, but there will inevitably be a confusing period while analysts get their heads around the new market and start to segment it, and customers begin to understand the various stages they need to go through in order to run an effective master data project.

Go east young man

Those who think the rise of the Chinese economy can be safely ignored for a few more years, or is restricted to cheap toys, sneakers and steel, should think again. In 2005 the world’s top exporter of high tech products was not the United States, but China.

This was quite a landmark event, and something that technology companies need to consider carefully. China has become a major force in manufacturing, but is also starting to move into off-shoring. At this stage it lags India by a long way, but 1.2 billion people constitute an awful lot of potential programmers and engineers. At present India has the huge advantage that English is the most common second language, meaning that call centers and programmers can more easily pick up US software skills and communicate with western companies. Also there are far more established technology players in India, but this advantage will not continue forever. Napoleon once said “Let China sleep, for when it awakes it will shake the world” – after 9% economic growth rates for 20 years, it looks like the alarm clock has gone off.

Bangalore and BI don’t mix

An article by Nakis Papadopoulous asserts that offshoring does not work for business intelligence applications, though he spends little time discussing why. Off-shoring has clearly moved beyond the pioneering days. My first experience with it was a project when I was at Exxon back in the 1980s, which in itself tells you that ideas often take a long time to really catch on. With the likes of TCS and Wipro now huge companies in their own right, off-shoring has become fairly mainstream, so it ought to be possible to draw out some lessons.

While perhaps any project can be off-shored, there is a spectrum of risk that you need to consider. The big Indian companies have developed a high level of process quality, many of them certified to CMM Level 5, which is more than can be said for most western IT outfits (about three quarters of CMM Level 5 IT companies are in India), so there is a major focus on quality. However, in order to really feel the benefits of this you are going to need a stable, high-quality specification; the CMM process is big on repeatable, documented processes. On a conventional project, questions about the specification can be dealt with by someone wandering across a corridor, but this is more difficult many time-zones and thousands of miles away. Hence the more stable and well understood the requirements, the better the chance of success. A perfect example would be an interface program between two systems. This can (and must) be tightly defined, so there is minimal ambiguity. As IT systems (unlike human beings) always behave consistently, there is a high degree of stability where each “user” is an IT system. This type of application would be a low-risk one to outsource.

Building transactional systems should be only moderately risky, provided the requirements are well documented, which in many cases they will be. Similarly, testing requires a series of well-documented cases and expected responses, so again can work well in an off-shore environment.

Even interfaces are not guaranteed successes, though. We had a recent project at a very large company where the extract-transform-load work was off-shored. In principle this should be relatively low risk, but it turned out badly because, among other reasons, some of the source systems had complex business rules that required a lot of explanation, and because the target system was a data warehouse, where data needs change a great deal as users refine their understanding of what they can do. Much of this work was brought back on-shore and the project is now live, but only after a scary phase.

At the high-risk end, building user-interfaces where a lot of prototyping is needed would clearly be awkward many time-zones away, as would systems where the requirements are rather loose. This is frequently the case in business intelligence, where the users often are only partly aware of what they can get out of a system until they have seen the actual data. Here it is rare to see bullet-proof functional specification documents, so an off-shore team is at an inherent disadvantage compared to one camped out next to the business users. This is above all the reason why business intelligence projects are some of the least suitable to be off-shored.

I’d be interested to hear of any experiences, good or bad, out there with respect to off-shoring. Does your own experience match the above, or not?

One size does not fit all

An article in Computer Weekly today by Martin Fahy contains some excellent insights into why “single instance ERP” is mostly a fantasy. An academic, he has conducted a survey of CFOs, and his research has unearthed some interesting findings. Firstly:

“CFOs and senior finance executives prefer technology architectures that are robust across a wider range of organizational circumstances”

Spot on! One of the limitations of ERP systems is that they impose a rigid business model – indeed this was one of the selling points: “sweep away all that inefficient duplication and standardize around a single set of processes”. The trouble with this is that businesses do have unique requirements in particular territories or markets, and are at different stages of development. What may make sense in a mature market such as Germany may not do so in a fast-growing developing market like China. The wider the scope of a single ERP implementation, the longer it takes to implement and the more functionality the project has to carry. Changing it becomes ever more difficult given its scale, making the business less responsive to changing circumstances, since its business processes are essentially frozen.

“CFOs in particular are skeptical about making further investments in second wave ERP until payoffs from the first wave of implementations is realized”

This is a woefully under-discussed topic. Lots of people made money in the Y2K-fuelled ERP boom of the 1990s: certainly vendors like SAP and Oracle, and definitely the consulting firms that implemented these systems. Yet how often have these systems actually delivered the returns they promised? Since a pitiful 5% of firms carry out regular post-implementation reviews (according to Aberdeen), few people really know, but my own experience at two multi-nationals suggests this is not a topic that executives want to highlight. At one company I am familiar with, an ERP consolidation project is underway that is estimated to cost a billion dollars and take eight years. Given the track record of projects that size, it is hard to be optimistic that nothing will change in that timeframe to affect the project.

“users tend to find ERP functionality cumbersome, complicated and not in keeping with their established work patterns”

Indeed. I have written elsewhere about this.

In my view, CIOs spend too much time worrying about “simplifying” core infrastructure, where benefits are at best difficult to pin down, and not enough on truly value-added initiatives that business people can relate to, such as ones that enable better customer understanding.

The study concludes that “For the foreseeable future single-instance ERP will be a popular rhetoric, but a scarce reality”. Given the real and very high costs of getting there (Nestle, the poster child of single-instance ERP, reportedly spent USD 3 billion on this) and the ever-elusive payback, one wonders how any of these initiatives get signed off at all.

MDM Business Benefits

There were some interesting results in a survey of 150 big-company respondents conducted by Ventana Research as to where customers saw the main benefits of master data management (MDM). The most important areas were:

  • better accuracy of reporting and business intelligence 59%
  • improvement of operational efficiency 27%
  • cost reduction of existing IT investments 8%

It is encouraging that respondents place such a heavy emphasis on business issues compared to IT, since quite apart from this ringing true (MDM can reduce customer delivery errors, billing problems etc.), they will have a much better chance of justifying an MDM project if the benefit case is related to business improvement rather than the old chestnut of reduced IT costs – which so rarely appear in reality; surely IT departments would have shrunk to nothing by now if all the projects promising “reduced IT costs” over the years had actually delivered. A nice example of how to justify an MDM project can be found in a separate article today, in this case specifically about better customer information.

The survey also reflects my experience of the evolution of MDM initiatives, which tend to start in a “discovery” phase where a company takes stock of all its master data and begins to fix inconsistencies; this initially impacts analytic and reporting applications. Later, companies begin to address the automation of the workflow around updating master data, and finally reach the stage of connecting this workflow up to middleware which physically updates the operational systems from a master data repository. This last phase is where many of the operational efficiency benefits kick in, and these may be very substantial indeed.

Based on the rapidly increasing level of interest in MDM, in 2006 I expect to see a lot of the current exploratory conversations turning into more concrete projects, each of which will need a good business case. At present MDM projects tend to be done by pioneering companies, so it will be very interesting to see if the various projections prove accurate and MDM starts to become more mainstream.

Visuals for the few

There is a thoughtful article today by Stephen Few in the BI Network journal. In it he discusses the ways in which data can be presented to people, and gives a nice example of how a flashy graphic can be harder to interpret than a simple bar chart. There are some interesting follow-ups to this line of reasoning. Firstly, the vast majority of users of BI software have quite simple data display needs: they probably want to see a trend in data, e.g. “are sales going up or down?”, or answer a simple question like “what is the most profitable distributor?”. Since the vast majority of BI tool users have no background in data analysis or statistics, it is at best pointless and possibly self-defeating to provide them with much in the way of statistical tools or elaborate graphical display capabilities. If they don’t understand statistical significance, for example, then they may reach invalid conclusions by playing around with statistical tools.

This may explain why vendors who specialize in advanced data visualization never seem to reach any great size. There are some very interesting technologies in this area, e.g. Fractal Edge (a technology that seems to me genuinely innovative), AVS, OpenDX and the tackily named The Brain. More established vendors would include SAS, who were one of the first to really go in for sophisticated graphics and statistical tools, yet built up their considerable success mainly in other areas (e.g. they were about the only software that could make sense of an IBM mainframe dump, so became a standard in data centers; they have since diversified greatly). So, two decades after the graphical user interface became ubiquitous, why are there no billion-dollar data visualization companies?

I think it is simply that there are not enough people out there whose jobs demand advanced data analysis tools. I have argued elsewhere that the vast majority of business users have no need whatever of ad hoc analytical BI tools. One can debate the exact proportion of business users who need something more than a report. I reckon perhaps 5% based on my experience on data warehouse projects, while I have seen an estimate of 15% from Forrester, so let’s say 10% isn’t far off. Then within this population, who want to at least see some analysis of data, what proportion are serious data analysts and what proportion would find a bar chart more than adequate? I don’t have any hard data here, but let’s go for 10% again as a guess. In such a case then only 1% of potential BI users actually need a sophisticated data visualization or statistical toolset. Indeed this figure may not be so far off, since in order to make serious use of such tools, some background in statistics is probably important, and relatively few people have this.

This would mean that, in a large organization of 10,000 people, there is actually only a market of 100 people for advanced data visualization or statistical tools (data mining tools being one example). Given that it is rare to be able to charge more than a few hundred dollars for a PC tool, then even at a thousand dollars apiece our mythical company would only be worth a maximum of USD 100k to a vendor – and that assumes the vendor could track down every one of these advanced data users, and that every one of them buys the software, an unlikely situation.
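As a sanity check, the back-of-envelope sizing above can be written out; the percentages are, as the text says, rough guesses rather than hard data:

```python
# Rough sizing of the advanced-visualization market inside one large firm,
# using the (admittedly guessed) percentages from the text above.
employees = 10_000
needs_more_than_reports = 0.10   # between my ~5% estimate and Forrester's ~15%
serious_analysts = 0.10          # guess: 1 in 10 of those are serious analysts
price_per_seat_usd = 1_000       # generous for a PC tool

potential_buyers = int(employees * needs_more_than_reports * serious_analysts)
max_revenue_usd = potential_buyers * price_per_seat_usd

print(potential_buyers)   # 100
print(max_revenue_usd)    # 100000
```

Halving or doubling either percentage moves the ceiling between USD 25k and USD 400k per organization – still small change by enterprise software standards, which is the point.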

If this reasoning stacks up, then while there will continue to be fascinating niche technologies serving aspects of data visualization, we are unlikely ever to see a true mass market for such tools. A picture may tell a thousand words, but there may not be many people in businesses needing to actually produce such pictures.

A good strategy for data visualization vendors would be to avoid trying to be mass-market tools and instead find industry niches where clever graphics really do add a lot of value. For example, a tool which improved a stock market trader’s view of the market would presumably be worth a lot more than a thousand dollars; the same would be true of one for a geophysicist working for an oil company. A good example of such a targeted strategy can be seen in Spotfire, who have carved out a strong niche primarily in the life sciences industry, and seem to be thriving on it.

MDM gets the blues, or at least the blue

In case anyone has any doubt about the reality of the master data management (MDM) market, it is worth noting that IBM has now set up a significant business unit dedicated to MDM, with 1,000 staff, in its vast software group division. This follows a series of acquisitions (Ascential for data movement technology, Trigo for product management, DWL for customer information synchronisation, SRD for identity management).

So far this set of tools still has gaps, though. As I noted elsewhere, BP has 350 different types of master data being managed by KALIDO MDM, and customer and product are just two of those 350 categories. It would seem excessive to expect a customer to buy 348 further technologies once they have bought their CDI and PIM products, so it seems clear to me that a more generic approach to MDM is required than tackling each specific type of data with a different technology. Moreover IBM still lacks a technology to deal with the “analytic” part of MDM, something which can help manage the semantic integration of the various business models which large corporations have, and which contribute heavily to the diversity of master data. Buying piecemeal technologies that tackle specific data types, however clever they may be (and DWL and Trigo both had excellent reputations), is not going to solve the enterprise-wide problems that large companies face in managing their master data. Still, it seems to me that IBM, though its offering is incomplete, has a better grasp of the issues than Oracle or SAP. SAP has already ditched its first MDM offering, while its replacement MDME solution, based on technology acquired from A2i, has had poor initial feedback from early prospects: “even worse than the original SAP MDM” is one customer assessment, which cannot be encouraging to the German giant. Moreover IBM, with its deliberate abstinence from application software, has the advantage of not being perceived as quite as aggressive as SAP or Oracle. One CIO memorably described IBM as the “beige of the IT industry”, meaning that it was neutral and inoffensive compared to many others.

I see there being an evolution in most master data initiatives. The first stage is analysis of the problem: classifying the various business definitions that exist for master data in the enterprise; this goes well beyond customer and product – “asset”, “brand”, “person” and “location” are all important types of master data. The next stage is to document the existing processes for managing change within these categories (mostly manual, involving email) and the governance and authority levels involved, e.g. not everyone can authorize the creation of a new brand. This will then require either automation of this workflow, or some process redesign (probably both). Finally the new workflow will need to be linked up to some form of messaging infrastructure, e.g. EAI technology, so that changes to the master data can be physically propagated throughout the various operational systems in the corporation. At present there are various technologies around to tackle elements of the problem, but they are far from joined up.
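The middle stages – documented authority levels plus automated workflow – can be sketched in a few lines. Everything below (the type names, roles and the publish hook) is hypothetical, invented for illustration rather than drawn from any MDM product:

```python
from dataclasses import dataclass

# Hypothetical governance table: which role may authorize changes to which
# master data type (not everyone can create a new brand).
AUTHORITY = {
    "brand": {"brand_manager"},
    "customer": {"sales_ops"},
    "product": {"product_owner", "brand_manager"},
}

@dataclass
class ChangeRequest:
    data_type: str   # e.g. "brand", "customer", "product"
    payload: dict
    status: str = "proposed"

def approve(request, role):
    """Approve only if the role has authority over this master data type."""
    if role in AUTHORITY.get(request.data_type, set()):
        request.status = "approved"
        return True
    return False

def propagate(request, publish):
    """Hand an approved change to a messaging layer (stand-in for EAI)."""
    if request.status != "approved":
        raise ValueError("only approved changes may be propagated")
    publish(request.data_type, request.payload)
    request.status = "propagated"

# A sales-ops user may not create a brand; a brand manager may
req = ChangeRequest("brand", {"name": "NewBrand"})
assert not approve(req, "sales_ops")
assert approve(req, "brand_manager")
sent = []
propagate(req, lambda data_type, payload: sent.append((data_type, payload)))
print(req.status)  # propagated
```

The real work, of course, lies in agreeing the governance table across the enterprise and wiring `publish` to actual middleware – which is precisely where the current technologies are not joined up.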

The MDM market is in a nascent state, with people still coming to terms with the issues and trying to piece together where the technology offerings fit. The business problems which it addresses are very real in terms of operational efficiency, so there should be plenty of value there for companies that have compelling offerings. IBM has realized this earlier than most.

Oracle struggles to buy growth

It is interesting that Oracle’s buying binge of the last couple of years has not enlivened its share price. As pointed out in a recent Forbes article, Oracle’s price/earnings ratio (a simple measure of how strongly the market views a stock) is now at its lowest since 1990. This is despite Oracle’s superb operating margin of 31% (up there with Microsoft, and better than SAP’s also excellent 27%). The problem is clearly not profitability but perceived room for growth. Even its core database software license sales were flat in the last quarter. The database business is still very much the jewel in Oracle’s crown, contributing a disproportionate share of Oracle’s profits.

The strategic issues are that the database market is somewhat saturated, with Microsoft chipping away at Oracle’s market share with its ever more functional SQL Server, and IBM continuing to revamp DB2. Although insignificant now, the open source MySQL at the least creates pricing pressure, as does SQL Server. Oracle’s applications business has been thoroughly outclassed by SAP, and the PeopleSoft acquisition was important in injecting both extra market share and superior technology. Oracle’s grab of retail software vendor Retek from the clutches of SAP was an astute, albeit defensive, move. Over the years Oracle has meandered into a range of other technologies, either by development or acquisition, but rarely with much success, e.g. its MPP offerings, its lackluster business intelligence products, etc. This is hardly surprising, as large software companies usually struggle to diversify, especially the further they move from the area that originally made them successful.

Certainly the wielding of the cheque book has bought Oracle some market share, but it also brings a major technology challenge: integrating the various platforms of the companies it has acquired into its already sprawling suite of software. Oracle has long been known for its superb marketing and aggressive sales force, but its pushy tactics have alienated a lot of customers, which in the end must damage it. When household-name companies start to contemplate the massive task of moving from Oracle to SQL Server, not on technical grounds but because they feel commercially abused, it indicates a depth of animosity amongst customers which will eventually come home to roost.

Oracle did a fine job of winning the premier slot in the DBMS business, outpacing often superior technology (such as Ingres) through relentlessly effective marketing and sales execution. It remains to be seen whether the trail of upset customers it left along the way will continue to haunt it as it tries to bring back growth. Perhaps Larry Ellison should ask the Pythia, the priestess of Apollo at the original oracle of Delphi, for some guidance. Her predictions were usually cryptic, but further predictions could always be bought with more gold if the initial ones didn’t meet expectations. Sounds a bit like Oracle’s application strategy: if you can’t build a set of applications that someone wants, then just buy vendors who have. Some things never change.