A bit poor

You may recall my blog on SAP’s farcical claims about its software’s impact on company profitability. It looks like someone with more time on their hands than me actually checked up on the figures and found them lacking, in addition to the lack of logic in the original claim. Nucleus Research, who are noted for their rigor with numbers, found that SAP customers (identified by being listed on SAP’s web site) were in fact 20% less profitable than their peers, rather than 32% more profitable. Of course this is not quite the same thing, but it is amusing: it suggests that only SAP’s identified reference customers are relatively unprofitable. Perhaps the ones who keep quiet are doing OK? As I noted earlier, the SAP claim was deliberately skewed to exclude all financial institutions (which share the twin characteristics of being highly profitable and rarely using SAP). In any case, the notion that the choice of your ERP systems provider is a cause of either good or bad profits is both logically flawed and deeply amusing to those of us who have watched companies spend billions implementing SAP to little obvious effect in terms of hard business benefits.

Good on Nucleus for poking further holes in this especially egregious piece of over-marketing. Bruce Brien, CEO of Stratascope, the company that did the market research for SAP, reacted by saying: “They’re making an implication that my numbers can’t prove, but it’s a marketing message. Companies do that all the time.” Oh well, that’s all right then.

Cognos recovers somewhat

Cognos announced its full year results, notably seeing a recovery in license revenues to USD 118M in their fourth quarter (i.e. Q1 2006) after the disappointing Q4 2005 results. It was also important to note that the company closed 18 deals over a million dollars in size, which was another marked improvement on the previous quarter. Profit margins were a healthy 18%. Still, license revenue was actually down compared to the same quarter a year ago (USD 130M), while overall revenues, at USD 253M for the quarter, were slightly down on the same period last year. Actually shrinking is not generally a cause for celebration in a software company, so it is a measure of just how bad Cognos’ previous quarter was that these results were generally greeted with relief.

This (relative) recovery bodes well for the broader sector, and indicates that Cognos’ stumble at the end of 2005 had more to do with company-specific issues (limited deployment of its new product line) than with any general slow-down in the business intelligence market (which just about every analyst predicts will grow at a healthy clip in 2006). In the medium term, Cognos faces the same issues as other BI suppliers: the relative saturation of the market, and the ever-growing threat from Microsoft.

The ratchet goes up a notch

Back last year I wrote about the creeping progress of Microsoft into the business intelligence arena. In CBR, Madan Sheina (one of the smartest analysts in the industry, by the way) examines the latest move in this direction, the SQL Server 2005 suite’s enhanced business intelligence offerings. The new ETL offering SSIS (previously DTS) will be of interest, although its SQL Server ties may limit its take-up relative to database-neutral offerings. However, the new Analysis Services and Reporting Services promise to ratchet up the pressure on the pure-play BI players, Business Objects, Cognos and the rest. I have long argued that the most ubiquitous BI tool is actually Excel, and that given that people already know this, an ideal BI tool for many users would be one which magically got the data they wanted out of a data warehouse directly into an Excel pivot table. Yes, there will always be a subset of power users for whom this is not enough, but in the vast majority of cases this will actually do the trick. Other tools (visualization, data mining etc) would be relegated to niches if this were to happen: significant niches perhaps, but niches nonetheless.

Business Objects has done well because of its semantic layer, the “universe”, which overlays something closer to a business view on top of data marts and warehouses; this imposes some maintenance overhead but this is acceptable to users since it represents the data in a more business-like form. However, Business Objects has always struggled with its OLAP capability relative to competitors. Cognos, by contrast, had the best OLAP tool out there in Powerplay, but a rather ordinary reporting offering. These two vendors pretty much carved up the market between them, though in a growing market there was enough room for other tools like Microstrategy, Actuate etc as well. Microsoft’s new suite poses a potent threat to most of these BI vendors, since most users do not use more than a tiny fraction of the features of a BI tool, so adding more features just to stay ahead of Microsoft is ineffective; the end users simply don’t need more features. With their low price point and “good enough” features, the Microsoft tools are likely to gradually eat into the market share of the independent vendors. Nothing dramatic will happen overnight, and the curious restraint of Microsoft from serious marketing of its tools to the enterprise will also slow progress. When was the last time you saw a webinar or advert for Analysis Services? Compare and contrast with Business Objects, which is a marketing machine.

However, just like a pack of hunting dogs wearing down a large prey animal, the Microsoft tools can edge closer to the BI vendors with each release, secure in the knowledge that through their Office base they control what users really want: Excel.

Iteration is the key

Ken Pohl writes a thoughtful article on the issues of project management of a data warehouse project, and how this can differ from other IT projects. As he points out, a data warehouse project is unusual in that it is essentially never finished – there are always new sources to add, new types of analysis the customers want etc (at least there are if the project is a success: if it failed then at least you won’t have too many of those pesky customer enhancement requests).

As the article points out, a data warehouse project is ideal for an iterative approach to development. The traditional “waterfall” approach, whereby the requirements are documented at ever greater levels of detail, from feasibility through to requirements through to functional specification etc, is an awkward fit. I have observed that in some companies the IT departments have a rigid approach to project management, demanding that all types of projects follow a waterfall structure. This is unfortunate in the case of data warehouse projects, where end-users are often hazy on requirements until they see the data, and where changing requirements will inevitably derail the neatest functional specification document (see diagram).
Given a 16 month average elapsed time for a data warehouse project (TDWI) it is almost certain that at least one, and possibly several, major changes will come along that have significant impact on the project, which in a waterfall approach will at the very least cause delays and may put the entire project at risk.

By contrast a data warehouse project that bites off scope in limited chunks, while retaining a broad and robust enterprise business model, can deliver incremental value to its customers, fixing things as needed before the end users become cynical, and gradually building political credibility for the warehouse project. Of course the more responsive to change your data warehouse is the better, but even for a traditional custom build it should be possible to segment the project delivery into manageable chunks and deliver incrementally. The data warehouse projects which I have seen go wrong are very often those which have stuck to a rigid waterfall approach, which makes perfect sense for a transaction processing system (where requirements are much more stable) but is asking for trouble in a data warehouse project. Ken Pohl’s article contains some useful tips, and is well worth reading.

Unifying data

I can recall back in the early 1990s hearing that the worlds of structured and unstructured data were about to converge. A decade on, and despite the advent of XML, that prospect still looks a long way off. It is like watching two people who have known each other for years and are attracted to each other, yet never seem to find a way of getting together. Some have argued that the data warehouse should simply open up to store unstructured data, but does this really make sense? When DBMS vendors brought out features allowing them to store BLOBs (binary large objects) the question should have been asked: why is this useful? Can I query this and combine it usefully with other data? Data warehouses deal with numbers (usually business transactions) that can be added up in a variety of ways, according to various sets of business rules (such as cost allocation rules, or the sequence of a hierarchy), which these days can be termed master data. The master data gives the transaction data “structure”. A PowerPoint slide or a Word document or an audio clip tends not to have much in the way of structure, which is why document management systems place emphasis on attaching keywords or tags to such files in order to give them structure (just as web pages are given similar tags, or at least they are if you want them to appear high up in the search engines).

You could store files of this type in a data warehouse, but given that these things cannot be added up there is little point in treating them as transactions. Instead we can consider them to be master data of a sort. Hence it is reasonable to want to manage them from a master data repository, though this may or may not be relevant to a data warehouse application.

I am grateful to Chris Angus for pointing out that there is a problem with the terms ‘structured data’ and ‘unstructured data’. Historically the terms came into being to differentiate between data that could at that time be stuffed in a database and data that could not. That distinction is nothing like as important now and the semantics have shifted. The distinction is now more between data constrained by some form of fixed schema, whose structure is dictated by a computer application, and data/documents not constrained in the same way. An interesting example of “unstructured data” that is a subject in its own right and needs managing is a health and safety notice. This is certainly not just a set of numbers, but it does have structure, and may well be related to other structured data e.g. HSE statistics. Hence this type of data may well need to be managed in a master data management application. Another example is the technical data sheets that go with some products, such as lubricants; again, these have structure and are clearly related to a traditional type of master data, in this case “product”, which will have transactions associated with it. Yet another would be a pharmaceutical regulatory document. Hence “structure” is more of a continuum than a “yes/no” state.

So, while the lines are blurring, the place to reconcile these two worlds may not be in the data warehouse, but in the master data repository. Just as in the case of other master data, for practical purposes you may want to store the data itself elsewhere and maintain links to it e.g. a DBMS might not be an efficient place to store a video clip, but you would want to keep track of it from within your master data repository.

Microsoft MDM? Don’t hold your breath

At a conference this week at which Microsoft explained how it intends to unify its rambling applications offerings, Mike Ehrenberg (architect for Microsoft’s MBS products) mentioned that Microsoft was “investigating” an MDM product offering. It should be said that Microsoft should be in an excellent position to understand the problem of inconsistent master data, at least within their own portfolio of business software products. Through a series of acquisitions they have assembled no fewer than five distinctly overlapping products for SMEs, and have manifestly failed to explain how any of this resembles a strategy. This mess has enabled innovative newcomers like Ataio to make steady progress in what should really be Microsoft’s natural turf, as customers have been bemused by Microsoft’s seeming inability to articulate which technologies they were really intending to invest in. The answer, it seems, is all of them – MSFT will “converge” their five products “no sooner than 2009” (unofficially, 2011 is a target date I have heard from an insider). The most amusing line in the article was: “The MBS products,” Gates said, “have more head room for growth than just about any business we’re in.” This is about as backhanded a compliment as one can think of: I have heard that Microsoft management is very unhappy about the lack of progress in this division, so this comment is like saying to a sports team that just came bottom of the league “we now have more room to improve than anyone”.

Microsoft seems perennially to struggle in the enterprise software market, despite its vast resources, huge brand and marketing clout. It essentially stumbled into the DBMS marketplace; I have it on good authority that Gates originally approached Larry Ellison with a view to bundling Oracle as the DBMS on Windows NT, and it was only after being spurned that Microsoft decided to launch SQL Server out of the ashes of the Sybase code-base it had purchased (this is a piece of hubris that Oracle may live to regret). In Excel and Analysis Services Microsoft has the most ubiquitous business intelligence software out there, yet has hardly any mind-share in this market. Perhaps it is just not in Microsoft’s DNA to really relish the enterprise software market, when its business model is above all about high volume, and large enterprises demand endless tinkering and specialization of software to their specific needs.
Based on the train-wreck that is Microsoft’s enterprise applications strategy, I wouldn’t count on a strong MDM product entry any time soon.

Data warehouse v master data repository

Bill Inmon notes that “Second-generation data warehouses recognize the need for tying metadata closely and intimately with the actual data in the data warehouse”. This is indeed a critical point, and is at the heart of why all those enterprise data dictionary projects in the 1990s (and even 1980s; sad to say I am old enough to have been involved with one in the 1980s) failed. Because the dictionaries were just passive catalogs, they were of some use to data modelers but otherwise there was little incentive to keep them up to date. In particular, the business people could not see any direct benefit to them, so after the initial project went live the things quietly got out of date. In order for such initiatives to succeed it is critical that the business metadata (more important than the technical metadata) is tied into the actual instances of master data, so that the repository does not just list the product hierarchy structure (say) but also lists the product codes that reside within this structure. Ideally, the repository would act as the primary master source of master data for the enterprise, and serve up this data to the various applications that need it, probably via an automated link using middleware such as Tibco or IBM Websphere. Not many companies have taken it to this stage, but there are applications at BP and Unilever that do, for example.

However one important architectural point is that you may not want the data warehouse to actually manage all the master data directly; instead it may be better to have a separate master data repository. The reason for this apparently odd approach is that in a data warehouse you want the data to be “clean” i.e. validated, conforming to the company business model etc. On the other hand master data may have separate versions, drafts (e.g. draft three of the planned new product catalog) that need to be managed, and potentially “dirty” master data that is in the process of being improved or cleaned up. Such data has no place in a data warehouse, where you are relying on the integrity of the numbers.

Hence a broader picture may see an enterprise data warehouse alongside a master data repository, the latter feeding a “golden copy” of master data to the warehouse, just as it will feed the same golden copy to other applications that need it. With such an approach, and current technology, those old enterprise modeling skills might just come in handy.
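
To make the separation concrete, here is a minimal sketch of the idea in Python. It is illustrative only: the class and method names (`MasterDataRepository`, `golden_copy`, `publish`, `make_consumer`) and the sample product codes are hypothetical, and a real deployment would push the data through middleware rather than in-process callbacks. The point it shows is simply that drafts and “dirty” versions live in the repository, while subscribers such as the warehouse only ever receive the latest approved version.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    number: int
    records: dict          # e.g. product code -> parent node in the hierarchy
    approved: bool = False  # drafts and data still being cleaned are not approved


@dataclass
class MasterDataRepository:
    versions: dict = field(default_factory=dict)     # data set name -> list of Version
    subscribers: list = field(default_factory=list)  # callables fed the golden copy

    def add_version(self, name, records, approved=False):
        """Store a new version (draft by default) of a master data set."""
        history = self.versions.setdefault(name, [])
        version = Version(len(history) + 1, dict(records), approved)
        history.append(version)
        return version

    def golden_copy(self, name):
        """Return the most recent approved version, or None if there is none."""
        approved = [v for v in self.versions.get(name, []) if v.approved]
        return approved[-1] if approved else None

    def publish(self, name):
        """Feed the same golden copy to every subscriber."""
        golden = self.golden_copy(name)
        if golden is not None:
            for receive in self.subscribers:
                receive(name, golden)
        return golden


def make_consumer(store):
    """Build a subscriber that copies published records into `store`."""
    def receive(name, version):
        store[name] = dict(version.records)
    return receive


# Hypothetical consumers: the data warehouse and one operational application.
warehouse, order_system = {}, {}
repo = MasterDataRepository()
repo.subscribers.append(make_consumer(warehouse))
repo.subscribers.append(make_consumer(order_system))

repo.add_version("product", {"P100": "Lubricants"}, approved=True)
repo.add_version("product", {"P100": "Lubricants", "P200": "Fuels"})  # draft only
repo.publish("product")
print(warehouse["product"], order_system["product"])
```

The draft catalog containing the extra product code stays inside the repository; the warehouse and the other application both end up with the same approved version, which is the property you are relying on when you add up the numbers.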

Incidentally, spring is definitely in the air in Europe. The sun is out in London, there is a spring in people’s step, and the French have called a general strike.

Putting lipstick on a caterpillar does not make a butterfly

Oracle’s recent repackaging of its BI offerings appears to be just that: a repackaging of existing technologies, of which of course they now have a lot. Peoplesoft had EPM, which had a mediocre reputation, but they did better with Siebel, who had astutely acquired nQuire, a good product that was relabeled Siebel Analytics. Oracle also has Discoverer, a fairly blatant rip-off of Business Objects, a series of pre-built data marts for Oracle apps as well as assorted older reporting tools developed along the way, like Oracle ReportBuilder, which seems to me strictly for those who secretly dislike graphical user interfaces and yearn for a return to a command prompt and “proper” programming. This assortment of technologies has been placed into three “editions”, but you can scour the Oracle website in vain for anything which talks about the actual integration of these technologies at anything below the marketing/pricing level. Hence it would seem that customers will still essentially be presented with a mish-mash of tools of varying quality. Perhaps more R&D is in the works to integrate the various BI offerings properly, but it seems that for now Oracle still has some work to do in presenting a coherent BI picture. Business Objects and Cognos will not be quaking in their boots.

When did “tactical” become a dirty word?

A new report from Butler Group bemoans the “tactical” use of business intelligence tools on a piecemeal, departmental basis, calling for an enterprise-wide approach. However it rather misses the point about why this state of affairs exists. The author reckons “Business will only be able to improve its information services, and obtain real value from the ever-increasing data silos that it continues to generate, when it accepts the significant advantages to be gained from integrating and standardizing its approach to the management of its BI technology services.” Or, to paraphrase: “why on earth are you deploying separate departmental solutions, you bunch of dimwits?”

As I have discussed previously on this blog, there are actually several reasons why most BI initiatives are departmental, often using different tools. It is not that the business people are a crowd of masochists. The first reason is that a lot of BI needs are legitimately local in nature, specific to a department or operating unit. It is dramatically easier for a department to set up a data mart that has just its own data, and stick on top of that a reporting tool like Business Objects or Cognos, than it is to wait for the IT department to build an enterprise warehouse, which takes 16 months on average to do, costs 72% of its build costs every year to support, and then usually struggles to keep up with changing business requirements.
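
To see how quickly that support figure adds up, here is a back-of-the-envelope sketch. The 72% ratio is the one quoted above; the build cost of one unit and the assumption that support stays a flat share of build cost each year are simplifications of mine, and the function name is illustrative.

```python
def cumulative_cost(build_cost, support_ratio, years_in_service):
    """Build cost plus annual support, with support a fixed share of build cost."""
    return build_cost + build_cost * support_ratio * years_in_service


BUILD_COST = 1.0      # hypothetical: express everything as multiples of the build cost
SUPPORT_RATIO = 0.72  # annual support as a share of build cost, as quoted above

for years in range(0, 4):
    total = cumulative_cost(BUILD_COST, SUPPORT_RATIO, years)
    print(f"after {years} year(s) in service: {total:.2f}x the build cost")
```

On these assumptions the warehouse has cost well over twice its build budget after two years of operation, which goes some way to explaining why a department might prefer not to wait for it.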

So it is not a matter of “accepting the significant advantages” of an enterprise approach. Everyone accepts that such an approach would be preferable, but the IT industry has made it very, very difficult to actually deliver on this promise, and people naturally fall back on “tactical” (i.e. working) solutions when grandiose ones fail. Ideally you would like an enterprise data warehouse, deployed in a few months, that can deal with business change instantly, and can at the same time both take an enterprise view and respect local departmental business definitions and needs, which will differ from those of central office. The trouble is, most companies are not deploying data warehouses like this, but are still stuck in a “build” timewarp, despite the existence of multiple packaged data warehouses which can be deployed more rapidly, and in at least one case can deal with change properly. Until this mindset changes, get used to a world with plenty of “tactical” solutions.

The data warehouse market breaks into a trot

The latest figures from IDC (who, by the way, are by far the most reliable of the analyst firms when it comes to quantitative estimates) suggest that the data warehouse market will grow at a 9% compound rate from now through to 2009, reaching USD 13.5 billion in size (up from USD 10 billion today), as reported in an article on the 17th of March. Gartner also reckon that this market is growing at twice the pace of the overall IT market (their estimates are slightly lower, but I would trust IDC’s more when it comes to figures). It would be interesting to see the proportion of this that is packaged data warehouse software (see the recent report by Bloor) but unfortunately they don’t split out the data in this way. This figure does not include services, but based on other analyst estimates the services market is at least three times this size; there never seems to be any shortage of need for systems integrators.
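
As a rough sanity check on how those three numbers hang together, here is a small sketch of the compound growth arithmetic. The USD 10 billion, USD 13.5 billion and 9% figures are the ones quoted above; the three and four year horizons are my own assumption, since “from now through to 2009” does not pin down the exact window, and the function names are illustrative.

```python
import math


def compound_growth(start, rate, years):
    """Value reached by growing `start` at `rate` per year for `years` years."""
    return start * (1 + rate) ** years


def implied_rate(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


def years_to_reach(start, end, rate):
    """Years of growth at `rate` needed to get from `start` to `end`."""
    return math.log(end / start) / math.log(1 + rate)


START, TARGET, RATE = 10.0, 13.5, 0.09  # USD billions and growth rate quoted above

for years in (3, 4):  # assumed horizons
    size = compound_growth(START, RATE, years)
    rate = implied_rate(START, TARGET, years)
    print(f"{years} years: 9% gives USD {size:.2f}bn; "
          f"hitting USD {TARGET}bn needs {rate:.1%} a year")

print(f"years needed at 9%: {years_to_reach(START, TARGET, RATE):.1f}")
```

At 9% a year the market would need somewhere between three and four years to get from USD 10 billion to USD 13.5 billion, so the quoted figures are at least internally consistent with a forecast running to 2009.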

Given all the billions spent on ERP systems in the last ten years or so, it is about time that more attention was paid to actually trying to make sense of the data captured in these and other transaction processing systems, which for a long time have consumed the lion’s share of IT development budgets. After all, there is likely to be more value in spotting trends and anomalies in the business than in merely automating processes that were previously manual, or in just shifting from one transaction processing system to another.