A multiple view of the truth

Ragy Thomas highlights the plight of marketers in his recent article in DM News. As he points out, marketers are usually poorly served by corporate IT systems. A critical thing for them is to get a good understanding of their customers so that they can craft carefully targeted campaigns, yet corporate CRM systems have failed to deliver the much talked about, yet elusive, “single customer view”. He sets out a sensible approach which is essentially a recipe for master data management in the specific context of customer data. In particular he talks about the need to document the existing data sources, to map the differences between them, to set up processes to improve data quality, and then to consider how to integrate and automate those processes.
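To make the “map the differences” step a little more concrete, here is a minimal sketch of the idea (my illustration, not Thomas’s): matching customer records from two hypothetical marketing sources on a normalized email address. Real customer matching needs far more sophisticated rules and sustained data quality work, but the principle of reconciling records to a shared key is the same.

```python
# Purely illustrative sketch (not from the article): matching customer
# records from two hypothetical marketing sources on a normalized email
# address to approximate a "single customer view".

def normalize_email(email: str) -> str:
    """Crude normalization; real matching rules are far more involved."""
    return email.strip().lower()

crm_records = [
    {"id": "C-001", "name": "Jane Smith", "email": "Jane.Smith@example.com"},
]
campaign_records = [
    {"id": "M-778", "name": "J. Smith", "email": " jane.smith@example.com "},
]

merged = {}  # normalized email -> list of (source, local id)
for source, records in (("crm", crm_records), ("campaign", campaign_records)):
    for rec in records:
        key = normalize_email(rec["email"])
        merged.setdefault(key, []).append((source, rec["id"]))

for key, sources in merged.items():
    print(key, "->", sources)
```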

I would add that such an initiative should sit within the broader context of an enterprise-wide approach to master data management, else we will see a new generation of silos developing: a customer hub, a product hub, a hub for various other types of data, each of which duplicates data in the underlying transaction systems and so has to be synchronized with them. Nobody wants to see different technologies and approaches used for marketing data, for finance master data, for supply chain data and so on. That marketing departments are having to resort to such initiatives shows just how hollow the “single view of the customer” promises from the giant application vendors have turned out to be.

Operational BI and master data

In an article on the Beye Network, Mike Ferguson makes an interesting observation which many seem to have missed. A current “theme” is that business intelligence needs to be embedded into operational systems. This innocent-sounding notion is of course entirely unrelated to the need for BI vendors to sell more licenses of their software in what has become a somewhat saturated market. As I have written earlier, it is by no means clear that everyone in an organization needs a BI tool, whatever the wishes of BI vendor sales staff. So trying the new tack of bundling BI in with operational systems (which many people do need, or at least have to use) is a cunning ploy. However, as Mike Ferguson notes, if this is done without a basis of common master data and common business rules, then all that will happen is that we will get lots of competing BI reports, all with different numbers and results. The whole notion of data warehousing was created in order to resolve the differences in definitions that exist between operational systems. Each separate ERP implementation will usually have slightly different data, never mind all the other operational systems. By going through a process of establishing an enterprise-wide common set of definitions (Mike calls it a “shared business vocabulary”), these differences can be understood, and data quality improved back in the source systems. Without such an underlying basis we merely have snapshots of operational data, without resolving the data quality issues and without resolving the inconsistencies. In other words, we will be pretty much back where we were before data warehouses.

There certainly may be valid cases where it makes sense to embed some simple BI capability on top of operational data, especially for operational reporting where you are only interested in the data in that particular system. However, as soon as you want to take a cross-functional or enterprise view, that pesky data inconsistency and data quality problem has to be dealt with somehow. Putting complex BI tools in the hands of the front-line staff doing order entry is not going to resolve this issue – it may just confuse matters further.

Classifying MDM products

In his article on the IBM MDM conference Philip Howard makes a distinction between different MDM approaches, which is useful but I feel does not go quite far enough. As he says, at one extreme you have “hubs” where you actually store (say) customer data and hope that this is treated as the master source of such data (a non-trivial project then usually ensues). At the other end you have a registry, which just lists the various places where master data is stored and maps the links to the applications which act as the master, and a repository, which goes further in picking out what he calls the “best record”, sometimes called the “golden copy” of the data. The distinction between a repository and a registry is a subtle one which Bloor makes. I feel there are other aspects which are useful to categorize, though, beyond just the storage mechanism. Some MDM products are clearly intended for machine-to-machine interaction, synchronizing master data from a hub back to other systems, e.g. DWL (now bought by IBM). However, there are other products which focus on managing the workflow around master data (managing drafts, authorizing changes, publishing a golden copy etc.), and so deal more with human interaction around master data. Kalido MDM is one example of the latter. This is another dimension by which it is useful to classify tools, since a customer’s need to synchronize name and address records between operational systems is very different from a need for workflow management.

The article notes that IBM does not score well in the recent Bloor report on MDM, but hopes for better things in the future. Certainly IBM went on something of a shopping spree once it decided to tackle MDM, buying a PIM product, a CDI product and a few others while it was at it, so it is perhaps not surprising that it is difficult to see an overall strategy. I absolutely concur with Philip Howard that MDM needs to be treated as an overall subject and not artificially segmented by technologies that deal with product, customer or whatever. In one project at BP we manage 350 different types of master data, and it is hard to see why a customer should reasonably be expected to buy 348 more technologies to go beyond product and customer. This example illustrates the absurdity of the technology-per-data-type approach which is surprisingly common amongst vendors.

Software is hard to re-architect, and customers should look carefully at vendor claims of an overall gloss on top of multiple products, compared with something that was designed from the start to handle master data in a generic fashion.

Mergers and Measurement

Margaret Harvey points out in a recent article that the effort of integrating the IT systems of two merged companies can be a major constraint on, and affect the success of, the merger. Certainly this is an area that is often neglected in the heat of the deal. But once the investment bankers have collected their fees and an acquisition or merger is done, what is the best approach to integrating IT systems? What is often missed is that, in addition to running different systems (one company might use SAP for ERP and the other Oracle, say), the two companies will have completely different coding systems and terminology for everything, from the chart of accounts through to product and asset hierarchies, customer segmentation, procurement supplier structures and even HR classifications. Even if both companies have many systems from the same vendor, this will not help much, given that all the business rules and definitions will be different in the two sets of systems.

To begin with, the priority should be to understand business performance across the combined new entity, and this does not necessarily involve ripping out half the operational systems. When HBOS was formed, both Halifax and Bank of Scotland had the same procurement system, but it was soon discovered that this helped little in taking a single view of suppliers across the new group, given the different classification of suppliers in each system. To convert all the data from one system into the other was estimated to take well over a year, so instead they put in a data warehouse system which mapped the two supplier hierarchies together, enabling a single view to be taken even though the two underlying systems were still in place. This system was deployed in just three months, giving an immediate view of combined procurement and enabling large savings to be made rapidly. A similar approach was taken when Shell bought Pennzoil, and when Intelsat bought Loral.
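As an aside, the mapping idea itself is simple; here is a minimal sketch, with entirely invented system names, category codes and spend figures, of how two local supplier classifications can be rolled up to a common hierarchy for group-wide reporting without touching either source system:

```python
# Hypothetical illustration: roll two companies' local supplier category
# codes up to one common hierarchy so combined spend can be reported
# without converting either source system. All names, codes and figures
# here are invented.

COMMON_CATEGORY = {
    # (source system, local category code) -> common group category
    ("halifax_proc", "STAT"):   "Office supplies",
    ("halifax_proc", "ITHW"):   "IT hardware",
    ("bos_proc",     "OFF-01"): "Office supplies",
    ("bos_proc",     "COMP"):   "IT hardware",
}

spend_records = [
    ("halifax_proc", "STAT",   120_000),
    ("bos_proc",     "OFF-01",  95_000),
    ("halifax_proc", "ITHW",   300_000),
    ("bos_proc",     "COMP",   410_000),
]

combined = {}
for system, local_code, amount in spend_records:
    category = COMMON_CATEGORY[(system, local_code)]
    combined[category] = combined.get(category, 0) + amount

for category, total in sorted(combined.items()):
    print(f"{category}: {total:,}")
```

The real work, of course, lies in agreeing the common categories and keeping the mappings current as the source systems change.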

It makes sense initially to follow this approach so that a picture of operating performance can quickly be formed, but at some point you will want to rationalize the operational systems of the two companies in order to reduce support costs and eliminate duplicated skill sets. It helps to draw up an asset register of the IT systems of the two companies, but just listing the names and broad functional areas of the systems is only of limited use. You also need to know the depth of coverage of the systems and the likely cost of replacement. Clearly, each company may have some systems in much better shape than others, so unless it is a case of a whale swallowing a minnow, it is likely that some selection of systems from both sides will be in order. To have a stab at estimating replacement costs, you can use a fairly old but useful technique for estimating application size: function points.

Function points are a measure of system “size” that does not depend on knowing the underlying technology used to build the system, so they apply equally to packages and custom-built systems. Once you know that a system is, say, 2,000 function points in size, there are well-established metrics for the effort needed to replace such a system: for transaction systems, a ballpark figure of 25–30 function points per man-month can be delivered, which does not seem to change much whether it is a package or built in-house. Hence a 2,000 function point transaction system will take about 80 man-months to build or implement, as a first-pass estimate. MIS systems are less demanding technically than transaction systems (as they are generally read-only) and better productivity figures can be achieved there. These industry averages turned out to be about right when I was involved in a metrics program at Shell in the mid 1990s. At that time a number of Shell companies counted function points and discovered productivity of around 15–30 function points per man-month delivered for medium-sized transaction systems, irrespective of whether these were in-house systems or packages. Larger projects had lower productivity and smaller projects higher productivity, so delivering a 20,000 function point system will be a lot worse than a 2,000 function point system in terms of productivity, i.e. fewer function points per man-month will be delivered on the larger system. Counting function points in full is tedious, and indeed this is the single factor that has relegated the technique to something of a geek niche, yet there are short-cut estimating techniques that are fairly accurate and vastly quicker than counting in full. By using these short-cut techniques a broadly accurate picture of an application inventory can be pulled together quite quickly, and this should be good enough for a first-pass estimate.
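For what it is worth, the arithmetic of a first-pass estimate is trivial once you have a function point count and a delivery rate; a small sketch using the ballpark figures above:

```python
# First-pass effort estimate from a function point count, using the
# ballpark delivery rates quoted above for transaction systems.

def estimate_man_months(function_points, fp_per_man_month=25.0):
    """Return a rough replacement effort estimate in man-months."""
    return function_points / fp_per_man_month

if __name__ == "__main__":
    size = 2000  # e.g. a medium-sized transaction system
    low, high = estimate_man_months(size, 30), estimate_man_months(size, 25)
    print(f"{size} FP: roughly {low:.0f}-{high:.0f} man-months")
```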

There are a host of good books that discuss project metrics and productivity factors which you can read for more detailed guidance. The point here is that by constructing an inventory of the IT applications of both companies involved in a merger you can get a better feel for the likely cost of replacing those systems, and hence make a business case for doing so. In this way you can take a structured approach to deciding which systems to retire, and avoid the two parties on either side of the merger simply defending their own systems without regard to functionality or cost of replacement. Knowing the true costs involved in systems integration should be part of the merger due diligence.

Further reading:

Software Engineering Economics
Controlling Software Projects
Function Points

Conferences and clocks

Those who are getting on a bit (like me) may recall John Cleese’s character in the 1986 movie Clockwise, who was obsessed with punctuality. I am less neurotic, but what does distress me is when conference organizers let their schedule slip due to speaker overruns. I speak regularly at conferences, and this is a recurring problem. At a conference in Madrid a few weeks ago they managed to be well over an hour behind schedule by the time they resumed the afternoon session, while the otherwise very useful ETRE conferences are famed for their “flexible” schedule. At a large conference this is beyond merely irritating, as you scramble to find speaker tracks in different rooms, all of which may be running to varying degrees behind schedule and starting to overlap.

This poor timekeeping is depressingly normal at conferences, which makes it all the nicer when you see how it should be done. I spoke yesterday at the IDC Business Performance Conference in London, which had an ambitious-looking 14 speakers and two panels squeezed into a single day. If this were ETRE they would have barely been halfway through by dinner time, yet the IDC line-up ran almost precisely to time throughout the day. This was achieved by the simple device of having a clock in front of the speaker podium ticking away a countdown, making speakers very visibly aware of the time they had left. I recall a similar device when I spoke at a Citigroup conference in New York a couple of years ago, which also ran like clockwork.

The conference was a case study in competent organization, with good pre-event arrangements, an audio run-through for each speaker on site, and speaker evaluation forms (some conferences don’t even bother with this). The attendees actually bore a distinct resemblance to those promised, both in quality and number; recently some conference organizers seem to have had all the integrity of estate agents when quoting expected numbers. The day itself featured some interesting case studies (Glaxo, Royal Bank of Scotland, Royal Sun Alliance, Comet) and a line-up of other speakers who managed to avoid shamelessly plugging their own products and services (mostly). Even the lunchtime buffet was edible.

In terms of memorable points, it seems that the worlds of structured and unstructured data are as far apart as ever based on the case studies, whatever vendor hype says to the contrary. Data volumes in general continue to rise, while the advent of RFID presents new opportunities and challenges for BI vendors. RFID generates an avalanche of raw data, and a presenter working with early projects in this area reckoned that vendors were so far completely unable to take advantage of it. Common themes of the successful projects were the need for active business sponsorship and involvement, the need for data governance and stewardship, and iterative approaches giving incremental and early results. Specific technologies were (refreshingly) in the background in most of the speeches, though the gentleman from Lucent seemed not to have got the memo to sponsor speakers about not delivering direct sales pitches. With Steve Gallagher from Accenture reckoning that BI skills are getting hard to find, even in Bangalore, it would seem that performance management is moving up the business agenda.

Well done to Nick White of IDC for steering the day through so successfully. If only all conferences ran like this.

Wilde Abstraction

Eric Kavanagh makes some very astute points in an article on TDWI regarding abstraction. As he rightly points out, a computer system that models the real world has to deal with business hierarchies, such as general ledgers and asset hierarchies, that are complex in several ways. To start with, there are multiple valid views. Different business people have different perspectives on “Product”, for example: a marketer is interested in the brand, price and packaging, but from the point of view of someone in distribution the physical dimensions of the product are what matter, such as what size container it comes in and how it should be stacked. Moreover, as Eric points out, many hierarchies are “ragged” in nature, something that not all systems are good at dealing with.

The key point he makes, in my view, is that business people should be presented with a level of abstraction that can be put in their own terms. In other words the business model should drive the computer system, not the other way around. Moreover, as the article notes, if you maintain this abstraction layer properly then historical comparison becomes possible, e.g. comparing values over time as the hierarchies change. Indeed, the ability to reconstruct past hierarchies is something that I believe is increasingly important in these days of greater regulatory compliance, yet it is often neglected, in packages and custom-built systems alike. The key points he makes on the value of an abstraction layer:

– the abstraction layer shields the application from business change
– business-model driven, with the ability to have multiple views on the same underlying data
– time variance built in
– the layer can be a platform for master data management

neatly sum up the key advantages of the Kalido technology, and indeed explain why I set up Kalido in the first place, since I felt that existing packages and approaches failed in these key areas. It is encouraging to me that these points are starting to gain wider acceptance as genuine issues that the industry needs to address better if it is to give its customers what they really need. To quote Oscar Wilde: “There is only one thing in the world worse than being talked about, and that is not being talked about.” I hope these issues, which most designers of computer systems seem not to grasp, get talked about a lot more.
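To illustrate just the time-variance point (this is a toy sketch of the general idea, not a description of how Kalido implements it), a hierarchy can be stored as parent–child links with validity dates, so that the structure in force on any past date can be reconstructed:

```python
# Toy sketch only (not Kalido's implementation): a time-variant hierarchy
# stored as parent-child links with validity dates, so the structure in
# force on any past date can be reconstructed. All names are invented.

from datetime import date

# (child, parent, valid_from, valid_to); valid_to=None means "still current".
HIERARCHY_LINKS = [
    ("Widget A", "Consumer Division",      date(2004, 1, 1), date(2005, 12, 31)),
    ("Widget A", "Home Products Division", date(2006, 1, 1), None),
]

def parent_as_of(child, as_of):
    """Return the parent of `child` in the hierarchy valid on `as_of`."""
    for c, parent, start, end in HIERARCHY_LINKS:
        if c == child and start <= as_of and (end is None or as_of <= end):
            return parent
    return None

print(parent_as_of("Widget A", date(2005, 6, 30)))  # Consumer Division
print(parent_as_of("Widget A", date(2006, 6, 30)))  # Home Products Division
```

With an approach along these lines, restating history against either the old or the new hierarchy becomes a query rather than a reload.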

How to eat an elephant

Robert Farris makes some good observations in his recent article in DM Review. He points out that many companies have ended up with business intelligence distributed throughout the company, e.g. in various subsidiaries and departments, and this makes it very difficult to take a joined-up view across the enterprise. As he notes, disjointed initiatives can result in poor investments. Hence it is critical to take an overall view of business intelligence, yet doing so is such a large task that it can seem daunting.

In my experience there are a number of things that can at least improve the chances of an enterprise-wide BI initiative succeeding. It sounds like motherhood and apple pie, but the project needs business sponsorship if it is to succeed; IT is rarely in a position to drive such a project. However that statement on its own is of limited help. How can you find this elusive sponsorship?

The place to start is to find the people who actually have the problem, which usually means either the CFO or the head of marketing. The CFO has the job of answering the executive team’s questions about how the company is performing, so knows what a pain it is to get reliable numbers out of all of those bickering departments and subsidiaries. The head of marketing is the one who most needs data to drive his or her business, usually involving trends over time and data from outside the corporate systems, and this is usually poorly provided for by internal systems. The CEO might be a sponsor, but often the CFO will be carefully feeding impressive-looking charts to the CEO to give the impression that finance has things under control, so the CEO may not be aware of how difficult data is to get hold of. The head of operations or manufacturing is another candidate, though this person may be too bogged down in operational problems to give you much time. If there is someone responsible for logistics and supply chain then this is often a fruitful area. Sales people usually hate numbers unless they are connected with their commissions (where they demonstrate previously unsuspected numerical ability), and HR usually doesn’t have any money or political clout, so marketing and finance are probably your best bets.

So, you have a sponsor. The next step is to begin to sort out the cross-enterprise data that actually causes all the problems in taking a holistic view, which these days is being termed master data. If you have multiple charts of accounts, inconsistent cost allocation rules, or multiple sources of product definition or customer segmentation (and almost all companies do), then this is a barrier in the way of your BI initiative succeeding. There is no quick fix here, but get backing to set up a master data management improvement project, driven by someone keen on the business side. Justifying this is easier than you may think.

In parallel with this you will want a corporate-wide data warehouse. Of course you may already have one, but it is almost certainly filled with out-of-date data of variable quality, and may be groaning under a backlog of change requests. If it is not, then it is probably not being used much and may be ripe for replacement. To find out, do a health check. There is a bit of a renaissance in data warehouses at the moment, and these days you can buy a solution rather than having to build everything from scratch.

In truth your company probably has numerous warehouses already, perhaps on a departmental or country basis, so it is more a matter of linking these up properly than of starting from scratch. This will enable you to take an iterative approach, picking off areas that have high business value and fixing these first. Once you can demonstrate some early success you will find it much easier to keep getting sponsorship.

In one of the early Shell data warehouse projects I was involved with, we had a very successful initial implementation in one business line and subsidiary, and this success led to a broader roll-out in other countries; eventually other business lines came willingly into the project because they could see the earlier successes. This may seem like a longer route to take but, as Robert Farris notes, this is a journey not a project, and if you start with something vast in scope it will most likely sink under its own weight. Much better to have a series of achievable goals, picking off one business area or one country at a time, delivering incrementally better solutions and so building credibility with the people who count: the business users.

Elephants need to be eaten in small chunks.

The Informatica recovery story

The data integration market has historically split between the EAI tools like Tibco and Webmethods, and the ETL space with tools like Informatica and Ascential (now part of IBM). The ETL space has seen significant retrenchment over recent years, with many of the early pioneers being bought or disappearing (ETI Extract, for example, still lives on but is practically invisible now). Mostly this functionality is being folded into the database or other applications, e.g. Microsoft with SSIS (previously DTS) and Business Objects having bought ACTA. Still in this space are Sunopsis (the only “new” vendor making some progress) and older players like Iway and Pervasive, whose tools are usually sold embedded inside other products. Others like Sagent and Constellar have gone to the wall.

The integration market is surprisingly flat, with Tibco showing 9% growth last year but a 10% shrinkage in license revenues, while Webmethods grew just 4%, with 1% growth in license revenues. Hardly the stuff investor dreams are made of. BEA is doing better, with 13% overall growth last year and 10% license growth, but this is still hardly stellar. Informatica is the odd one out here, having extracted itself from its aberrant venture into the analytics world and repositioned itself as a pure-play integration vendor. It had excellent 31% license growth and 27% overall growth last year. The logical acquisition of Similarity Systems broadens Informatica’s offering into data quality, which makes sense for an integration vendor. When IBM bought Ascential some pundits reckoned the game would be up for Informatica, but so far that is not proving to be the case at all.

Microsoft builds out its BI offerings

A week ago Microsoft announced Performance Point Server 2007. This product contains scorecard, planning and analytics software, and complements the functionality in Excel and in its SQL Server Analysis and Reporting Services tools. With Proclarity also within the Microsoft fold now, it is clear that Microsoft is serious about extending its reach in the BI market.

I have argued for some time that rivals Cognos and Business Objects should be a lot more worried about Microsoft than about each other in the long term. Most business users prefer an Excel-centric environment for their analysis, and as Microsoft adds more and more routes into that environment, life will become increasingly uncomfortable for the pure-play reporting vendors. As ever, Microsoft will go for high volume and low price, so it will probably never match Business Objects or Cognos in functionality, but that is not the point. Most users only take advantage of a fraction of the features of a BI tool anyway.

Microsoft is playing a long game here, and the pure-play tools will continue to do well in what is an expanding market. But the ratchet just got tightened another notch.