Babies and bathwater

I read a rather odd article in Enterprise Systems proclaiming that “the data warehouse might be dead”. The thrust of the article was the old idea of piecing together reports for a particular user or department by accessing source systems directly rather than bothering with a pesky data warehouse, in this case advocated by a senior business user from Bank of America. I understand the frustration that business people have with most corporate data warehouses today. They are typically inflexible, and so unable to keep up with the pace of business change. Indeed, this thought was echoed recently by Gartner analyst Don Feinberg, who said that data warehouses more than five years old should be rewritten. To a person with an urgent information need, an IT department saying that the information cannot be had for weeks or months is understandably irritating.

Yet the apparent solution of accessing source systems directly is flawed; the problems it causes are, after all, why people invented data warehouses in the first place. Yes, you can patch together data from source systems, and yes, that one report you want may appear less complicated (and quicker) to get than going through a change request in corporate IT. Overall, though, the economics of point-to-point spaghetti versus a central warehouse are easy to see. The organisation as a whole will spend much more money in this manner than by having a central warehouse where the data can be relied upon; the more complex the organisation, the larger this gap will be.

Probably worse, the business user gets some numbers out, but are they the right numbers? Anyone who has worked on data warehouse projects will be familiar with the frequently dismal quality of data even in supposedly trusted corporate source systems such as ERP. It is often only by looking at sources together that problems with the data show up. Errors in, say, regional systems can cancel each other out or be obscured, and the true picture only emerges when data is added up at a collective level.

In these days of increasing anxiety about banking scandals and greater regulation, companies can ill afford to subject themselves to unnecessary risk by making decisions based on data of questionable quality. The issue is that most data warehouses today are based on inflexible approaches and designs, causing lengthy and costly delays in updating the warehouse when the business, inevitably, changes. It does not have to be this way. You can construct data warehouses in a more flexible manner, and in a way in which business users are engaged with the process. By running parallel master data management initiatives and setting up an MDM repository, the data warehouse can be relieved of at least some of the burden of sorting out corporate reference data, giving it more of a chance.

It is incumbent on IT departments to embrace modern data warehouse technologies and complementary MDM best practice, lest they drive their customers to “skunk works” desperation to answer their needs. IT organisations that fail to do this not only risk being marginalised, but also indirectly drive up costs and risks for their companies.