Darwin and data warehouse projects

Sreedhar Srikant writes about the importance of the logical data model in a data warehouse project in DM Review. This well-written article describes the process of building a model, highlights five pitfalls and suggests some ways of avoiding them.  It is in this last area that I feel the article could be enhanced.  In my experience there are two serious dangers that a data warehouse project faces that go beyond issues of project problems like trouble agreeing on a model.  These are:

(a)   The project gets insufficient business buy-in through lack of a well-articulated and robust business case

(b)   The project takes too long to deliver, making it vulnerable to budget cutbacks since it has not shown tangible benefit early enough.

I am constantly surprised how often IT projects in major corporations seem to get off the ground without a strong business case.  IT projects compete for capital in a company with many other project proposals, and so it can be a Darwinian process when times get tough: projects with the strongest business case and sponsorship will survive.  As a minimum, the project needs to set out the expected returns that it will make, set against the project costs, over a three year (sometimes five year) period.  A simple example is shown below:

Costs:  $3M one-off, $2.16M annual

Benefits: $5M from year 2 onwards, with $2M only in year 1.

In this instance the project costs $3M to deliver and just over $2M to support each year (a Data Warehouse Institute survey showed that the average data warehouse costs 72% of its build costs to support every year).  Against this are some benefits as shown.  In this instance the project is tolerably attractive, since it has a positive net present value (USD $536k using a typical 18% discount rate) and a decent 27% IRR, though its payback period is a little slow.  However, while not stellar, it is respectable as cases go, and it is at least written in the language of business. 

What might the project benefits be?   These will vary from project to project and from industry to industry, but examples might include either profit-enhancing benefits, such as reduced customer churn or improved pricing ability, or cost reductions such as fewer misplaced deliveries due to improved data quality, or better procurement margins to due to improved understanding of supplier spend.  In order to articulate these you need to find a business sponsor, preferably one who has a problem related to poor information.  Trust me; you should not have to look too far in a big company for one of these. 

Having a business case that is properly set out will act as a safety net when project reviews happen, and reduce the chances of a project being cancelled when the knives come out. 

The second thing that can help your project is to deliver something tangible early.  Traditional waterfall methodologies, often used by large systems integrators, are not always well suited to data warehouse projects, where requirements are often rather loose.  The average data warehouse project takes 16 months to deliver according to TDWI, and that is a long time in this turbulent world when management has its budgets adjusted and people are looking for projects to cut.  If your project can deliver something meaningful quickly i.e. a piece of the overall problem, then your project sponsor has a lot better chance of defending the project.  If all the review committee can see is costs then things will be harder.  Many real projects I have been involved with have been killed in this way.

One way to improve your odds of delivering something quickly is to use a data warehouse package, where at least some of the functionality is already pre-built for you.  Packages may or may not be cheaper than custom-build, but they should be quicker.  If you can pick off a chunk of the project and deliver reports back to the sponsor that add value early on then the project is much more likely to survive than one that is still delivering a grand enterprise logical data model. These days there are several packaged or semi-packaged alternatives to custom build.  A good overview of the packaged data warehouse market was done this year by Bloor and can be downloaded for free here.  

By developing a robust business case and by delivering benefits iteratively your project greatly improves its chances of survival.  When the budget sharks come circling, it is nice to have a life raft.