When is an appliance not an appliance?

What we call things is important. The recent rise of data warehouse “appliances”, pioneered by Netezza (and arguably Teradata before that), is an interesting case in point. For years the relational database vendors spent their energy in making sure that transaction systems ran quickly and reliably. Business intelligence applications were not a major focus, and this led to a number of approaches to dealing with very large data warehouse applications. Certain types of index scheme would work very well for read-only BI queries, for example, and Red Brick was an early example of a database optimised as such. Later Teradata did a superb job of carving out a high-end niche by using parallel processing hardware and specialist database software to take advantage of this properly. They did such a good job that after a while Teradata almost became synonymous with large data warehouses, of the types typically encountered in retail banks, supermarket chains, telcos etc. Oracle and others made some half-hearted attempts to fight back with features like star joins, but by then it was too late: the specialist data warehouse device, in the form of Teradata, had become established. Of course such projects were still large and complex. Most data warehouse project costs are associated with people, not hardware or software, and this does not change whether you are using SQL Server or Teradata as your database.

However, marketing can at times (not often, but sometimes) be a clever and subtle thing. When Netezza brought out essentially a device like Teradata, but quicker and cheaper, the label “appliance” was used, and a very clever one it is. In normal English usage an appliance is something that we just plug in, like a toaster or a coffee maker. Without making any such overt claims, the “appliance” label has a comforting implication that your data warehouse project will have that plug-in-and-go, toaster-like quality previously lacking with pesky traditional databases. Given that a DW appliance is just some clever hardware and an optimised database, your project issues are in fact identical to those of any other DW project. Analysis, user requirements, data quality, sourcing, design and reporting all have to be done, although the appliance may certainly be able to handle large volumes of data at a much better price point than a traditional hardware/database combination. Since the hardware and software on a project may typically account for less than 20% of the project costs, this is an undeniably useful thing, but hardly takes us into toaster territory.
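To put rough numbers on that last point, here is a back-of-envelope sketch. The only figure taken from the text above is the 20% ceiling on the hardware and software share; the discount levels and the function name are purely illustrative assumptions, not data from any real project.

```python
def total_project_saving(platform_share: float, platform_discount: float) -> float:
    """Fraction of the whole project cost saved when only the platform
    (hardware + software) gets cheaper and people costs stay the same."""
    if not 0.0 <= platform_share <= 1.0:
        raise ValueError("platform_share must be between 0 and 1")
    if not 0.0 <= platform_discount <= 1.0:
        raise ValueError("platform_discount must be between 0 and 1")
    return platform_share * platform_discount


if __name__ == "__main__":
    # Assumption: platform is 20% of project cost (the upper figure quoted above).
    share = 0.20
    # Hypothetical discounts an appliance might offer on the platform alone.
    for discount in (0.25, 0.50, 0.75, 1.00):
        saving = total_project_saving(share, discount)
        print(f"platform {discount:.0%} cheaper -> whole project {saving:.0%} cheaper")
```

Even in the extreme case where the platform is free, the project as a whole cannot get more than 20% cheaper under these assumptions, because the people costs are untouched.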

Yet the label matters. In a rather breathless blog yesterday:

http://www.itbusinessedge.com/blogs/mia/index.php/2006/09/05/flaming-web-20/

Mike Stevens, whom I don’t know personally but who appears to have a background in PR rather than hands-on data warehouse project implementation, claims that appliances spell “trouble for traditional data warehouse vendors” since an appliance may cost just USD 150k whereas “conventional solutions cost millions”. He falls into the language trap of the appliance. Your data warehouse still has to deal with all those people-intensive things (data sourcing, reporting, testing) whether you use a conventional SQL database and a regular server, or a specialist DW appliance. The issues are all identical, except that with an appliance you have some additional cost, since less familiar skills will need to be brought to bear (there are more Oracle skills out there than Netezza ones). The savings on hardware by using an appliance may be very significant and comfortably justified on a large data warehouse, but such a project is not going to cost USD 150k and a quick plug in the wall socket.

If this kind of misconception is so easily repeated by journalists (or at least bloggers) then I wonder how widespread this view is amongst IT managers, and how much this has helped data warehouse “appliances” catch on. Would Netezza have done quite so well if they had been labelled something less reassuring, like a “data warehouse turbo toolkit”? It was said that HP was so bad at marketing it would, if it sold sushi, describe it as “cold dead fish”. The “appliance” vendors show that smart marketing can still be done within hi-tech.

5 thoughts on “When is an appliance not an appliance?”

  1. P.S. The savings mentioned in my previous post were purely development cost – nothing to do with the savings made on the cost of the hardware/software itself (which was also considerable).

  2. Andy – totally agree that an appliance is not a panacea for all your DW Project issues. However, in my experience, where they can save you both time and money during a DW implementation is physical schema design.

    Forget about indexes. Forget about tablespaces, fill factors, page sizes, buffer pools, clustering, etc. Certainly with Netezza you don’t need any of these things – the hardware is managed for you – which not only makes the system easier to build, it also costs less to maintain (no need for constant tweaking).

    Even better than that though, because these things are just so unbelievably fast, I’ve found there is less need to create (and populate, and maintain) layers of hierarchies of aggregated data – results that used to be needed to be ‘cached’ in such structures can typically be derived ‘on the fly’.

    As a result, and to put a figure on it, on my last project (not the same one Nigel Thomas was working on I hasten to add) this resulted in roughly 6X savings over using a conventional RDBMS for > 6X the performance.

  3. The ‘appliance’ tag is certainly good spin.

    Despite this it is going to be interesting all around in the warehousing world this year. Netezza, DATallegro and Kognitio all pushing some sort of ‘appliance’ message (although I think the latter can be licensed software only). Then there is Vertica Systems which, like Kognitio, is a columnar system which apparently will run on commodity hardware. On top of which HP wants in on Teradata’s game. Going to put a lot of pressure on Teradata at the very least.

    There’s some good technical stuff at this blog on warehousing appliances – http://www.dbms2.com/2007/01/27/data-warehouse-appliance-hardware-strategies/#more-135

  4. Thanks Nigel – very interesting. It is good to hear of some real life stories like this as it is usually hard to separate the marketing from the reality. Certainly at this very large scale it sounds like Netezza has a bright future.

  5. Andy, you linked to the wrong post (although that was pretty breathless too).

    The correct link is http://www.itbusinessedge.com/blogs/mia/index.php/2007/02/14/data-warehousings-darwinian-struggle/.

    Your warning points are well made. But the economic case for the Netezza model can still be made – rather easily. I’m aware of a POC where the capital costs of a 50TB Netezza solution are considerably less than the business is being charged back each year just for the cpus and discs supporting the equivalent Oracle DW. The clincher is that Netezza doesn’t require the same level of DBA expertise – it just hasn’t got any tuning knobs to fiddle with. Oh, and critical queries run up to 150 times faster…

    Regards Nigel
