Andy on Enterprise Software

A Lively Data Warehouse Appliance

February 15, 2008

DATAllegro was one of the earlier companies to market (2003) in the recent stampede of what I call “fast databases”, a term that covers appliances and other approaches to speedy analytics (such as in-memory or column-oriented databases). Initially DATAllegro had its own hardware stack (like Netezza), but it now uses a more open combination of EMC storage and Dell servers (with a Cisco InfiniBand interconnect). It runs on the well-proven Ingres database, which has the advantage of being more “tuneable” than some other open databases such as MySQL.

Because it is built on a standard relational database, plugging in business intelligence tools is easy: the product is certified for the major BI tools such as Cognos and Business Objects, and recently MicroStrategy. It can also work with Informatica and Ascential DataStage (now IBM) for ETL.

Each fast database vendor has its own angle on why its technology is the best, but DATAllegro has a couple of differentiators. One is that it does well with mixed workloads, where concurrent loads and even updates hit the database alongside queries. Another is its new “grid” technology, which allows customers to deal with the age-old compromise of a centralised warehouse versus decentralised data marts. A centralised warehouse is simplest to maintain but creates a bottleneck and is hard to scale, whereas decentralised marts quickly become uncoordinated and can lead to a lack of business confidence in the data. The DATAllegro grid uses node-to-node hardware transfer to allow dependent copies of data marts to be maintained from a central data warehouse. With transfer speeds of up to 1 TB a minute (!) claimed, such a deployment allows companies to have their cake and eat it. This technology is in use at one early customer site, and is just being released.

DATAllegro has set its sights firmly on the very high end of data volumes, the kind encountered by retailers and telcos. One large customer apparently has a live 470 TB database implementation, though since the company is very coy about naming its customers I cannot validate this. Still, this is enough data to give most DBAs sleepless nights, so it is fair to say that this is at the rarefied end of the data volume spectrum. This is territory firmly occupied by Teradata and Netezza (and to a lesser extent Greenplum). The company is tight-lipped about its number of customers (I can find only one named customer on its website), its revenues and its profitability, making it hard to know what market momentum is being achieved. However, its technology seems to me to be based on solid foundations, and the company has a large installed base of Teradata customers to attack. Interestingly, Oracle customers can be a harder sell, not because of the technology but because of the weight of stored procedures and triggers that customers have written in Oracle’s proprietary extension to the SQL standard, which makes porting a major issue.

If DATAllegro can encourage more of its customers to go public, it will be able to raise its profile further and avoid being painted as a niche vendor. Being secretive about customer and revenue numbers seems to me self-defeating, as it allows competitors to spread fear, uncertainty and doubt: sunlight is the best disinfectant, as Louis Brandeis so wisely said.