Ploughing a new data furrow

As there was some interest in the last blog I thought it might be useful for people to know a little more about Exeros. The company was set up by an ex-founder of ACTA, and did a series A funding round in 2004. The company has some innovative technology which essentially reverse-engineers the structure of data by looking at the data values in database tables and files. This is different from some other profiling approaches, which often examine metadata (column headings and the like) rather than data values. In this way it “discovers” business rules inherent in the data, and as a by-product also discovers how well the data adheres to those rules. In one customer example they have, a customer gave Exeros a sample dataset and the product whirred away and discovered its structure. All well and good, but it also pointed out that in one case there was only a 98% match of the data to the structure, which caused the customer to say: “that’s impossible, that is a mandatory field”. Well, perhaps it was, but the data was still in error! In my own experience of MDM projects there are plenty of such moments; customers have an amusingly naive view of how good their data quality really is.
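
To make the distinction between looking at values and looking at metadata concrete, here is a toy sketch in Python. It is emphatically not how Exeros works (I have no visibility of their algorithms); it simply assumes that a "rule" can be approximated by the most common character pattern in a column, and that adherence is the share of values matching that pattern. The function names, the `account_id` and `country` columns and the sample rows are all made up for illustration.

```python
import re
from collections import Counter


def shape(value):
    """Reduce a value to a coarse pattern: digits become 9, letters become A."""
    if value is None or value == "":
        return "<empty>"
    digits_masked = re.sub(r"[0-9]", "9", str(value))
    return re.sub(r"[A-Za-z]", "A", digits_masked)


def profile_column(values):
    """Infer the dominant pattern from the values and measure adherence to it."""
    counts = Counter(shape(v) for v in values)
    pattern, hits = counts.most_common(1)[0]
    return {"rule": pattern, "match_rate": hits / len(values)}


def profile_table(rows):
    """Profile every column of a non-empty list of dict rows."""
    columns = rows[0].keys()
    return {c: profile_column([r.get(c) for r in rows]) for c in columns}


# Hypothetical sample: 100 rows, 2 of which lack the "mandatory" account_id.
rows = [{"account_id": f"AC{i:04d}", "country": "GB"} for i in range(98)]
rows += [{"account_id": "", "country": "GB"}] * 2

report = profile_table(rows)
for column, result in report.items():
    print(column, result["rule"], f"{result['match_rate']:.0%}")
```

Note that nothing here consults a schema or a column definition: the rule and the 98% figure for `account_id` both come from the values alone, which is why a field declared mandatory can still show up as less than fully populated.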

Other companies that purport to do data discovery are Sypherlink and Zoomix (http://www.zoomix.com/zoomix_wall.asp), but Sypherlink in particular seems to use more conventional metadata-based profiling. The functionality that Exeros provides is useful for situations like master data integration projects, or data consolidation projects. It could also be used to help in building staging areas for ETL builds, where multiple sources of data often throw up all sorts of issues that have to be resolved manually. Exeros does not have a repository as such, and generates its analysis as output either in XML form or as feeds into tools like Business Objects or ETL tools such as Informatica and IBM/Ascential.

The company started selling commercially in 2006, and already has a dozen or so customers in production. So far these have been mainly in the financial services area, which has plenty of data issues and stiff compliance reporting needs, but there is no reason why the technology should not be applied to any industry as far as I can see. There seems to have been some pretty serious R&D here, with a product team of 40 people, and the company seems to me to have kept to an admirably tight focus so far rather than trying to claim it solves the world’s problems on its own. Over time I would expect to see it having opportunities to partner with MDM vendors, especially those who take a generic MDM approach rather than, say, CDI-only vendors. The broader the range of data, the more complex the data issues that emerge.

Marketing the company as “data discovery” rather than “data quality” is a good idea, as the approach is genuinely different, and it avoids the company being pigeon-holed alongside more established companies. The drawback is that they are essentially trying to carve out a new market, never an easy thing, and will encounter the usual emerging company issues with conservative buyers and analysts who prefer to neatly drop them into an existing slot. However, in my view the problem they are tackling is very real, and the approach seems innovative, so they should continue on this path. If they make enough customers happy then the analysts will soon come around to their view.