There is an interesting web forum that seeks to bring an open source approach to the world of data management. Topics of interest include the creation of open source de-duplication, profiling, matching and cleansing tools (hat tip to CW for pointing this out).
No doubt the tools here are at an early stage and won’t directly compare in breadth of functionality with those of a major data quality vendor. However, for many people with less sophisticated requirements, that may not matter. The rise of products like MySQL has shown how influential an open source product can become given the right circumstances.
I would be very interested to hear whether any readers of the blog have experience with the tools here, or any views on the merits or otherwise of an open approach to data quality and data integration.