Published: December 10, 2012
Abstract: [Plain Text Version]
Co-clustering is the simultaneous partitioning of the rows and columns of a matrix such that the blocks induced by the row/column partitions are good clusters. Motivated by several applications in text mining, market-basket analysis, and bioinformatics, this problem has attracted a lot of attention in the past few years. Unfortunately, to date, most of the algorithmic work on this problem has been heuristic in nature.
In this work we obtain the first approximation algorithms for the co-clustering problem. Our algorithms are simple and provide constant-factor approximations to the optimum. We also show that co-clustering is NP-hard, thereby complementing our algorithmic result.
A preliminary version of this paper appeared in the Proc. Proc. 27th ACM Symp. on Principles of Database Systems (PODS 2008).