Abstract: Co-clustering, or simultaneous clustering of rows and columns of a two-dimensional data matrix,
is rapidly becoming a powerful data analysis technique. Co-clustering has enjoyed wide success in
varied application domains such as text clustering, gene-microarray analysis, natural language processing
and image, speech and video analysis. In this paper, we introduce a partitional co-clustering
formulation that is driven by the search for a good matrix approximation—every co-clustering is
associated with an approximation of the original data matrix and the quality of co-clustering is
determined by the approximation error. We allow the approximation error to be measured using
a large class of loss functions called Bregman divergences that include squared Euclidean distance
and KL-divergence as special cases. In addition, we permit multiple structurally different
co-clustering schemes that preserve various linear statistics of the original data matrix. To accomplish
the above tasks, we introduce a new minimum Bregman information (MBI) principle that
simultaneously generalizes the maximum entropy and standard least squares principles, and leads
to a matrix approximation that is optimal among all generalized additive models in a certain natural
parameter space. Analysis based on this principle yields an elegant meta-algorithm, special cases
of which include most previously known alternating-minimization-based clustering algorithms such
as k-means and co-clustering algorithms such as information-theoretic (Dhillon et al., 2003b) and
minimum sum-squared residue co-clustering (Cho et al., 2004). To demonstrate the generality and
flexibility of our co-clustering framework, we provide examples and empirical evidence on a variety
of problem domains and also describe novel co-clustering applications such as missing value
prediction and compression of categorical data matrices.
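
The two ideas at the core of the abstract, a Bregman divergence as the loss and a co-clustering judged by the error of its matrix approximation, can be illustrated with a minimal sketch. It is not the paper's meta-algorithm. It assumes the standard definition d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>, and it assumes the simplest approximation scheme, in which each entry is replaced by the mean of its co-cluster and the error is measured with squared Euclidean distance. All function names are hypothetical and chosen for illustration only.

```python
import math

def bregman_divergence(phi, grad_phi, x, y):
    """d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)> for vectors x, y."""
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, grad_phi(y)))
    return phi(x) - phi(y) - inner

# phi(x) = ||x||^2 generates squared Euclidean distance.
def sq_norm(x):
    return sum(xi * xi for xi in x)

def sq_norm_grad(x):
    return [2.0 * xi for xi in x]

# phi(x) = sum_i x_i log x_i (negative entropy) generates the I-divergence,
# which equals KL-divergence when x and y are probability vectors.
def neg_entropy(x):
    return sum(xi * math.log(xi) for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) + 1.0 for xi in x]

def cocluster_approximation(Z, row_labels, col_labels):
    """Replace every entry of Z by the mean of its (row cluster, column cluster) block."""
    sums, counts = {}, {}
    for i, row in enumerate(Z):
        for j, z in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] = sums.get(key, 0.0) + z
            counts[key] = counts.get(key, 0) + 1
    return [
        [sums[(row_labels[i], col_labels[j])] / counts[(row_labels[i], col_labels[j])]
         for j in range(len(row))]
        for i, row in enumerate(Z)
    ]

def squared_error(Z, Z_hat):
    """Total squared Euclidean approximation error between Z and Z_hat."""
    return sum((z - zh) ** 2
               for row, row_hat in zip(Z, Z_hat)
               for z, zh in zip(row, row_hat))

# Toy matrix (made up for illustration) with an obvious 2 x 2 block structure.
Z = [[1.0, 1.0, 5.0],
     [1.0, 1.0, 5.0],
     [9.0, 9.0, 0.0]]
good = squared_error(Z, cocluster_approximation(Z, [0, 0, 1], [0, 0, 1]))
bad = squared_error(Z, cocluster_approximation(Z, [0, 1, 1], [0, 0, 1]))
```

In this toy case the co-clustering that matches the block structure reproduces the matrix exactly, while the mismatched row assignment leaves a positive error, so the approximation error ranks the two co-clusterings as the formulation intends.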
- A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximations (pdf, software)
A. Banerjee, I. Dhillon, J. Ghosh, S. Merugu, D. Modha.
Journal of Machine Learning Research (JMLR) 8, pp. 1919-1986, August 2007.