Clustered Low Rank Approximation of Graphs in Information Science Applications

Abstract: In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and then approximates each cluster separately using existing methods, e.g., the singular value decomposition, or stochastic algorithms. The cluster-wise approximation are then extended to approximate the entire graph. This approach has several benets: (1) important community structure of the graph is preserved due to the clustering; (2) highly accurate low rank approximations are achieved; (3) the procedure is ecient both in terms of computational speed and memory usage; (4) better performance in problems from various applications compared to standard low rank approximation. Further, we generalize stochastic algorithms to the clustered low rank approximation framework and present theoretical bounds for the approximation error. Finally, a set of experiments, using large scale and real-world graphs, show that our methods outperform standard low rank matrix approximation algorithms.

Download: pdf

Citation

Clustered Low Rank Approximation of Graphs in Information Science Applications (pdf, software)
B. Savas, I. Dhillon.
In SIAM International Conference on Data Mining (SDM), pp. 164-175, April 2011.

Bibtex:
@inproceedings{savas2011clustered, author = "Berkant Savas AND Inderjit S. Dhillon", title = "Clustered Low Rank Approximation of Graphs in Information Science Applications", booktitle = "SIAM International Conference on Data Mining (SDM)", page = "164–175", year = "2011", month = "apr", abstract = "In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and then approximates each cluster separately using existing methods, e.g., the singular value decomposition, or stochastic algorithms. The cluster-wise approximation are then extended to approximate the entire graph. This approach has several bene ts: (1) important community structure of the graph is preserved due to the clustering; (2) highly accurate low rank approximations are achieved; (3) the procedure is ecient both in terms of computational speed and memory usage; (4) better performance in problems from various applications compared to standard low rank approximation. Further, we generalize stochastic algorithms to the clustered low rank approximation framework and present theoretical bounds for the approximation error. Finally, a set of experiments, using large scale and real-world graphs, show that our methods outperform standard low rank matrix approximation algorithms." }

Center for Big Data Analytics

Clustered Low Rank Approximation of Graphs in Information Science Applications

Berkant Savas, Inderjit Dhillon

Download: pdf

Citation