Abstract: We consider the general k-way clustering problem in signed social networks where relationships between entities can be either positive or negative. Motivated by social balance theory, the clustering problem in signed networks aims to find mutually antagonistic groups such that entities within the same group are friends with each other. A recent method proposed in Kunegis et al. extended the spectral clustering algorithm to the signed network setting by considering the signed graph Laplacian. This has been shown to be equivalent to finding clusters that minimize the 2-way signed ratio cut. In this paper, we show that there is a fundamental weakness when we directly extend the signed Laplacian to the k-way clustering problem. To overcome this weakness, we formulate new k-way objectives for signed networks. In particular, we propose a criterion that is analogous to the normalized cut, called balance normalized cut, which is not only theoretically sound but also experimentally effective in k-way clustering. In addition, we prove that these objectives are equivalent to weighted kernel k-means objectives by choosing an appropriate kernel matrix. Employing this equivalence, we develop a multilevel clustering framework for signed networks. In this framework, we coarsen the graph level by level and refine the clustering results at each level via a k-means based algorithm so that the signed clustering objectives are optimized. This approach gives good quality clustering results, and is also highly efficient and scalable. In experiments, we see that our multilevel approach is competitive to other state-of-the-art methods, while it is much faster and more scalable. In particular, the largest graph we have considered in our experiments contains 1 million nodes and 100 million edges — this graph can be clustered in less than four hundred seconds using our algorithm.
Download: pdf, slides
Citation
- Scalable Clustering of Signed Networks using Balance Normalized Cut (pdf, slides, software)
K. Chiang, J. Whang, I. Dhillon.
In ACM Conference on Information and Knowledge Management (CIKM), pp. 615-624, October 2012. (Oral)
Bibtex: