Overlapping Community Detection Using Seed Set Expansion

Joyce Whang, David Gleich, Inderjit Dhillon

Abstract: Community detection is an important task in network analysis. A community (also referred to as a cluster) is a cohesive set of vertices that has more connections inside the set than outside. In many social and information networks, these communities naturally overlap. For instance, in a social network, each vertex corresponds to an individual who usually participates in multiple communities. One of the most successful techniques for finding overlapping communities is based on local optimization and expansion of a community metric around a seed set of vertices. In this paper, we propose an efficient overlapping community detection algorithm using a seed set expansion approach. In particular, we develop new seeding strategies for a personalized PageRank scheme that optimizes the conductance community score. The key idea of our algorithm is to find good seeds, and then expand these seed sets using the personalized PageRank clustering procedure. Experimental results show that this seed set expansion approach outperforms other state-of-the-art overlapping community detection methods. We also show that our new seeding strategies are better than previous strategies, and are thus effective in finding good overlapping clusters in a graph.
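
The expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an undirected graph stored as a dict of neighbor sets with no isolated vertices, uses plain power iteration for personalized PageRank, and takes the seed set as given by hand (the paper's seeding strategies are not reproduced here). The function names, the teleport parameter alpha, and the toy graph are all hypothetical choices made for illustration.

```python
def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Power iteration; restart mass is spread uniformly over the seed set.

    Assumes every vertex has at least one neighbor (no isolated vertices).
    """
    restart = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {v: (1.0 - alpha) * restart[v] for v in adj}
        for u in adj:
            share = alpha * rank[u] / len(adj[u])
            for w in adj[u]:
                nxt[w] += share
        rank = nxt
    return rank


def conductance(adj, community):
    """Cut edges divided by the smaller of the two side volumes."""
    cut = sum(1 for u in community for w in adj[u] if w not in community)
    vol = sum(len(adj[u]) for u in community)
    total = sum(len(adj[u]) for u in adj)
    denom = min(vol, total - vol)
    return cut / denom if denom > 0 else 1.0


def expand_seed_set(adj, seeds, alpha=0.85):
    """Sweep over vertices by degree-normalized PageRank; keep the best prefix."""
    rank = personalized_pagerank(adj, seeds, alpha)
    order = sorted(
        (v for v in adj if rank[v] > 0),
        key=lambda v: rank[v] / len(adj[v]),
        reverse=True,
    )
    best, best_phi = set(seeds), conductance(adj, set(seeds))
    prefix = set()
    for v in order:
        prefix.add(v)
        phi = conductance(adj, prefix)
        if phi < best_phi:
            best, best_phi = set(prefix), phi
    return best, best_phi


# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = {v: set() for v in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

community, phi = expand_seed_set(adj, {0})
print(sorted(community), phi)
```

Running the expansion from several seeds and collecting the resulting sets is what allows communities to overlap, since nothing prevents a vertex from landing in more than one expanded set.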

Citation

  • Overlapping Community Detection Using Seed Set Expansion
    J. Whang, D. Gleich, I. Dhillon.
    In ACM Conference on Information and Knowledge Management (CIKM), pp. 2099-2108, October 2013. (Oral)
