Overlapping Community Detection Using Neighborhood-Inflated Seed Expansion

Joyce Whang, David Gleich, Inderjit Dhillon

Abstract:   Community detection is an important task in network analysis. A community (also referred to as a cluster) is a set of cohesive vertices that have more connections inside the set than outside. In many social and information networks, these communities naturally overlap. For instance, in a social network, each vertex in a graph corresponds to an individual who usually participates in multiple communities. In this paper, we propose an efficient overlapping community detection algorithm using a seed expansion approach. The key idea of our algorithm is to find good seeds, and then greedily expand these seeds based on a community metric. Within this seed expansion method, we investigate the problem of how to determine good seed nodes in a graph. In particular, we develop new seeding strategies for a personalized PageRank clustering scheme that optimizes the conductance community score. An important step in our method is the neighborhood inflation step where seeds are modified to represent their entire vertex neighborhood. Experimental results show that our seed expansion algorithm outperforms other state-of-the-art overlapping community detection methods in terms of producing cohesive clusters and identifying ground-truth communities. We also show that our new seeding strategies are better than existing strategies, and are thus effective in finding good overlapping communities in real-world networks.

Download: pdf

Citation

  • Overlapping Community Detection Using Neighborhood-Inflated Seed Expansion (pdf, software)
    J. Whang, D. Gleich, I. Dhillon.
    IEEE Transactions on Knowledge and Data Engineering (TKDE) 28(5), pp. 1272–1284, 2016.

    Bibtex: