Supervised Link Prediction Using Multiple Sources

Zhengdong Lu, Berkant Savas, Wei Tang, Inderjit Dhillon

Abstract: Link prediction is a fundamental problem in social network analysis and modern-day commercial applications such as Facebook and Myspace. Most existing research approaches this problem by exploring the topological structure of a social network using only one source of information. However, in many application domains, in addition to the social network of interest, there are a number of auxiliary social networks and/or derived proximity networks available. The contribution of the paper is twofold: (1) a supervised learning framework that can effectively and efficiently learn the dynamics of social networks in the presence of auxiliary networks; (2) a feature design scheme for constructing a rich variety of path-based features using multiple sources, and an effective feature selection strategy based on structured sparsity. Extensive experiments on three real-world collaboration networks show that our model can effectively learn to predict new links using multiple sources, yielding higher prediction accuracy than unsupervised and single-source supervised models.

  • Supervised Link Prediction Using Multiple Sources (pdf, software)
    Z. Lu, B. Savas, W. Tang, I. Dhillon.
    In IEEE International Conference on Data Mining (ICDM), pp. 923-928, December 2010.