Beyond Low Rank Matrix Factorization

Project Summary

Matrix factorization (MF) approaches are incredibly popular in several machine learning areas, from collaborative filtering to computer vision. While low rank MF methods have been extensively studied both theoretically and algorithmically, one often has additional information about the problem at hand. For example, in recommender systems one might have access to a social network of users, or a product co-purchasing graph. In other cases, one might be interested in making time-aware predictions, where temporal information plays a key role. We aim to develop novel frameworks and large-scale algorithms that incorporate such additional information.

Project Description

Matrix factorization (MF) approaches are incredibly popular in several machine learning areas, from collaborative filtering to computer vision. While low rank MF methods have been extensively studied both theoretically and algorithmically, one often has additional information about the problem at hand. For example, in recommender systems one might have access to a social network of users, or a product co-purchasing graph. In other cases, one might be interested in making time-aware predictions, where temporal information plays a key role. We aim to develop novel frameworks and large-scale algorithms that incorporate such additional information.

  1. Collaborative Filtering with Graphs:

    We tackle the problem of matrix completion when the variables are related to each other via a graph. We formulate a graph-regularized objective and derive an efficient alternating minimization scheme (GRALS) that solves problems with over 15 million observations up to two orders of magnitude faster than SGD-based methods; a minimal sketch of the alternating updates appears after the figure below. On the theoretical front, we show that such methods generalize weighted nuclear norm formulations, and we derive statistical consistency guarantees. We validate our results on both real-world and synthetic datasets. Furthermore, we extend this method to the PU (positive-unlabeled) learning setting, where the goal is to infer the missing entries of a binary matrix from only positive and unlabeled examples.

    Figure: GRALS vs. SGD and gradient descent. GRALS is up to two orders of magnitude faster than stochastic gradient descent.
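
What follows is a minimal, illustrative sketch of graph-regularized alternating least squares in the spirit of GRALS; it is not the project's released implementation, and the names grals and solve_factor, along with all hyperparameter values, are hypothetical. It assumes a dense matrix A, a 0/1 observation mask M, and graph Laplacians L_row and L_col built from the row and column graphs. Each alternating step minimizes a quadratic of the form ||M * (A - W H^T)||_F^2 + lam * tr(W^T L W), solved matrix-free with conjugate gradients.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import LinearOperator, cg

def solve_factor(A_obs, M, H, L, lam, W0, ridge=1e-6):
    """Minimize ||M * (A - W H^T)||_F^2 + lam * tr(W^T L W) over W via CG.

    The graph penalty couples all rows of W, so instead of per-row closed
    forms we solve the full (m*k)-dimensional normal equations matrix-free.
    A tiny ridge keeps the linear system positive definite.
    """
    m, k = W0.shape

    def matvec(w_flat):
        W = w_flat.reshape(m, k)
        # Data-fit term restricted to observed entries, plus graph penalty.
        return ((M * (W @ H.T)) @ H + lam * (L @ W) + ridge * W).ravel()

    op = LinearOperator((m * k, m * k), matvec=matvec)
    rhs = (A_obs @ H).ravel()          # A_obs is zero at unobserved entries
    w_flat, _ = cg(op, rhs, x0=W0.ravel())
    return w_flat.reshape(m, k)

def grals(A, M, L_row, L_col, k=5, lam=0.1, n_iters=15, seed=0):
    """Alternate CG solves for the row factors W and the column factors H."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = 0.1 * rng.standard_normal((m, k))
    H = 0.1 * rng.standard_normal((n, k))
    A_obs = M * A
    for _ in range(n_iters):
        W = solve_factor(A_obs, M, H, L_row, lam, W)      # fix H, solve for W
        H = solve_factor(A_obs.T, M.T, W, L_col, lam, H)  # fix W, solve for H
    return W, H

# Synthetic usage with chain graphs over rows and columns (hypothetical data).
rng = np.random.default_rng(1)
m, n = 60, 40
A = np.outer(np.linspace(1, 5, m), np.linspace(1, 2, n))
M = (rng.random((m, n)) < 0.3).astype(float)              # ~30% observed
L_row = laplacian(csr_matrix(np.eye(m, k=1) + np.eye(m, k=-1)))
L_col = laplacian(csr_matrix(np.eye(n, k=1) + np.eye(n, k=-1)))
W, H = grals(A, M, L_row, L_col)

Because the graph regularizer couples the rows of each factor, a sweep solves one large linear system with a handful of matrix-vector products rather than millions of tiny per-entry updates, which is the intuition behind the speedup over SGD shown in the figure above.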


Main Publications