Mixed Linear Regression with Multiple Components

Abstract: In this paper, we study the mixed linear regression (MLR) problem, where the goal is to recover multiple underlying linear models from their unlabeled linear measurements. We propose a non-convex objective function which we show is locally strongly convex in the neighborhood of the ground truth. We use a tensor method for initialization so that the initial models are in the local strong convexity region. We then employ general convex optimization algorithms to minimize the objective function. To the best of our knowledge, our approach provides the first exact recovery guarantees for the MLR problem with K≥2 components. Moreover, our method has near-optimal computational complexity O(Nd) as well as near-optimal sample complexity O(d) for constant K. We also show that our non-convex formulation can be extended to solve the subspace clustering problem. In particular, when initialized within a small constant distance to the true subspaces, our method converges to the global optima (and recovers the true subspaces) in time linear in the number of points. Finally, our empirical results indicate that even with random initialization, our approach converges to the global optima in linear time, providing a speed-up of up to two orders of magnitude.
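To make the problem setup concrete, here is a minimal sketch of noiseless MLR data and one possible non-convex objective. The abstract does not spell out the objective, so the product-of-squared-residuals form below is an assumption made for illustration, as are all function names and the Gaussian data model. The sketch covers only the objective and its gradient; the tensor initialization and the optimization loop are not shown.

```python
import numpy as np


def make_mlr_data(n, d, k, rng):
    """Noiseless MLR data: each y_i is produced by one of k hidden linear models."""
    W = rng.standard_normal((k, d))      # hypothetical ground-truth models, one per row
    X = rng.standard_normal((n, d))      # measurement vectors x_i
    z = rng.integers(0, k, size=n)       # hidden labels, never shown to the solver
    y = np.einsum("nd,nd->n", X, W[z])   # y_i = <x_i, w_{z_i}>
    return X, y, W


def objective(W, X, y):
    """Assumed objective: sum_i prod_k (y_i - <x_i, w_k>)^2."""
    R = y[:, None] - X @ W.T             # residual of sample i under model k, shape (n, k)
    return np.sum(np.prod(R ** 2, axis=1))


def gradient(W, X, y):
    """Gradient of the assumed objective with respect to each model w_j."""
    R = y[:, None] - X @ W.T
    S = R ** 2
    G = np.zeros_like(W)
    for j in range(W.shape[0]):
        # Product of squared residuals over all components except j.
        others = np.prod(np.delete(S, j, axis=1), axis=1)
        # d/dw_j [r_j^2 * others] = -2 * r_j * others * x_i, summed over samples.
        G[j] = -2.0 * (R[:, j] * others) @ X
    return G


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y, W_true = make_mlr_data(n=200, d=5, k=2, rng=rng)
    W_near = W_true + 0.1 * rng.standard_normal(W_true.shape)
    print("objective at ground truth:", objective(W_true, X, y))
    print("objective near ground truth:", objective(W_near, X, y))
```

Under this assumed form, each sample's term vanishes as soon as any one model fits it, so in the noiseless case the objective is zero at the ground truth (and at any permutation of the models) without ever needing the labels.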

Download: pdf

Citation

Mixed Linear Regression with Multiple Components (pdf, software)
K. Zhong, P. Jain, I. Dhillon.
In Neural Information Processing Systems (NIPS), December 2016.

Bibtex:
@inproceedings{zhong2016mixedline,
  author = "Kai Zhong AND Prateek Jain AND Inderjit S. Dhillon",
  title = "Mixed Linear Regression with Multiple Components",
  booktitle = "Neural Information Processing Systems (NIPS)",
  year = "2016",
  month = "dec",
  abstract = "In this paper, we study the mixed linear regression (MLR) problem, where the goal is to recover multiple underlying linear models from their unlabeled linear measurements. We propose a non-convex objective function which we show is locally strongly convex in the neighborhood of the ground truth. We use a tensor method for initialization so that the initial models are in the local strong convexity region. We then employ general convex optimization algorithms to minimize the objective function. To the best of our knowledge, our approach provides first exact recovery guarantees for the MLR problem with K≥2 components. Moreover, our method has near-optimal computational complexity O(Nd) as well as near-optimal sample complexity O(d) for constant K. Furthermore, we show that our non-convex formulation can be extended to solving the subspace clustering problem as well. In particular, when initialized within a small constant distance to the true subspaces, our method converges to the global optima (and recovers true subspaces) in time linear in the number of points. Furthermore, our empirical results indicate that even with random initialization, our approach converges to the global optima in linear time, providing speed-up of up to two orders of magnitude."
}
