A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation

Abstract: We consider the composite log-determinant optimization problem, arising from the ℓ1-regularized Gaussian maximum likelihood estimator of a sparse inverse covariance matrix, in a high-dimensional setting with a very large number of variables. Recent work has shown this estimator to have strong statistical guarantees in recovering the true structure of the sparse inverse covariance matrix, or alternatively the underlying graph structure of the corresponding Gaussian Markov Random Field, even in very high-dimensional regimes with a limited number of samples. In this paper, we are concerned with the computational cost of solving the above optimization problem. Our proposed algorithm partitions the problem into smaller sub-problems, and uses the solutions of the sub-problems to build a good approximation for the original problem. Our key idea for the divide step, which obtains a sub-problem partition, is as follows: we first derive a tractable bound on the quality of the approximate solution obtained from solving the corresponding sub-divided problems. Based on this bound, we propose a clustering algorithm that attempts to minimize this bound, in order to find effective partitions of the variables. For the conquer step, we use the approximate solution, i.e., the solution resulting from solving the sub-problems, as an initial point to solve the original problem, and thereby achieve a much faster computational procedure.
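The divide and conquer steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' software. The function names are hypothetical, the solver is a plain proximal gradient method standing in for whatever solver the paper uses, the partition is supplied by hand rather than found by the bound-minimizing clustering, and the ℓ1 penalty is assumed to apply to all entries, including the diagonal.

```python
import numpy as np


def objective(theta, S, lam):
    """-log det(theta) + tr(S theta) + lam * sum|theta_ij| (inf if not positive definite)."""
    try:
        chol = np.linalg.cholesky(theta)
    except np.linalg.LinAlgError:
        return np.inf
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    return -logdet + np.trace(S @ theta) + lam * np.abs(theta).sum()


def soft_threshold(A, t):
    """Entrywise proximal operator of t * ||.||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)


def prox_grad_solve(S, lam, theta0, n_iter=200, tol=1e-8):
    """Monotone proximal gradient descent from a positive definite theta0.

    Stand-in solver chosen for brevity (an assumption of this sketch).
    """
    theta = theta0.copy()
    f = objective(theta, S, lam)
    for _ in range(n_iter):
        grad = S - np.linalg.inv(theta)  # gradient of the smooth part
        step = 1.0
        while step > 1e-12:
            cand = soft_threshold(theta - step * grad, step * lam)
            cand = 0.5 * (cand + cand.T)  # keep the iterate symmetric
            f_cand = objective(cand, S, lam)
            if f_cand <= f:  # accept only positive definite, non-increasing steps
                break
            step *= 0.5
        else:
            return theta  # no acceptable step found
        change = np.abs(cand - theta).max()
        theta, f = cand, f_cand
        if change < tol:
            break
    return theta


def divide_and_conquer(S, lam, clusters, n_iter=200):
    """Solve each cluster's sub-problem, then warm-start the full problem."""
    p = S.shape[0]
    theta_bar = np.zeros((p, p))
    # Divide: one independent sub-problem per cluster of variables.
    for idx in clusters:
        idx = np.asarray(idx)
        S_sub = S[np.ix_(idx, idx)]
        init = np.diag(1.0 / (np.diag(S_sub) + lam))
        theta_bar[np.ix_(idx, idx)] = prox_grad_solve(S_sub, lam, init, n_iter)
    # Conquer: the block-diagonal approximation is the initial point.
    theta = prox_grad_solve(S, lam, theta_bar, n_iter)
    return theta_bar, theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 6))
    S = np.cov(X, rowvar=False)
    theta_bar, theta = divide_and_conquer(S, 0.1, [[0, 1, 2], [3, 4, 5]])
    print(objective(theta_bar, S, 0.1), objective(theta, S, 0.1))
```

Because the solver only accepts steps that do not increase the objective, the final estimate is never worse than the block-diagonal starting point. How much the warm start helps depends on how well the partition captures the true structure, which is what the clustering step in the paper is designed to address.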

Download: pdf

Citation

A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation (pdf, software)
C. Hsieh, I. Dhillon, P. Ravikumar, A. Banerjee.
In Neural Information Processing Systems (NIPS), pp. 2339-2347, December 2012.

Bibtex:
@inproceedings{hsieh2012adividea,
  author = "Cho-Jui Hsieh AND Inderjit S. Dhillon AND Pradeep Ravikumar AND Arindam Banerjee",
  title = "A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation",
  booktitle = "Neural Information Processing Systems (NIPS)",
  pages = "2339--2347",
  year = "2012",
  month = "dec",
  abstract = "We consider the composite log-determinant optimization problem, arising from the $\ell_1$-regularized Gaussian maximum likelihood estimator of a sparse inverse covariance matrix, in a high-dimensional setting with a very large number of variables. Recent work has shown this estimator to have strong statistical guarantees in recovering the true structure of the sparse inverse covariance matrix, or alternatively the underlying graph structure of the corresponding Gaussian Markov Random Field, even in very high-dimensional regimes with a limited number of samples. In this paper, we are concerned with the computational cost of solving the above optimization problem. Our proposed algorithm partitions the problem into smaller sub-problems, and uses the solutions of the sub-problems to build a good approximation for the original problem. Our key idea for the divide step, which obtains a sub-problem partition, is as follows: we first derive a tractable bound on the quality of the approximate solution obtained from solving the corresponding sub-divided problems. Based on this bound, we propose a clustering algorithm that attempts to minimize this bound, in order to find effective partitions of the variables. For the conquer step, we use the approximate solution, i.e., the solution resulting from solving the sub-problems, as an initial point to solve the original problem, and thereby achieve a much faster computational procedure."
}

Associated Projects

Large Scale Inverse Covariance Estimation

Center for Big Data Analytics
