Abstract: The L1-regularized Gaussian maximum likelihood estimator (MLE) has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix even under high-dimensional settings. However, it requires solving a difficult non-smooth log-determinant program with number of parameters scaling quadratically with the number of Gaussian variables. State-of-the-art methods thus do not scale to problems with more than 20,000 variables. In this paper, we develop an algorithm BIGQUIC, which can solve 1 million dimensional L1- regularized Gaussian MLE problems (which would thus have 1000 billion parameters) using a single machine, with bounded memory. In order to do so, we carefully exploit the underlying structure of the problem. Our innovations include a novel block-coordinate descent method with the blocks chosen via a clustering scheme to minimize repeated computations; and allowing for inexact computation of specific components. In spite of these modifications, we are able to theoretically analyze our procedure and show that BIG & QUIC can achieve super-linear or even quadratic convergence rates.
- BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables (pdf, software)
C. Hsieh, M. Sustik, I. Dhillon, P. Ravikumar, R. Poldrack.
In Neural Information Processing Systems (NIPS), December 2013. (Oral)