On Learning Discrete Graphical Models using Group-Sparse Regularization

Abstract: We study the problem of learning the graph structure associated with a general discrete graphical models (each variable can take any of m > 1 values, the clique factors have maximum size c >= 2) from samples, under high-dimensional scaling where the number of variables p could be larger than the number of samples n. We provide a quantitative consistency analysis of a procedure based on node-wise multi-class logistic regression with group-sparse regularization. We first consider general m-ary pairwise models — where each factor depends on at most two variables. We show that when the number of samples scale as n > K(m-1)^2 d^2 log ((m-1)^2(p-1)) — where d is the maximum degree and K a fixed constant — the procedure succeeds in recovering the graph with high probability. For general models with c-way factors, the natural multi-way extension of the pairwise method quickly becomes very computationally complex. So we studied the effectiveness of using the pairwise method even while the true model has higher order factors. Surprisingly, we show that under slightly more stringent conditions, the pairwise procedure still recovers the graph structure, when the samples scale as n > K (m-1)^2 d^{3/2c – 1} log ( (m-1)^c (p-1)^{c-1} ).

Download: pdf

Citation

On Learning Discrete Graphical Models using Group-Sparse Regularization (pdf, software)
A. Jalali, P. Ravikumar, V. Vasuki, S. Sanghavi.
In International Conference on Artificial Intelligence and Statistics (AISTATS), April 2011.

Bibtex:
@inproceedings{jalali2011onlearnin, author = "Ali Jalali AND Pradeep Ravikumar AND Vishvas Vasuki AND Sujay Sanghavi", title = "On Learning Discrete Graphical Models using Group-Sparse Regularization", booktitle = "International Conference on Artificial Intelligence and Statistics (AISTATS)", year = "2011", month = "apr", abstract = "We study the problem of learning the graph structure associated with a general discrete graphical models (each variable can take any of m > 1 values, the clique factors have maximum size c >= 2) from samples, under high-dimensional scaling where the number of variables p could be larger than the number of samples n. We provide a quantitative consistency analysis of a procedure based on node-wise multi-class logistic regression with group-sparse regularization. We first consider general m-ary pairwise models — where each factor depends on at most two variables. We show that when the number of samples scale as n > K(m-1)^2 d^2 log ((m-1)^2(p-1)) — where d is the maximum degree and K a fixed constant — the procedure succeeds in recovering the graph with high probability. For general models with c-way factors, the natural multi-way extension of the pairwise method quickly becomes very computationally complex. So we studied the effectiveness of using the pairwise method even while the true model has higher order factors. Surprisingly, we show that under slightly more stringent conditions, the pairwise procedure still recovers the graph structure, when the samples scale as n > K (m-1)^2 d^{3/2c – 1} log ( (m-1)^c (p-1)^{c-1} )." }

Center for Big Data Analytics

On Learning Discrete Graphical Models using Group-Sparse Regularization

Ali Jalali, Pradeep Ravikumar, Vishvas Vasuki, Sujay Sanghavi

Download: pdf

Citation