Dirty Statistical Models

Eunho Yang, Pradeep Ravikumar

Abstract:   We consider the problem of recovering the graph structure of a “hub-networked” Ising model given iid samples, under high-dimensional settings, where number of nodes p could be potentially larger than the number of samples n. By a “hub-networked” graph, we mean a graph with a few “hub nodes” with very large degrees. State of the art estimators for Ising models have a sample complexity that scales polynomially with the maximum node-degree, and are thus ill-suited to recovering such graphs with a few hub nodes. Some recent proposals for specifically recovering hub graphical models do not come with theoretical guarantees, and even empirically provide limited improvements over vanilla Ising model estimators. Here, we show that under such low sample settings, instead of estimating “difficult” components such as hub-neighborhoods, we can use quantitative indicators of our inability to do so, and thereby identify hub-nodes. This simple procedure allows us to recover hub-networked graphs with very strong statistical guarantees even under very low sample settings.

Download: pdf

Citation

  • Dirty Statistical Models (pdf, software)
    E. Yang, P. Ravikumar.
    In Neural Information Processing Systems (NIPS), 2013.

    Bibtex: