Abstract: In this paper, we propose a Sparse Random Features algorithm, which learns a sparse non-linear predictor by minimizing an ℓ1-regularized objective function over the Hilbert space induced by a kernel function. By interpreting the algorithm as Randomized Coordinate Descent in an infinite-dimensional space, we show that the proposed approach converges to a solution within ϵ-precision of that obtained by an exact kernel method by drawing O(1/ϵ) random features, in contrast to the O(1/ϵ²) rate achieved by current Monte Carlo analyses of Random Features. In our experiments, the Sparse Random Features algorithm obtains a sparse solution that requires less memory and prediction time, while maintaining comparable performance on regression and classification tasks. Moreover, as an approximate solver for the infinite-dimensional ℓ1-regularized problem, the randomized approach enjoys stronger convergence guarantees than a Boosting approach in the setting where the greedy Boosting step cannot be performed exactly.
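The following is a minimal, illustrative sketch of the random-features-plus-ℓ1 idea described in the abstract, assuming a Gaussian (RBF) kernel approximated with random Fourier features and using scikit-learn's Lasso as the ℓ1-regularized solver. The round-by-round feature drawing loosely mirrors the randomized coordinate descent view; the paper's exact working-set and step-size rules are not reproduced, and all parameter names (gamma, lam, n_rounds, feats_per_round) are illustrative choices, not from the paper.

    import numpy as np
    from sklearn.linear_model import Lasso

    def sparse_random_features(X, y, gamma=1.0, lam=0.01,
                               n_rounds=10, feats_per_round=50, seed=0):
        """Illustrative sketch: draw random Fourier features for an RBF
        kernel in rounds, refitting an l1-regularized (Lasso) objective
        each round so that only a sparse subset of features is kept."""
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        W = np.empty((d, 0))   # random frequencies drawn so far
        b = np.empty(0)        # random phases drawn so far
        model = None
        for _ in range(n_rounds):
            # Draw a new block of random features (Rahimi-Recht style):
            # w ~ N(0, 2*gamma*I), b ~ Uniform[0, 2*pi).
            W_new = rng.normal(scale=np.sqrt(2.0 * gamma),
                               size=(d, feats_per_round))
            b_new = rng.uniform(0.0, 2.0 * np.pi, size=feats_per_round)
            W = np.hstack([W, W_new])
            b = np.concatenate([b, b_new])
            # Random Fourier feature map on all features drawn so far.
            Z = np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)
            # l1-regularized fit: features with zero weight can be dropped.
            model = Lasso(alpha=lam, max_iter=5000).fit(Z, y)
        kept = np.flatnonzero(model.coef_)
        return W[:, kept], b[kept], model.coef_[kept], model.intercept_

Because the ℓ1 penalty drives many coordinates to exactly zero, only the surviving features (and their random parameters) need to be stored for prediction, which is the source of the memory and prediction-time savings mentioned in the abstract.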
Download: pdf
Citation
- Sparse Random Feature Algorithm as Coordinate Descent in Hilbert Space (pdf, software)
I. Yen, T. Lin, S. Lin, P. Ravikumar, I. Dhillon.
In Neural Information Processing Systems (NIPS), pp. 2456–2464, December 2014.
Bibtex:
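(A BibTeX entry reconstructed from the citation above; the entry key is an illustrative choice, not taken from the source.)

    @inproceedings{yen2014sparse,
      author    = {Yen, I. and Lin, T. and Lin, S. and Ravikumar, P. and Dhillon, I.},
      title     = {Sparse Random Feature Algorithm as Coordinate Descent in Hilbert Space},
      booktitle = {Neural Information Processing Systems (NIPS)},
      pages     = {2456--2464},
      month     = {December},
      year      = {2014}
    }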