Abstract: In this paper, we propose a Sparse Random Features algorithm, which learns a sparse non-linear predictor by minimizing an ℓ1-regularized objective function over the Hilbert Space induced from a kernel function. By interpreting the algorithm as Randomized Coordinate Descent in an infinite-dimensional space, we show the proposed approach converges to a solution within ϵ-precision of that using an exact kernel method, by drawing O(1/ϵ) random features, in contrast to the O(1/ϵ2) convergence achieved by current Monte-Carlo analyses of Random Features. In our experiments, the Sparse Random Feature algorithm obtains a sparse solution that requires less memory and prediction time, while maintaining comparable performance on regression and classification tasks. Moreover, as an approximate solver for the infinite-dimensional ℓ1-regularized problem, the randomized approach also enjoys better convergence guarantees than a Boosting approach in the setting where the greedy Boosting step cannot be performed exactly.
- Sparse Random Feature Algorithm as Coordinate Descent in Hilbert Space (pdf)
I. Yen, T. Lin, S. Lin, P. Ravikumar, I. Dhillon.
In Neural Information Processing Systems (NIPS), pp. 2456-2464, December 2014.