Abstract: The geometric median (GM) is a classical method in statistics for robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 1/2. However, its computational complexity makes it infeasible for robustifying stochastic gradient descent (SGD) in high-dimensional optimization problems. In this paper, we show that by applying GM to only a judiciously chosen block of coordinates at a time and using a memory mechanism, one can retain the breakdown point of 1/2 for smooth non-convex problems, with non-asymptotic convergence rates comparable to those of SGD with GM, while achieving a significant speedup in training. We further validate the run-time and robustness of our approach empirically on several popular deep learning tasks. Code available at: https://github.com/anishacharya/BGMD
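
For illustration, the block-wise robust aggregation described in the abstract might look like the following NumPy sketch. This is a minimal, hypothetical rendering, not the repository's implementation: the names `weiszfeld_gm` and `bgmd_aggregate`, the top-k block-selection rule, and the exact memory (error-feedback) update are all assumptions made for exposition; see the linked code for the actual method.

```python
# Minimal sketch (assumed, not the authors' code) of block-coordinate
# geometric-median aggregation with a memory mechanism.
import numpy as np

def weiszfeld_gm(points, n_iter=50, eps=1e-8):
    """Approximate the geometric median of the rows of `points`
    via Weiszfeld iterations (iteratively reweighted averaging)."""
    z = points.mean(axis=0)
    for _ in range(n_iter):
        d = np.linalg.norm(points - z, axis=1)
        w = 1.0 / np.maximum(d, eps)            # inverse-distance weights
        z = (w[:, None] * points).sum(axis=0) / w.sum()
    return z

def bgmd_aggregate(grads, memory, block_size):
    """Robustly aggregate per-sample gradients `grads` (n x d) on one block.

    GM is applied only to the `block_size` coordinates whose mean gradient
    has the largest magnitude (an assumed selection rule); the remaining
    coordinates fall back to the plain mean, and the discarded residual is
    carried forward through `memory` in the style of error feedback.
    """
    g = grads + memory                           # fold in the memory term
    block = np.argsort(-np.abs(g.mean(axis=0)))[:block_size]
    agg = g.mean(axis=0)                         # cheap estimate off-block
    agg[block] = weiszfeld_gm(g[:, block])       # robust estimate on-block
    memory = g.mean(axis=0) - agg                # remember what was dropped
    return agg, memory

# Usage sketch: start with zero memory, then aggregate each minibatch.
# m = np.zeros(d)
# agg, m = bgmd_aggregate(per_sample_grads, m, block_size=d // 10)
```

The point of the sketch is the cost structure: Weiszfeld iterations run only over the chosen block, so the per-step overhead of GM scales with `block_size` rather than the full dimension `d`.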
- Topics:
- Coordinate Descent
- Deep Learning
- High-Dimensional Statistics
- Robust Learning
- Stochastic Gradient Methods