Abstract: Multiple Sequence Alignment and Motif Discovery, known as NP-hard problems, are two fundamental tasks in Bioinformatics. Existing approaches to these two problems are based on either local search methods such as Expectation Maximization (EM), Gibbs Sampling or greedy heuristic methods. In this work, we develop a convex relaxation approach to both problems based on the recent concept of atomic norm and develop a new algorithm, termed Greedy Direction Method of Multiplier, for solving the convex relaxation with two convex atomic constraints. Experiments show that our convex relaxation approach produces solutions of higher quality than standard tools widely-used in Bioinformatics community on Multiple Sequence Alignment and Motif Discovery problems.
Download: pdf, slides, poster
Citation
- A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery (pdf, slides, poster, software)
I. Yen, X. Lin, J. Zhang, P. Ravikumar, I. Dhillon.
In International Conference on Machine Learning (ICML), June 2016. (Oral)
Bibtex: