A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery

Ian En-Hsu Yen, Xin Lin, Jiong Zhang, Pradeep Ravikumar, Inderjit Dhillon

Abstract:   Multiple Sequence Alignment and Motif Discovery, known as NP-hard problems, are two fundamental tasks in Bioinformatics. Existing approaches to these two problems are based on either local search methods such as Expectation Maximization (EM), Gibbs Sampling or greedy heuristic methods. In this work, we develop a convex relaxation approach to both problems based on the recent concept of atomic norm and develop a new algorithm, termed Greedy Direction Method of Multiplier, for solving the convex relaxation with two convex atomic constraints. Experiments show that our convex relaxation approach produces solutions of higher quality than standard tools widely-used in Bioinformatics community on Multiple Sequence Alignment and Motif Discovery problems.

Download: pdf, slides, poster

Citation

  • A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery (pdf, slides, poster, software)
    I. Yen, X. Lin, J. Zhang, P. Ravikumar, I. Dhillon.
    In International Conference on Machine Learning (ICML), June 2016. (Oral)

    Bibtex: