An Exact Mathematical Programming Approach to Multiple RNA Sequence-Structure Alignment

Markus Bauer, Gunnar W. Klau, Knut Reinert

Abstract


One of the main tasks in computational biology is the computation of alignments of genomic sequences to reveal their commonalities. In case of DNA or protein sequences, sequence information alone is usually sufficient to compute reliable alignments. RNA molecules, however, build spatial conformations, which can be represented by graph-like secondary structures. Often, secondary structures are more conserved than the actual sequence. Hence, computing reliable alignments of RNA molecules should take this additional information into account.

We present a novel framework for the computation of exact multiple sequence-structure alignments. We give a graph-theoretic representation of the sequence-structure alignment problem and phrase it as an integer linear program. We identify a class of constraints that make the problem easier to solve and relax the original integer linear program in a Lagrangian manner. We run experiments on data from the RFAM database and compare the performance of our model to heuristically inferred multiple structural alignments. Finally, experiments on a recently published benchmark show that in the pairwise case our algorithm achieves results comparable to those obtained by more costly dynamic programming algorithms while needing less computation time.

Full Text:

PDF


Algorithmic Operations Research. ISSN: 1718-3235