Geometric Aspects of Biological Sequence Comparison


Abstract in English

We propose a general framework for converting global and local similarities between biological sequences to quasi-metrics. In contrast to previous works, our formulation allows asymmetric distances, originating from uneven weighting of strings, that may induce non-trivial partial orders on sets of biosequences. Furthermore, the $\ell^p$-type distances considered are more general than traditional generalized string edit distances corresponding to the $\ell^1$ case, and enable conversion of sequence similarities to distances for a much wider class of scoring schemes. Our constructions require much less restrictive gap penalties than the ones regularly used. Numerous examples are provided to illustrate the concepts introduced and their potential applications.
