Complexity of modification problems for best match graphs


Abstract in English

Best match graphs (BMGs) are vertex-colored directed graphs that were introduced to model the relationships of genes (vertices) from different species (colors) given an underlying evolutionary tree that is assumed to be unknown. In real-life applications, BMGs are estimated from sequence similarity data. Because of measurement noise and approximation errors, the empirically determined graphs generally violate characteristic properties of BMGs. The arc modification problems for BMGs aim at correcting such violations and thus provide a means to improve the initial estimates of best match data. We show here that the arc deletion, arc completion and arc editing problems for BMGs are NP-complete and that they can be formulated and solved as integer linear programs. To this end, we provide a novel characterization of BMGs in terms of triples (binary trees on three leaves) and a characterization of BMGs with two colors in terms of forbidden subgraphs.
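
As an illustration of the objects involved, the following minimal Python sketch derives a best match graph from a known leaf-colored tree. It assumes the usual definition from the BMG literature, which the abstract itself does not spell out: a gene y is a best match of a gene x if y has a different color than x and the last common ancestor of x and y is at least as close to x as the last common ancestor of x and any other gene with the color of y. The tree encoding (a child-to-parent dictionary), the function names and the toy data are hypothetical and chosen only for this example.

```python
def ancestors(parent, v):
    """Return the path from v up to the root, with v first."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def lca_depth(parent, x, y):
    """Depth (distance from the root) of the last common ancestor of x and y."""
    anc_x = ancestors(parent, x)
    anc_y = set(ancestors(parent, y))
    for i, v in enumerate(anc_x):
        if v in anc_y:
            # anc_x[i] lies i steps above x, so its depth is depth(x) - i.
            return len(anc_x) - 1 - i
    raise ValueError("x and y are not in the same tree")


def best_match_graph(parent, color):
    """Arc set of the best match graph of a leaf-colored tree (assumed definition)."""
    arcs = set()
    leaves = list(color)
    for x in leaves:
        for s in set(color.values()) - {color[x]}:
            candidates = [y for y in leaves if color[y] == s]
            depths = {y: lca_depth(parent, x, y) for y in candidates}
            deepest = max(depths.values())
            # Every candidate whose last common ancestor with x is deepest is a best match.
            arcs.update((x, y) for y in candidates if depths[y] == deepest)
    return arcs


# Toy tree ((a1, b1), b2): root r with children u and b2; u has children a1 and b1.
parent = {"r": None, "u": "r", "a1": "u", "b1": "u", "b2": "r"}
color = {"a1": "A", "b1": "B", "b2": "B"}  # leaves only
arcs = best_match_graph(parent, color)
print(sorted(arcs))
```

In this toy tree, b1 is the only best match of a1, while a1 is the best match of both b1 and b2 because it is the only gene of color A. The arc modification problems described above concern the opposite situation, where only an estimated graph is given and no tree is known that explains it.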
