Heuristic Algorithms for Best Match Graph Editing

Published by David Schaller

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: David Schaller, Manuela Geiß, Marc Hellmuth

Combinatorics, Discrete Mathematics, Populations and Evolution

Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics and can be approximated with the help of similarity measures between gene sequences, albeit not without errors. The corresponding graph editing problem can be used as a means of error correction. Since the arc set modification problems for BMGs are NP-complete, efficient heuristics are needed if BMGs are to be used for the practical analysis of biological sequence data. Since BMGs have a characterization in terms of consistency of a certain set of rooted triples, we consider heuristics that operate on triple sets. As an alternative, we show that there is a close connection to a set partitioning problem that leads to a class of top-down recursive algorithms that are similar to Aho's supertree algorithm and give rise to BMG editing algorithms that are consistent in the sense that they leave BMGs invariant. Extensive benchmarking shows that community detection algorithms for the partitioning steps perform best for BMG editing.
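The abstract refers to Aho's supertree algorithm and to consistency of rooted triples. As a rough illustration only, the following is a minimal sketch of the classic top-down recursion (often called BUILD), not of the paper's editing heuristics. The function names `build` and `canon`, the nested-tuple tree encoding, and the convention that a triple `(a, b, c)` means ab|c are assumptions made for this example.

```python
def build(triples, leaves):
    """Return a nested-tuple tree displaying all triples, or None if the
    triple set is inconsistent on `leaves`.

    Assumed encoding: a triple (a, b, c) stands for ab|c, i.e. a and b
    share a more recent common ancestor than either shares with c.
    Leaves are assumed to be strings; inner vertices are tuples.
    """
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))

    # Keep only triples whose three leaves are all still present.
    active = [t for t in triples if leaves.issuperset(t)]

    # Auxiliary (Aho) graph: join a and b for every active triple ab|c.
    # Connected components are tracked with a small union-find.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _ in active:
        parent[find(a)] = find(b)

    components = {}
    for x in leaves:
        components.setdefault(find(x), set()).add(x)

    # A connected auxiliary graph on more than one leaf means no tree
    # can display all triples.
    if len(components) == 1:
        return None

    children = []
    for comp in components.values():
        subtree = build(active, comp)
        if subtree is None:
            return None
        children.append(subtree)
    return tuple(children)


def canon(tree):
    """Order-independent form of a tree: nested frozensets of leaves."""
    if isinstance(tree, tuple):
        return frozenset(canon(child) for child in tree)
    return tree
```

For instance, `build([("a", "b", "c")], {"a", "b", "c"})` yields a tree that groups `a` and `b` below the root next to `c`, whereas the contradictory pair ab|c and ac|b makes `build` return `None`. In the partition-based heuristics described in the abstract, the step that splits the leaf set would be replaced by some partitioning method; that part is not shown here.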

David Schaller, Peter F. Stadler, Marc Hellmuth (2020)

Best match graphs (BMGs) are vertex-colored directed graphs that were introduced to model the relationships of genes (vertices) from different species (colors) given an underlying evolutionary tree that is assumed to be unknown. In real-life applications, BMGs are estimated from sequence similarity data. Measurement noise and approximation errors usually result in empirically determined graphs that in general violate characteristic properties of BMGs. The arc modification problems for BMGs aim at correcting such violations and thus provide a means to improve the initial estimates of best match data. We show here that the arc deletion, arc completion and arc editing problems for BMGs are NP-complete and that they can be formulated and solved as integer linear programs. To this end, we provide a novel characterization of BMGs in terms of triples (binary trees on three leaves) and a characterization of BMGs with two colors in terms of forbidden subgraphs.
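To make the object concrete, here is a small sketch that derives the best match graph of a known leaf-colored tree. It assumes the usual definition, which the abstract does not spell out: y is a best match of x if, among all genes with the color of y, the last common ancestor of x and y is as close to the leaves as possible. The function name `best_match_graph` and the child-to-parent dictionary encoding of the tree are assumptions for this example.

```python
def best_match_graph(parent, color):
    """Arc set of the best match graph of a leaf-colored rooted tree.

    parent: dict mapping every non-root vertex to its parent (assumed encoding).
    color:  dict mapping every leaf (gene) to its color (species).
    """
    leaves = list(color)

    def ancestors(v):
        # Path from v up to the root, v itself included.
        path = [v]
        while v in parent:
            v = parent[v]
            path.append(v)
        return path

    def lca_depth(x, y):
        # Depth (distance from the root) of the last common ancestor.
        anc_x = ancestors(x)
        anc_y = set(ancestors(y))
        for i, v in enumerate(anc_x):
            if v in anc_y:
                return len(anc_x) - 1 - i

    arcs = set()
    for x in leaves:
        for s in set(color.values()) - {color[x]}:
            candidates = [y for y in leaves if color[y] == s]
            deepest = max(lca_depth(x, y) for y in candidates)
            # Every candidate attaining the deepest lca is a best match of x.
            arcs |= {(x, y) for y in candidates if lca_depth(x, y) == deepest}
    return arcs
```

With the tree ((a1, b1), b2), where `a1` has color A and `b1`, `b2` have color B, the arcs are (a1, b1), (b1, a1) and (b2, a1): `b2` points to `a1` because `a1` is the only gene of color A, but `a1` does not point back. Graphs estimated from noisy similarity data need not arise from any tree in this way, which is the situation the arc modification problems address.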

Computational Complexity, Discrete Mathematics, Populations and Evolution

Arc-Completion of 2-Colored Best Match Graphs to Binary-Explainable Best Match Graphs

David Schaller, Manuela Geiß, Marc Hellmuth (2021)

Best match graphs (BMGs) are vertex-colored digraphs that naturally arise in mathematical phylogenetics to formalize the notion of evolutionary closest genes w.r.t. an a priori unknown phylogenetic tree. BMGs are explained by unique least resolved trees. We prove that the property of a rooted, leaf-colored tree to be least resolved for some BMG is preserved by the contraction of inner edges. For the special case of two-colored BMGs, this leads to a characterization of the least resolved trees (LRTs) of binary-explainable trees and a simple, polynomial-time algorithm for the minimum cardinality completion of the arc set of a BMG to reach a BMG that can be explained by a binary tree.

Data Structures and Algorithms, Discrete Mathematics, Combinatorics

Complete Characterization of Incorrect Orthology Assignments in Best Match Graphs

David Schaller, Manuela Geiß, Peter F. Stadler (2020)

Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments therefore are always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignments, that is, edges in the best match graphs (BMGs) that cannot correspond to orthologs for any gene tree that explains the BMG. Moreover, we provide a polynomial-time algorithm to identify all u-fp orthology assignments in a BMG. Simulations show that at least 75% of all incorrect orthology assignments can be detected in this manner. All results rely only on the structure of the BMGs and not on any a priori knowledge about underlying gene or species trees.
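The reciprocal best match graph mentioned above is the symmetric part of a best match graph. A minimal sketch, assuming arcs are given as a set of ordered pairs (the function name `reciprocal_best_matches` is chosen for this example):

```python
def reciprocal_best_matches(arcs):
    """Unordered pairs {x, y} for which both (x, y) and (y, x) are arcs."""
    arcs = set(arcs)
    return {frozenset((x, y)) for (x, y) in arcs if (y, x) in arcs}
```

For the arc set {(a1, b1), (b1, a1), (b2, a1)}, only the pair {a1, b1} is reciprocal. Detecting the u-fp assignments described in the abstract requires further structural analysis that this sketch does not attempt.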

Populations and Evolution, Discrete Mathematics, Data Structures and Algorithms

Computing the blocks of a quasi-median graph

Sven Herrmann, Vincent Moulton (2012)

Quasi-median graphs are a tool commonly used by evolutionary biologists to visualise the evolution of molecular sequences. As with any graph, a quasi-median graph can contain cut vertices, that is, vertices whose removal disconnects the graph. These vertices induce a decomposition of the graph into blocks, that is, maximal subgraphs which do not contain any cut vertices. Here we show that the special structure of quasi-median graphs can be used to compute their blocks without having to compute the whole graph. In particular we present an algorithm that, for a collection of $n$ aligned sequences of length $m$, can compute the blocks of the associated quasi-median graph together with the information required to correctly connect these blocks together in run time $\mathcal{O}(n^2m^2)$, independent of the size of the sequence alphabet. Our primary motivation for presenting this algorithm is the fact that the quasi-median graph associated to a sequence alignment must contain all most parsimonious trees for the alignment, and therefore precomputing the blocks of the graph has the potential to help speed up any method for computing such trees.

Combinatorics, Discrete Mathematics, Quantitative Methods

Combining Orthology and Xenology Data in a Common Phylogenetic Tree

Marc Hellmuth, Mira Michel, Nikolai N. Nøjgaard (2021)

A rooted tree $T$ with vertex labels $t(v)$ and set-valued edge labels $\lambda(e)$ defines maps $\delta$ and $\varepsilon$ on the pairs of leaves of $T$ by setting $\delta(x,y)=q$ if the last common ancestor $\text{lca}(x,y)$ of $x$ and $y$ is labeled $q$, and $m\in\varepsilon(x,y)$ if $m\in\lambda(e)$ for at least one edge $e$ along the path from $\text{lca}(x,y)$ to $y$. We show that a pair of maps $(\delta,\varepsilon)$ derives from a tree $(T,t,\lambda)$ if and only if there exists a common refinement of the (unique) least-resolved vertex labeled tree $(T_{\delta},t_{\delta})$ that explains $\delta$ and the (unique) least resolved edge labeled tree $(T_{\varepsilon},\lambda_{\varepsilon})$ that explains $\varepsilon$ (provided both trees exist). This result remains true if certain combinations of labels at incident vertices and edges are forbidden.

Combinatorics, Discrete Mathematics, Populations and Evolution
