
Indirect Identification of Horizontal Gene Transfer

Published by David Schaller
Publication date: 2020
Language: English





Several implicit methods to infer Horizontal Gene Transfer (HGT) focus on pairs of genes that have diverged only after the divergence of the two species in which the genes reside. This situation defines the edge set of a graph, the later-divergence-time (LDT) graph, whose vertices correspond to genes colored by their species. We investigate these graphs in the setting of relaxed scenarios, i.e., evolutionary scenarios that encompass all commonly used variants of duplication-transfer-loss scenarios in the literature. We characterize LDT graphs as a subclass of properly vertex-colored cographs, and provide a polynomial-time recognition algorithm as well as an algorithm to construct a relaxed scenario that explains a given LDT graph. An edge in an LDT graph implies that the two corresponding genes are separated by at least one HGT event. The converse is not true, however. We show that the complete xenology relation is described by an rs-Fitch graph, i.e., a complete multipartite graph satisfying constraints on the vertex coloring. This class of vertex-colored graphs is also recognizable in polynomial time. We finally address the question of how much information about all HGT events is contained in LDT graphs with the help of simulations of evolutionary scenarios with a wide range of duplication, loss, and HGT events. In particular, we show that a simple greedy graph editing scheme can be used to efficiently detect HGT events that are implicitly contained in LDT graphs.
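The abstract names three graph properties: proper vertex coloring, being a cograph, and being complete multipartite. The following is a minimal Python sketch of brute-force checks for these properties only. It is not the authors' recognition algorithm: the checks are necessary conditions, the additional constraints that single out LDT graphs and rs-Fitch graphs are not stated in the abstract and are not reproduced here, and the induced-P4 test is a simple O(n^4) stand-in chosen for readability. All function names and the toy data (genes `a1`, `b1`, `c1`, `d1` with hypothetical species labels) are assumptions made for illustration.

```python
from itertools import combinations, permutations


def adjacency(vertices, edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_properly_colored(edges, color):
    """True if no edge joins two genes of the same species (color)."""
    return all(color[u] != color[v] for u, v in edges)


def is_cograph(vertices, edges):
    """True if the graph has no induced path on four vertices (P4)."""
    adj = adjacency(vertices, edges)
    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            # Induced path a-b-c-d: exactly these three edges are present.
            if (b in adj[a] and c in adj[b] and d in adj[c]
                    and c not in adj[a] and d not in adj[a]
                    and d not in adj[b]):
                return False
    return True


def is_complete_multipartite(vertices, edges):
    """True if no three vertices induce exactly one edge.

    Equivalent to non-adjacency being an equivalence relation.
    """
    adj = adjacency(vertices, edges)
    for x, y, z in combinations(vertices, 3):
        n_edges = (y in adj[x]) + (z in adj[x]) + (z in adj[y])
        if n_edges == 1:
            return False
    return True


# Hypothetical toy data: four genes, each from a different species.
genes = ["a1", "b1", "c1", "d1"]
species = {"a1": "A", "b1": "B", "c1": "C", "d1": "D"}
star = [("a1", "b1"), ("a1", "c1"), ("a1", "d1")]
path = [("a1", "b1"), ("b1", "c1"), ("c1", "d1")]

print(is_properly_colored(star, species), is_cograph(genes, star))
print(is_cograph(genes, path))
print(is_complete_multipartite(genes, star))
```

In this toy example the star passes all three checks, while the path fails the cograph check because it is itself an induced P4. Passing these checks does not by itself show that a graph is an LDT graph or an rs-Fitch graph.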




Read also

This paper develops a mathematical model describing the influence that conjugation-mediated Horizontal Gene Transfer (HGT) has on the mutation-selection balance in an asexually reproducing population of unicellular, prokaryotic organisms. It is assumed that mutation-selection balance is reached in the presence of a fixed background concentration of antibiotic, to which the population must become resistant in order to survive. We analyze the behavior of the model in the limit of low and high antibiotic-induced first-order death rate constants, and find that the highest mean fitness is obtained at low rates of bacterial conjugation. As the rate of conjugation crosses a threshold, the mean fitness decreases to a minimum, and then rises asymptotically to a limiting value as the rate of conjugation becomes infinitely large. However, this limiting value is smaller than the mean fitness obtained in the limit of low conjugation rate. This dependence of the mean fitness on the conjugation rate is fairly small for the parameter ranges we have considered, and disappears as the first-order death rate constant due to the presence of antibiotic approaches zero. For large values of the antibiotic death rate constant, we have obtained an analytical solution for the behavior of the mean fitness that agrees well with the results of simulations. The results of this paper suggest that conjugation-mediated HGT has a slightly deleterious effect on the mean fitness of a population at mutation-selection balance. Therefore, we argue that HGT confers a selective advantage by allowing for faster adaptation to a new or changing environment. The results of this paper are consistent with the observation that HGT can be promoted by environmental stresses on a population.
Computational inference of dated evolutionary histories relies upon various hypotheses about RNA, DNA, and protein sequence mutation rates. Using mutation rates to infer these dated histories is referred to as the molecular clock assumption. Coalescent theory is a popular class of evolutionary models that implements the molecular clock hypothesis to facilitate computational inference of dated phylogenies. Cancer and virus evolution are two areas where these methods are particularly important. Methodologically, phylogenetic inference methods require a tree space over which the inference is performed, and the geometry of this space plays an important role in statistical and computational aspects of tree inference algorithms. It has recently been shown that molecular clock, and hence coalescent, trees possess a unique geometry, different from that of classical phylogenetic tree spaces which do not model mutation rates. Here we introduce and study a space of discrete coalescent trees, that is, we assume that time is discrete, which is inevitable in many computational formalisations. We establish several geometrical properties of the space and show how these properties impact various algorithms used in phylogenetic analyses. Our tree space is a discretisation of a known time tree space, called t-space, and hence our results can be used to approximate solutions to various open problems in t-space. Our tree space is also a generalisation of another known tree space, called the ranked nearest neighbour interchange space, hence our advances in this paper imply new and generalise existing results about ranked trees.
Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments therefore are always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignments, that is, edges in the best match graphs (BMGs) that cannot correspond to orthologs for any gene tree that explains the BMG. Moreover, we provide a polynomial-time algorithm to identify all u-fp orthology assignments in a BMG. Simulations show that at least 75% of all incorrect orthology assignments can be detected in this manner. All results rely only on the structure of the BMGs and not on any a priori knowledge about underlying gene or species trees.
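To make the notion of a reciprocal best match concrete, here is a minimal Python sketch. It assumes that best matches are approximated by the smallest pairwise dissimilarity between genes of two species, which is a common practical stand-in and not necessarily the definition used in the paper. The function names, the gene labels, and the dissimilarity values are hypothetical, and the sketch does not implement the u-fp detection algorithm.

```python
from itertools import combinations


def best_matches(gene, target_species, species, dist):
    """Genes of target_species with minimal dissimilarity to gene."""
    candidates = [g for g in species
                  if species[g] == target_species and g != gene]
    if not candidates:
        return set()
    best = min(dist[frozenset((gene, g))] for g in candidates)
    return {g for g in candidates if dist[frozenset((gene, g))] == best}


def reciprocal_best_matches(species, dist):
    """Pairs of genes from different species that are best matches of each other."""
    pairs = set()
    for x, y in combinations(sorted(species), 2):
        if species[x] == species[y]:
            continue  # best matches are only taken between different species
        if (y in best_matches(x, species[y], species, dist)
                and x in best_matches(y, species[x], species, dist)):
            pairs.add((x, y))
    return pairs


# Hypothetical toy data: two genes in species A, one gene in species B.
species = {"a1": "A", "a2": "A", "b1": "B"}
dist = {
    frozenset(("a1", "b1")): 0.1,  # made-up dissimilarities
    frozenset(("a2", "b1")): 0.3,
}

print(reciprocal_best_matches(species, dist))
```

Here `b1` is the best match of both `a1` and `a2` in species B, but only `a1` is the best match of `b1` in species A, so `(a1, b1)` is the only reciprocal best match.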
BACKGROUND: The uncoupling protein (UCP) genes belong to the superfamily of electron transport carriers of the mitochondrial inner membrane. Members of the uncoupling protein family are involved in thermogenesis, and determining the functional evolution of UCP genes is important to understand the evolution of thermo-regulation in vertebrates. RESULTS: Sequence similarity searches of genome and scaffold data identified homologues of UCP in eutherians, teleosts and the first squamate uncoupling proteins. Phylogenetic analysis was used to characterize the family evolutionary history by identifying two duplications early in vertebrate evolution and two losses in the avian lineage (excluding duplications within a species, excluding the losses due to incompletely sequenced taxa and excluding the losses and duplications inferred through mismatch of species and gene trees). Estimates of synonymous and nonsynonymous substitution rates (dN/dS) and more complex branch and site models suggest that the duplication events were not associated with positive Darwinian selection and that the UCP is constrained by strong purifying selection except for a single site which has undergone positive Darwinian selection, demonstrating that the UCP gene family must be highly conserved. CONCLUSION: We present a phylogeny describing the evolutionary history of the UCP gene family and show that the genes have evolved through duplications followed by purifying selection except for a single site in the mitochondrial matrix between the 5th and 6th alpha-helices which has undergone positive selection.
We propose a general mechanism for evolution to explain the diversity of gene and language. To quantify their common features and reveal the hidden structures, several statistical properties and patterns are examined based on a new method called the rank-rank analysis. We find that the classical correspondence, domain plays the role of word in gene language, is not rigorous, and propose to replace domain by protein. In addition, we devise a new evolution unit, syllgram, to include the characteristics of spoken and written language. Based on the correspondence between (protein, domain) and (word, syllgram), we discover that both gene and language shared a common scaling structure and scale-free network. Like the Rosetta stone, this work may help decipher the secret behind non-coding DNA and unknown languages.