
Periodic power spectrum with applications in detection of latent periodicities in DNA sequences

Published by: Dr. Changchuan Yin
Publication date: 2015
Research field: Informatics Engineering
Language: English





Latent periodic elements in genomes play important roles in genomic functions. Many complex periodic elements in genomes are difficult to detect with commonly used digital signal processing (DSP) methods. We present a novel method to compute the periodic power spectrum of a DNA sequence based on the nucleotide distributions at periodic positions of the sequence. The method directly calculates the full periodic spectrum of a DNA sequence rather than the frequency spectrum obtained by Fourier transform. The magnitude of the periodic power spectrum reflects the strength of the periodicity signals; thus, the algorithm can capture all the latent periodicities in DNA sequences. We apply this method to the detection of latent periodicities in different genome elements, including exons and microsatellite DNA sequences. The results show that the method minimizes the impact of spectral leakage, captures a much broader range of latent periodicities in genomes, and outperforms the conventional Fourier transform.
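The abstract does not give the formula, so the following is only a minimal sketch of one plausible reading of "nucleotide distributions at periodic positions". It assumes that the power at period p is obtained by counting each nucleotide at positions congruent to k modulo p and taking the squared magnitude of the first Fourier coefficient of that length-p count vector, summed over the four bases. The function names `periodic_power` and `periodic_power_spectrum` are hypothetical and are not taken from the paper.

```python
import cmath


def periodic_power(seq, p):
    """Power at period p under the assumed formulation (not the paper's code)."""
    power = 0.0
    for base in "ACGT":
        # counts[k] = occurrences of `base` at positions j with j % p == k
        counts = [0] * p
        for j, ch in enumerate(seq):
            if ch == base:
                counts[j % p] += 1
        # First Fourier coefficient of the length-p count vector
        coeff = sum(c * cmath.exp(-2j * cmath.pi * k / p)
                    for k, c in enumerate(counts))
        power += abs(coeff) ** 2
    return power


def periodic_power_spectrum(seq, max_period):
    """Evaluate the assumed power for every integer period from 2 to max_period."""
    return {p: periodic_power(seq, p) for p in range(2, max_period + 1)}


if __name__ == "__main__":
    spectrum = periodic_power_spectrum("ACG" * 20, 6)
    for period, value in spectrum.items():
        print(period, round(value, 6))
```

Because every integer period is evaluated directly, a period does not have to coincide with a Fourier frequency bin of the full sequence, which is one way to read the abstract's remark about spectral leakage.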



Read also

146 - Changchuan Yin 2016
Repetitive elements are important in genomic structures, functions and regulations, yet effective methods in precisely identifying repetitive elements in DNA sequences are not fully accessible, and the relationship between repetitive elements and periodicities of genomes is not clearly understood. We present an ab initio method to quantitatively detect repetitive elements and infer the consensus repeat pattern in repetitive elements. The method uses the measure of the distribution uniformity of nucleotides at periodic positions in DNA sequences or genomes. It can identify periodicities, consensus repeat patterns, copy numbers and perfect levels of repetitive elements. The results of using the method on different DNA sequences and genomes demonstrate efficacy and accuracy in identifying repeat patterns and periodicities. The complexity of the method is linear with respect to the lengths of the analyzed sequences.
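As an illustration of the idea of measuring nucleotide uniformity at periodic positions, the sketch below tabulates, for a candidate period p, which nucleotides fall in each position class modulo p. The consensus is taken as the most frequent base per class, and the non-uniformity score is a mean squared deviation from the uniform frequency 1/4. Both choices are assumptions made for this example, and the abstract does not state the authors' actual measure. All function names are hypothetical.

```python
from collections import Counter


def column_counts(seq, p):
    """Tally nucleotides per position class j % p in a single pass over seq."""
    if p < 1 or len(seq) < p:
        raise ValueError("need 1 <= p <= len(seq) so that no class is empty")
    cols = [Counter() for _ in range(p)]
    for j, ch in enumerate(seq):
        cols[j % p][ch] += 1
    return cols


def consensus_pattern(seq, p):
    """Most frequent nucleotide in each position class (assumed consensus rule)."""
    return "".join(col.most_common(1)[0][0] for col in column_counts(seq, p))


def nonuniformity(seq, p):
    """Mean squared deviation of per-class base frequencies from 1/4.

    This is an illustrative score, not the measure used in the paper.
    """
    total = 0.0
    for col in column_counts(seq, p):
        n = sum(col.values())
        total += sum((col[b] / n - 0.25) ** 2 for b in "ACGT")
    return total / p


if __name__ == "__main__":
    seq = "ACGT" * 5 + "ACTT" + "ACGT" * 4  # one substitution in a period-4 repeat
    print(consensus_pattern(seq, 4))
    print(nonuniformity(seq, 4), nonuniformity(seq, 3))
```

The tally visits each position once, so the cost for a fixed period grows linearly with sequence length.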
In modeling DNA chains, the number of alternations between Adenine-Thymine (AT) and Guanine-Cytosine (GC) base pairs can be considered as a measure of the heterogeneity of the chain, which in turn could affect its dynamics. A probability distribution function of the number of these alternations is derived for circular or periodic DNA. Since there are several symmetries to account for in the periodic chain, necklace counting methods are used. In particular, Pólya's Enumeration Theorem is extended for the case of a group action that preserves partitioned necklaces. This, along with the treatment of generating functions as formal power series, allows for the direct calculation of the number of possible necklaces with a given number of AT base pairs, GC base pairs and alternations. The theoretically obtained probability distribution functions of the number of alternations are accurately reproduced by Monte Carlo simulations and fitted by Gaussians. The effect of the number of base pairs on the characteristics of these distributions is also discussed, as well as the effect of the ratios of the numbers of AT and GC base pairs.
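To make the quantity being counted concrete, here is a small Monte Carlo sketch that estimates the distribution of alternations in a circular chain. It is not the authors' simulation. It samples labelled arrangements uniformly by shuffling, which need not match a count over symmetry-reduced necklaces. The symbols "W" (AT pair) and "S" (GC pair) and the function names are choices made for this example.

```python
import random
from collections import Counter


def alternations(chain):
    """Count neighbouring positions of different type in a circular chain."""
    n = len(chain)
    return sum(chain[i] != chain[(i + 1) % n] for i in range(n))


def sample_distribution(n_at, n_gc, trials=10000, seed=0):
    """Estimate P(number of alternations) by shuffling a fixed composition."""
    rng = random.Random(seed)  # seeded for reproducibility
    chain = ["W"] * n_at + ["S"] * n_gc
    counts = Counter()
    for _ in range(trials):
        rng.shuffle(chain)
        counts[alternations(chain)] += 1
    return {k: v / trials for k, v in sorted(counts.items())}


if __name__ == "__main__":
    for k, prob in sample_distribution(10, 10).items():
        print(k, prob)
```

On a closed chain every switch from one type to the other must eventually be undone, so the number of alternations is always even.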
301 - Niharika Pandala 2020
Recent events leading to the worldwide pandemic of COVID-19 have demonstrated the effective use of genomic sequencing technologies to establish the genetic sequence of this virus. In contrast, the COVID-19 pandemic has demonstrated the absence of computational approaches to understand the molecular basis of this infection rapidly. Here we present an integrated approach to the study of the nsp1 protein in SARS-CoV-1, which plays an essential role in maintaining the expression of viral proteins and further disabling the host protein expression, also known as the host shutoff mechanism. We present three independent methods of evaluating two potential binding sites speculated to participate in host shutoff by nsp1. We have combined results from computed models of nsp1, with deep mining of all existing protein structures (using PDBMine), and binding site recognition (using msTALI) to examine the two sites consisting of residues 55-59 and 73-80. Based on our preliminary results, we conclude that the residues 73-80 appear as the regions that facilitate the critical initial steps in the function of nsp1. Given the 90% sequence identity between nsp1 from SARS-CoV-1 and SARS-CoV-2, we conjecture the same critical initiation step in the function of COVID-19 nsp1.
96 - H. Wang , R. Marsh , J.P. Lewis 2005
The question of whether DNA conducts electric charges is intriguing to physicists and biologists alike. The suggestion that electron transfer/transport in DNA might be biologically important has triggered a series of experimental and theoretical investigations. Here, we review recent theoretical progress by concentrating on quantum-chemical, molecular dynamics-based approaches to short DNA strands and physics-motivated tight-binding transport studies of long or even complete DNA sequences. In both cases, we observe small, but significant differences between specific DNA sequences such as periodic repetitions and aperiodic sequences of AT bases, lambda-DNA, centromeric DNA, promoter sequences as well as random-ATGC DNA.
Building a structure using self-assembly of DNA molecules by origami folding requires finding a route for the scaffolding strand through the desired structure. When the target structure is a 1-complex (or the geometric realization of a graph), an optimal route corresponds to an Eulerian circuit through the graph with minimum turning cost. By showing that it leads to a solution to the 3-SAT problem, we prove that the general problem of finding an optimal route for a scaffolding strand for such structures is NP-hard. We then show that the problem may readily be transformed into a Traveling Salesman Problem (TSP), so that machinery that has been developed for the TSP may be applied to find optimal routes for the scaffolding strand in a DNA origami self-assembly process. We give results for a few special cases, showing for example that the problem remains intractable for graphs with maximum degree 8, but is polynomial time for 4-regular plane graphs if the circuit is restricted to following faces. We conclude with some implications of these results for related problems, such as biomolecular computing and mill routing problems.