Probabilistic Approaches to Alignment with Tandem Repeats

Added by Aaron Darling

Publication date 2013

Fields: Biology

Language: English

Authors: Michal Nánási, Tomáš Vinař

Quantitative Methods Genomics

We propose a simple tractable pair hidden Markov model for pairwise sequence alignment that accounts for the presence of short tandem repeats. Using the framework of gain functions, we design several optimization criteria for decoding this model and describe the resulting decoding algorithms, ranging from the traditional Viterbi and posterior decoding to block-based decoding algorithms specialized for our model. We compare the accuracy of individual decoding algorithms on simulated data and find our approach superior to the classical three-state pair HMM in simulations.
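As a point of reference for the baseline named in the abstract, the following is a minimal sketch of Viterbi decoding for a classical three-state pair HMM (match state M, and gap states X and Y). It is not the authors' tandem-repeat model or their implementation. The function name `viterbi_pair_hmm` and the parameter values (`delta`, `epsilon`, `match_p`, a four-letter alphabet, no explicit end state) are assumptions chosen only for illustration.

```python
import math

NEG_INF = float("-inf")


def viterbi_pair_hmm(x, y, delta=0.1, epsilon=0.3, match_p=0.9):
    """Most probable alignment of x and y under a three-state pair HMM.

    M emits an aligned pair, X emits a symbol of x against a gap,
    Y emits a symbol of y against a gap. All parameters are hypothetical.
    """
    alphabet = 4  # assumed DNA alphabet size
    log_same = math.log(match_p / alphabet)  # identical pair
    log_diff = math.log((1 - match_p) / (alphabet * (alphabet - 1)))  # mismatch
    log_gap = math.log(1 / alphabet)  # single symbol against a gap

    # Allowed transitions (no direct X <-> Y moves, as in the classical model).
    trans = {
        ("M", "M"): math.log(1 - 2 * delta),
        ("M", "X"): math.log(delta),
        ("M", "Y"): math.log(delta),
        ("X", "X"): math.log(epsilon),
        ("X", "M"): math.log(1 - epsilon),
        ("Y", "Y"): math.log(epsilon),
        ("Y", "M"): math.log(1 - epsilon),
    }
    states = ("M", "X", "Y")
    step = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}  # symbols consumed from x, y

    n, m = len(x), len(y)
    V = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in states}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in states}
    V["M"][0][0] = 0.0  # assumption: decoding starts as if from state M

    for i in range(n + 1):
        for j in range(m + 1):
            for s in states:
                di, dj = step[s]
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                if s == "M":
                    emit = log_same if x[i - 1] == y[j - 1] else log_diff
                else:
                    emit = log_gap
                best_prev, best_score = None, NEG_INF
                for p in states:
                    if (p, s) not in trans:
                        continue
                    score = V[p][pi][pj] + trans[(p, s)]
                    if score > best_score:
                        best_prev, best_score = p, score
                if best_prev is not None:
                    V[s][i][j] = best_score + emit
                    back[s][i][j] = best_prev

    # Trace back from the best-scoring state at the end of both sequences.
    s = max(states, key=lambda t: V[t][n][m])
    score = V[s][n][m]
    i, j = n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        di, dj = step[s]
        top.append(x[i - 1] if di else "-")
        bottom.append(y[j - 1] if dj else "-")
        prev = back[s][i][j]
        i, j = i - di, j - dj
        s = prev
    return "".join(reversed(top)), "".join(reversed(bottom)), score


if __name__ == "__main__":
    top, bottom, score = viterbi_pair_hmm("ACGTACGT", "ACGTCGT")
    print(top)
    print(bottom)
```

A model of this kind has no notion of repeats, so an expansion or contraction of a tandem repeat can only be explained as an ordinary gap; the abstract's proposal is to account for short tandem repeats explicitly.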

NTRFINDER: A Software Tool to Find Nested Tandem Repeats

A. A. Matroud, M. D. Hendy, C. P. Tuffley, 2010

We introduce the software tool NTRFinder to find the complex repetitive structure in DNA we call a nested tandem repeat (NTR). An NTR is a recurrence of two or more distinct tandem motifs interspersed with each other. We propose that nested tandem repeats can be used as phylogenetic and population markers. We have tested our algorithm on both real and simulated data, and present some real nested tandem repeats of interest. We discuss how the NTR found in the ribosomal DNA of taro (Colocasia esculenta) may assist in determining the cultivation prehistory of this ancient staple food crop. NTRFinder can be downloaded from http://www.maths.otago.ac.nz/~aamatroud/.

Quantitative Methods

Genetic Sequence Matching Using D4M Big Data Approaches

Stephanie Dodson, Darrell O. Ricke, Jeremy Kepner, 2014

Recent technological advances in Next Generation Sequencing tools have led to increasing speeds of DNA sample collection, preparation, and sequencing. One instrument can produce over 600 Gb of genetic sequence data in a single run. This creates new opportunities to efficiently handle the increasing workload. We propose a new method of fast genetic sequence analysis using the Dynamic Distributed Dimensional Data Model (D4M), an associative array environment for MATLAB developed at MIT Lincoln Laboratory. Based on mathematical and statistical properties, the method leverages big data techniques and the implementation of an Apache Accumulo database to accelerate computations one hundredfold over other methods. Comparisons of the D4M method with the current gold standard for sequence analysis, BLAST, show the two are comparable in the alignments they find. This paper will present an overview of the D4M genetic sequence algorithm and statistical comparisons with BLAST.

Quantitative Methods Genomics

An Information-Theoretic Approach to Network Modularity

Etay Ziv, 2004

Exploiting recent developments in information theory, we propose, illustrate, and validate a principled information-theoretic algorithm for module discovery and resulting measure of network modularity. This measure is an order parameter (a dimensionless number between 0 and 1). Comparison is made to other approaches to module-discovery and to quantifying network modularity using Monte Carlo generated Erdos-like modular networks. Finally, the Network Information Bottleneck (NIB) algorithm is applied to a number of real world networks, including the social network of coauthors at the APS March Meeting 2004.

Quantitative Methods Genomics Molecular Networks

Ask2Me VarHarmonizer: A Python-Based Tool to Harmonize Variants from Cancer Genetic Testing Reports and Map them to the ClinVar Database

Yuxi Liu, Kanhua Yin, Basanta Lamichhane, 2019

PURPOSE: The popularity of germline genetic panel testing has led to a vast accumulation of variant-level data. Variant names are not always consistent across laboratories and not easily mappable to public variant databases such as ClinVar. A tool that can automate the process of variant harmonization and mapping is needed to help clinicians ensure their variant interpretations are accurate. METHODS: We present a Python-based tool, Ask2Me VarHarmonizer, that incorporates data cleaning, name harmonization, and a four-attempt procedure for mapping to ClinVar. We applied this tool to map variants from a pilot dataset collected from 11 clinical practices. Mapping results were evaluated with and without the transcript information. RESULTS: Using Ask2Me VarHarmonizer, 4728 out of 6027 variant entries (78%) were successfully mapped to ClinVar, corresponding to 3699 mappable unique variants. With the addition of 1099 unique unmappable variants, a total of 4798 unique variants were eventually identified. 427 (9%) of these had multiple names, of which 343 (7%) had multiple names within-practice. 99% mapping consistency was observed with and without transcript information. CONCLUSION: Ask2Me VarHarmonizer aggregates and structures variant data, harmonizes names, and maps variants to ClinVar. Performing harmonization removes the ambiguity and redundancy of variants from different sources.

Quantitative Methods Genomics

iMet: A computational tool for structural annotation of unknown metabolites from tandem mass spectra

Antoni Aguilar-Mogas, 2016

Untargeted metabolomic studies are revealing large numbers of naturally occurring metabolites that cannot be characterized because their chemical structures and MS/MS spectra are not available in databases. Here we present iMet, a computational tool based on experimental tandem mass spectrometry that could potentially allow the annotation of metabolites not discovered previously. iMet uses MS/MS spectra to identify metabolites structurally similar to an unknown metabolite, and gives a net atomic addition or removal that converts the known metabolite into the unknown one. We validate the algorithm with 148 metabolites, and show that for 89% of them at least one of the top four matches identified by iMet enables the proper annotation of the unknown metabolite. iMet is freely available at http://imet.seeslab.net.

Quantitative Methods Molecular Networks
