The Parallelism Motifs of Genomic Data Analysis

Added by Aydin Buluc

Publication date 2020

fields Informatics Engineering Biology

Research language: English

Authors Katherine Yelick - Aydin Buluc - Muaaz Awan

Distributed Parallel and Cluster Computing Genomics

Genomic data sets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share this data with the research community, but some of these genomic data analysis problems require large scale computational platforms to meet both the memory and computational requirements. These applications differ from scientific simulations that dominate the workload on high end parallel systems today and place different requirements on programming support, software libraries, and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high performance genomics analysis, including alignment, profiling, clustering, and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or motifs that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing.
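As a rough illustration of the two patterns the abstract argues are missing from established motif lists, the sketch below counts k-mers (length-k substrings of sequencing reads) in two ways: once with a hash table and once by sorting. The choice of k-mer counting as the example task, and every function name here, are assumptions made for illustration; the abstract does not describe this code or any specific implementation. The `owner_rank` helper is likewise a hypothetical partitioning rule showing how hashing could assign keys to processes in a distributed setting.

```python
import zlib
from collections import Counter
from itertools import groupby


def kmers(seq, k):
    """Yield every length-k substring of seq (illustrative helper)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_by_hashing(reads, k):
    """Hashing pattern: each k-mer updates its slot in a hash table."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return dict(counts)


def count_by_sorting(reads, k):
    """Sorting pattern: sort all k-mers, then count runs of equal keys."""
    all_kmers = sorted(km for read in reads for km in kmers(read, k))
    return {km: sum(1 for _ in run) for km, run in groupby(all_kmers)}


def owner_rank(kmer, num_ranks):
    """Hypothetical rule mapping a k-mer to the process that would own it.

    CRC32 is used instead of Python's built-in hash() because the latter
    is randomized per process for strings.
    """
    return zlib.crc32(kmer.encode("ascii")) % num_ranks


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGT"]  # toy input, not real data
    print(count_by_hashing(reads, 3))
    print(count_by_sorting(reads, 3))
    print({km: owner_rank(km, 4) for km in count_by_hashing(reads, 3)})
```

Both functions return the same counts. In a parallel version, the hashing variant would send many small updates to whichever process owns each key, which is one example of the irregular, asynchronous communication the abstract mentions, while the sorting variant would rely on a bulk distributed sort instead.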

TRONCO: an R package for the inference of cancer progression models from heterogeneous genomic data

Luca De Sano, Giulio Caravagna, Daniele Ramazzotti, 2015

Motivation: We introduce TRONCO (TRanslational ONCOlogy), an open-source R package that implements state-of-the-art algorithms for the inference of cancer progression models from (epi)genomic mutational profiles. TRONCO can be used to extract population-level models describing the trends of accumulation of alterations in a cohort of cross-sectional samples, e.g., retrieved from publicly available databases, and individual-level models that reveal the clonal evolutionary history in single cancer patients, when multiple samples, e.g., multiple biopsies or single-cell sequencing data, are available. The resulting models can provide key hints in uncovering the evolutionary trajectories of cancer, especially for precision medicine or personalized therapy. Availability: TRONCO is released under the GPL license, hosted in the Software section at http://bimib.disco.unimib.it/, and also archived at bioconductor.org. Contact: [email protected]

Quantitative Methods Genomics Applications

The tectonic cause of mass extinctions and the genomic contribution to biodiversification

Dirson Jian Li, 2012

Despite numerous mass extinctions in the Phanerozoic eon, the overall trend in biodiversity evolution was not blocked and life has never been wiped out. Almost all possible catastrophic events (large igneous province, asteroid impact, climate change, regression and transgression, anoxia, acidification, sudden release of methane clathrate, multi-cause etc.) have been proposed to explain the mass extinctions. However, we should, above all, clarify at what timescale and at what levels the mass extinctions should be explained. Even though the mass extinctions occurred at a short timescale and at the species level, we reveal that their cause should be explained in a broader context, at the tectonic timescale and at both the molecular level and the species level. The main result of this paper is an explanation of the Phanerozoic biodiversity evolution, obtained by reconstructing the Sepkoski curve based on climatic, eustatic and genomic data. Consequently, we point out that the P-Tr extinction was caused by tectonically originated climate instability. We also clarify that the overall trend of biodiversification originated from the underlying genome size evolution, and that the fluctuation of biodiversity originated from the interactions among the Earth's spheres. Evolution at the molecular level played a significant role in the survival of life through environmental disasters.

Populations and Evolution Genomics Quantitative Methods

Memory Matching Networks for Genomic Sequence Classification

Jack Lanchantin, Ritambhara Singh, Yanjun Qi, 2017

When analyzing the genome, researchers have discovered that proteins bind to DNA based on certain patterns of the DNA sequence known as motifs. However, it is difficult to manually construct motifs due to their complexity. Recently, externally learned memory models have proven to be effective methods for reasoning over inputs and supporting sets. In this work, we present memory matching networks (MMN) for classifying DNA sequences as protein binding sites. Our model learns a memory bank of encoded motifs, which are dynamic memory modules, and then matches a new test sequence to each of the motifs to classify the sequence as a binding or nonbinding site.

Machine Learning Genomics

Comparative Analysis of Packages and Algorithms for the Analysis of Spatially Resolved Transcriptomics Data

Natalie Charitakis, Mirana Ramialison, 2021

The technology to generate Spatially Resolved Transcriptomics (SRT) data is rapidly being improved and applied to investigate a variety of biological tissues. The ability to interrogate how spatially localised gene expression can lend new insight to different tissue development is critical, but the appropriate tools to analyse this data are still emerging. This chapter reviews available packages and pipelines for the analysis of different SRT datasets with a focus on identifying spatially variable genes (SVGs) alongside other aims, while discussing the importance of and challenges in establishing a standardised ground truth in the biological data for benchmarking.

Quantitative Methods Genomics

Excess of genomic defects in a woolly mammoth on Wrangel island

Rebekah L. Rogers, Montgomery Slatkin, 2016

Woolly mammoths (Mammuthus primigenius) populated Siberia, Beringia, and North America during the Pleistocene and early Holocene. Recent breakthroughs in ancient DNA sequencing have allowed for complete genome sequencing for two specimens of woolly mammoths (Palkopoulou et al. 2015). One mammoth specimen is from a mainland population ~45,000 years ago when mammoths were plentiful. The second, a 4300 yr old specimen, is derived from an isolated population on Wrangel island where mammoths subsisted with small effective population size more than 43-fold lower than previous populations. These extreme differences in effective population size offer a rare opportunity to test nearly neutral models of genome architecture evolution within a single species. Using these previously published mammoth sequences, we identify deletions, retrogenes, and non-functionalizing point mutations. In the Wrangel island mammoth, we identify a greater number of deletions, a larger proportion of deletions affecting gene sequences, a greater number of candidate retrogenes, and an increased number of premature stop codons. This accumulation of detrimental mutations is consistent with genomic meltdown in response to low effective population sizes in the dwindling mammoth population on Wrangel island. In addition, we observe high rates of loss of olfactory receptors and urinary proteins, either because these loci are non-essential or because they were favored by divergent selective pressures in island environments. Finally, at the locus of FOXQ1 we observe two independent loss-of-function mutations, which would confer a satin coat phenotype in this island woolly mammoth.

Populations and Evolution Genomics
