Metabarcoding on amplicons is rapidly expanding as a method to produce molecular-based inventories of microbial communities. Here, we work on freshwater diatoms, which are microalgae that can be inventoried on both a morphological and a molecular basis. We have developed an algorithm, implemented in a program called diagno-syst and based on the notion of informative read, which carries out supervised clustering of reads by mapping them exactly, one by one, on all reads of a well-curated and taxonomically annotated reference database. This program has been run on an HPC (and HTC) infrastructure to address the computation load. We compare optical and molecular-based inventories on 10 samples from Lake Leman and 30 from Swedish rivers. We track all possible mismatches between the two approaches, and compare the results with standard pipelines that rely on heuristics, such as Mothur. We find that the comparison with optics is more accurate when using exact calculations, at the price of a heavier computation load. Exactness is crucial when studying the long tail of biodiversity, which may be overestimated by pipelines or algorithms that use heuristics instead (more false positives). This work supports the analysis that these methods will benefit from progress in, first, building an agreement between molecular-based and morphology-based systematics and, second, having publicly available reference databases that are as complete as possible.
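The idea of exact, read-by-read supervised assignment can be illustrated with a minimal sketch. It is not the diagno-syst implementation: the abstract does not define "informative read" or the distance used, so the sketch assumes an exact edit distance and treats a read as informative when all reference sequences within a chosen distance threshold carry the same taxon. The function names, the threshold `max_dist`, and the toy sequences are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Exact Levenshtein distance by dynamic programming (no heuristic shortcut)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def assign_read(read, references, max_dist):
    """Compare one read against every reference sequence.

    `references` is a list of (sequence, taxon) pairs. Assumption of this
    sketch: the read is informative only if every reference within
    `max_dist` belongs to a single taxon. Otherwise None is returned.
    """
    hits = {taxon for seq, taxon in references
            if edit_distance(read, seq) <= max_dist}
    return next(iter(hits)) if len(hits) == 1 else None


def inventory(reads, references, max_dist=1):
    """Count informative reads per taxon; uninformative reads are skipped."""
    counts = Counter()
    for read in reads:
        taxon = assign_read(read, references, max_dist)
        if taxon is not None:
            counts[taxon] += 1
    return counts


if __name__ == "__main__":
    # Toy reference database with made-up sequences, for illustration only.
    refs = [("ACGTACGT", "Nitzschia"),
            ("ACGTTCGT", "Navicula"),
            ("TTTTGGGG", "Fragilaria")]
    print(inventory(["TTTTGGGA", "TTTTGGGG", "ACGTACGT", "CCCCCCCC"], refs))
```

Every read is compared with every reference, so the cost grows with the product of the number of reads and the number of references. This is the kind of load that motivates running on HPC or HTC resources, and each read can be processed independently of the others.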
The drive for reproducibility in the computational sciences has provoked discussion and effort across a broad range of perspectives: technological, legislative/policy, education, and publishing. Discussion on these topics is not new, but the need to
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequentl
Proteins are the workhorses of our body. These biomolecules perform all vital cellular functions from DNA replication and general biosynthesis to metabolic signaling and environmental sensing. While static 3D structures are now readily ava
Summary: More sophisticated models are needed to address problems in bioscience, synthetic biology, and precision medicine. To help facilitate the collaboration needed for such models, the community developed the Simulation Experiment Description Mar
We introduce the software tool NTRFinder to find the complex repetitive structure in DNA we call a nested tandem repeat (NTR). An NTR is a recurrence of two or more distinct tandem motifs interspersed with each other. We propose that nested tandem re