ﻻ يوجد ملخص باللغة العربية
Motivation: Optimizing seed selection is an important problem in read mapping. The number of non-overlapping seeds a mapper selects determines the sensitivity of the mapper while the total frequency of all selected seeds determines the speed of the mapper. Modern seed-and-extend mappers usually select seeds with either an equal and fixed-length scheme or with an inflexible placement scheme, both of which limit the potential of the mapper to select less frequent seeds to speed up the mapping process. Therefore, it is crucial to develop a new algorithm that can adjust both the individual seed length and the seed placement, as well as derive less frequent seeds. Results: We present the Optimal Seed Solver (OSS), a dynamic programming algorithm that discovers the least frequently-occurring set of x seeds in an L-bp read in $O(x times L)$ operations on average and in $O(x times L^{2})$ operations in the worst case. We compared OSS against four state-of-the-art seed selection schemes and observed that OSS provides a 3-fold reduction of average seed frequency over the best previous seed selection optimizations.
Motivation: Seed filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. Read mappers 1) quickly generate possible
Motivation: Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1)
We consider a population constituted by two types of individuals; each of them can produce offspring in two different islands (as a particular case the islands can be interpreted as active or dormant individuals). We model the evolution of the popula
We investigate the behaviour of the genealogy of a Wright-Fisher population model under the influence of a strong seed-bank effect. More precisely, we consider a simple seed-bank age distribution with two atoms, leading to either classical or long ge
In this paper, we focus on the problem of stable prediction across unknown test data, where the test distribution is agnostic and might be totally different from the training one. In such a case, previous machine learning methods might exploit subtly