Efficient text indexing data structures have enabled large-scale genomic sequence analysis and are used to help solve problems ranging from assembly to read mapping. However, these data structures typically assume that the underlying reference text is static and will not change while queries are being made. Some progress has been made in exploring how certain text indices, like the suffix array, may be updated, rather than rebuilt from scratch, when the underlying reference changes. Yet these update operations can be complex in practice, difficult to implement, and come with fairly pessimistic worst-case bounds. We present a novel data structure, SkipPatch, for maintaining a k-mer-based index over a dynamically changing genome. SkipPatch pairs a hash-based k-mer index with an indexable skip list that is used to efficiently maintain the set of edits that have been applied to the original genome. SkipPatch is fast in practice, significantly outperforming the dynamic extended suffix array in terms of update and query speed.
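To make the pairing of a hash-based k-mer index with an edit structure concrete, the following is a minimal sketch and not the SkipPatch implementation. All names (`build_index`, `substitute`, `EditLog`) are hypothetical. The sketch assumes point substitutions are applied in place, and it uses a plain sorted list as a stand-in for the indexable skip list, so coordinate translation here takes linear time rather than the logarithmic time a skip list would offer. It also does not index k-mers created by insertions.

```python
from bisect import bisect_right, insort
from collections import defaultdict


def build_index(genome, k):
    """Map each k-mer to the set of start positions where it occurs."""
    index = defaultdict(set)
    for i in range(len(genome) - k + 1):
        index["".join(genome[i:i + k])].add(i)
    return index


def substitute(genome, index, pos, base, k):
    """Apply a point substitution, re-indexing only the k-mers covering pos."""
    lo = max(0, pos - k + 1)          # first k-mer start that covers pos
    hi = min(pos, len(genome) - k)    # last k-mer start that covers pos
    for i in range(lo, hi + 1):       # drop the stale k-mers
        index["".join(genome[i:i + k])].discard(i)
    genome[pos] = base
    for i in range(lo, hi + 1):       # add the k-mers as they now read
        index["".join(genome[i:i + k])].add(i)


class EditLog:
    """Sorted-list stand-in (an assumption) for the indexable skip list.

    Each entry (p, n) records n bases inserted immediately before
    original position p.
    """

    def __init__(self):
        self.edits = []

    def record_insertion(self, orig_pos, length):
        insort(self.edits, (orig_pos, length))

    def to_current(self, orig_pos):
        """Translate an original coordinate into the edited genome."""
        n = bisect_right(self.edits, (orig_pos, float("inf")))
        return orig_pos + sum(length for _, length in self.edits[:n])


genome = list("ACGTACGT")
index = build_index(genome, 4)
substitute(genome, index, 5, "T", 4)   # genome now reads ACGTATGT

log = EditLog()
log.record_insertion(4, 2)             # two bases inserted before position 4
```

After the substitution only the three k-mers that overlap position 5 are touched, which is the reason a hash-based index is cheap to update. The edit log keeps the index in original coordinates and shifts them on demand.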
To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we analyzed novel high-quality genome sequences of three gray wolves, one from each of three putative centers of dog domestication, two ancient
Being able to store and transmit human genome sequences is an important part in genomic research and industrial applications. The complete human genome has 3.1 billion base pairs (haploid), and storing the entire genome naively takes about 3 GB, whic
Data on the number of Open Reading Frames (ORFs) coded by genomes from the 3 domains of Life show some notable general features including essential differences between the Prokaryotes and Eukaryotes, with the number of ORFs growing linearly with tota
Engineering the entire genome of an organism enables large-scale changes in organization, function, and external interactions, with significant implications for industry, medicine, and the environment. Improvements to DNA synthesis and organism engin
The problem of the directionality of genome evolution is studied from the information-theoretic view. We propose that the function-coding information quantity of a genome always grows in the course of evolution through sequence duplication, expansion