
ComHapDet: A Spatial Community Detection Algorithm for Haplotype Assembly

Published by Abishek Sankararaman
Publication date: 2019
Research field: Informatics Engineering
Research language: English





Background: Haplotypes, the ordered lists of single nucleotide variations that distinguish chromosomal sequences from their homologous pairs, may reveal an individual's susceptibility to hereditary and complex diseases and affect how our bodies respond to therapeutic drugs. Reconstructing haplotypes of an individual from short sequencing reads is an NP-hard problem that becomes even more challenging in the case of polyploids. While increasing the lengths of sequencing reads and insert sizes helps improve the accuracy of reconstruction, it also exacerbates the computational complexity of the haplotype assembly task. This has motivated the pursuit of algorithmic frameworks capable of accurate yet efficient assembly of haplotypes from high-throughput sequencing data.

Results: We propose a novel graphical representation of sequencing reads and pose the haplotype assembly problem as an instance of community detection on a spatial random graph. To this end, we construct a graph where each read is a node with an unknown community label associating the read with the haplotype it samples. Haplotype reconstruction can then be thought of as a two-step procedure: first, one recovers the community labels on the nodes (i.e., the reads), and then uses the estimated labels to assemble the haplotypes. Based on this observation, we propose ComHapDet, a novel assembly algorithm for diploid and polyploid haplotypes which allows both biallelic and multi-allelic variants.

Conclusions: Performance of the proposed algorithm is benchmarked on simulated data as well as experimental data obtained by sequencing Chromosome 5 of tetraploid biallelic Solanum tuberosum (potato). The results demonstrate the efficacy of the proposed method and show that it compares favorably with existing techniques.
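
The second step of the two-step procedure described in the abstract (turning estimated read labels into haplotypes) can be illustrated with a minimal sketch. The abstract does not specify how ComHapDet performs this step, so the per-position majority vote below is an assumption chosen for simplicity, and the function name `assemble_haplotypes`, the read representation (a dict from SNP index to allele), and the `missing` placeholder are all hypothetical. The community labels are taken as given; recovering them is the harder first step and is not shown.

```python
from collections import Counter


def assemble_haplotypes(reads, labels, num_haplotypes, num_snps, missing="-"):
    """Build one consensus haplotype per community by majority vote.

    reads:  list of dicts mapping SNP index -> observed allele
            (a read only lists the positions it covers)
    labels: list of ints, the estimated community label of each read
    Assumption: majority voting is an illustrative choice, not
    necessarily the assembly rule used by ComHapDet.
    """
    # One allele counter per (haplotype, SNP position) pair.
    counts = [[Counter() for _ in range(num_snps)]
              for _ in range(num_haplotypes)]

    # Each read votes, at every position it covers, for its community.
    for read, label in zip(reads, labels):
        for pos, allele in read.items():
            counts[label][pos][allele] += 1

    # Take the most frequent allele; mark uncovered positions as missing.
    haplotypes = []
    for k in range(num_haplotypes):
        hap = []
        for pos in range(num_snps):
            votes = counts[k][pos]
            hap.append(votes.most_common(1)[0][0] if votes else missing)
        haplotypes.append(hap)
    return haplotypes


# Toy diploid example: five reads over three biallelic SNPs.
reads = [{0: 0, 1: 1}, {1: 1, 2: 0}, {0: 1, 1: 0}, {1: 0, 2: 1}, {0: 0, 2: 0}]
labels = [0, 0, 1, 1, 0]
print(assemble_haplotypes(reads, labels, num_haplotypes=2, num_snps=3))
```

Because alleles are stored as arbitrary values rather than bits, the same sketch covers multi-allelic sites, and raising `num_haplotypes` covers the polyploid case.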




Read also

Xing Su, Shan Xue, Fanzhen Liu (2021)
A community reveals the features and connections of its members that are different from those in other communities in a network. Detecting communities is of great significance in network analysis. Beyond the classical spectral clustering and statistical inference methods, deep learning techniques for community detection have developed significantly in recent years, owing to their advantages in handling high-dimensional network data. Hence, a comprehensive overview of the latest progress in community detection through deep learning is timely for both academics and practitioners. This survey devises and proposes a new taxonomy covering different categories of the state-of-the-art methods, including deep learning-based models upon deep neural networks, deep nonnegative matrix factorization and deep sparse filtering. The main category, i.e., deep neural networks, is further divided into convolutional networks, graph attention networks, generative adversarial networks and autoencoders. The survey also summarizes the popular benchmark data sets, model evaluation metrics, and open-source implementations to address experimentation settings. We then discuss the practical applications of community detection in various domains and point to implementation scenarios. Finally, we outline future directions by suggesting challenging topics in this fast-growing deep learning field.
Di Jin, Zhizhi Yu, Pengfei Jiao (2021)
Community detection, a fundamental task for network analysis, aims to partition a network into multiple sub-structures to help reveal their latent functions. Community detection has been extensively studied in and broadly applied to many real-world network problems. Classical approaches to community detection typically utilize probabilistic graphical models and adopt a variety of prior knowledge to infer community structures. As the problems that network methods try to solve and the network data to be analyzed become increasingly sophisticated, new approaches have also been proposed and developed, particularly those that utilize deep learning and convert networked data into low-dimensional representations. Despite all the recent advances, there is still a lack of insightful understanding of the theoretical and methodological underpinnings of community detection, which will be critically important for the future development of network analysis. In this paper, we develop and present a unified architecture of network community-finding methods to characterize the state of the art in the field of community detection. Specifically, we provide a comprehensive review of the existing community detection methods and introduce a new taxonomy that divides the existing methods into two categories, namely probabilistic graphical models and deep learning. We then discuss in detail the main idea behind each method in the two categories. Furthermore, to promote future development of community detection, we release several benchmark datasets from several problem domains and highlight their applications to various network analysis tasks. We conclude with discussions of the challenges of the field and suggestions of possible directions for future research.
Detecting communities in large-scale networks is a challenging task when each vertex may belong to multiple communities, as is often the case in social networks. The multiple memberships of vertices and thus the strong overlaps among communities render many detection algorithms invalid. We develop a Partial Community Merger Algorithm (PCMA) for detecting communities with significant overlaps as well as slightly overlapping and disjoint ones. It is a bottom-up approach based on properly reassembling partial information of communities revealed in ego networks of vertices to reconstruct complete communities. Noise control and merger order are the two key issues in implementing this idea. We propose a novel similarity measure between two merged communities that can suppress noise and an efficient algorithm that recursively merges the most similar pair of communities. The validity and accuracy of PCMA are tested against two benchmarks and compared to four existing algorithms. It is the most efficient one, with linear complexity, and it outperforms the compared algorithms when vertices have multiple memberships. PCMA is applied to two huge online social networks, Friendster and Sina Weibo. Millions of communities are detected, and they are of higher quality than the corresponding metadata groups. We find that the latter should not be regarded as the ground truth of structural communities. The significant overlapping pattern found in the detected communities confirms the need for new algorithms, such as PCMA, to handle multiple memberships of vertices in social networks.
Graphs representing real-world systems may be studied through their underlying community structure. A community in a network is an intuitive idea for which there is no consensus on an objective mathematical definition. The most widely used metric for detecting communities is modularity, though many disadvantages of this parameter have already been noted in the literature. In this work, we present a new approach based on a different metric: surprise. Moreover, the biases of different community detection algorithms and benchmark networks are thoroughly studied, identified, and discussed.
There has been a surge of interest in community detection in homogeneous single-relational networks which contain only one type of nodes and edges. However, many real-world systems are naturally described as heterogeneous multi-relational networks which contain multiple types of nodes and edges. In this paper, we propose a new method for detecting communities in such networks. Our method is based on optimizing the composite modularity, which is a new modularity proposed for evaluating partitions of a heterogeneous multi-relational network into communities. Our method is parameter-free, scalable, and suitable for various networks with general structure. We demonstrate that it outperforms the state-of-the-art techniques in detecting pre-planted communities in synthetic networks. Applied to a real-world Digg network, it successfully detects meaningful communities.
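
Several of the abstracts above refer to modularity as a quality metric for a partition. As a rough illustration, the sketch below computes the standard Newman-Girvan modularity of an undirected, unweighted graph, Q = sum over communities c of [e_c / m - (d_c / 2m)^2], where m is the number of edges, e_c the number of edges inside c, and d_c the total degree of c. This is the classical single-relational quantity, not the composite modularity or the surprise metric proposed in those papers, and the function name and the edge-list input format are assumptions made for the example.

```python
def modularity(edges, labels):
    """Newman-Girvan modularity of a partition.

    edges:  list of (u, v) pairs of an undirected, unweighted graph
            (assumed to contain no duplicate edges)
    labels: dict mapping each node to its community label
    """
    m = len(edges)
    if m == 0:
        return 0.0

    intra = {}      # community -> number of edges with both ends inside it
    degree_sum = {}  # community -> sum of degrees of its nodes
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        # Every edge contributes one unit of degree to each endpoint.
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1

    # Observed intra-community edge fraction minus its expected value
    # under a degree-preserving random rewiring.
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Two triangles joined by a single edge, split into their natural halves.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, labels))
```

Placing every node in a single community yields a modularity of zero, which is why the metric is only informative when comparing non-trivial partitions.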
