A Comprehensive Survey on Community Detection with Deep Learning

Published by Fanzhen Liu

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Xing Su - Shan Xue - Fanzhen Liu

A community reveals the features and connections of its members that are different from those in other communities in a network. Detecting communities is of great significance in network analysis. Despite the classical spectral clustering and statistical inference methods, we notice a significant development of deep learning techniques for community detection in recent years with their advantages in handling high dimensional network data. Hence, a comprehensive overview of community detection's latest progress through deep learning is timely to both academics and practitioners. This survey devises and proposes a new taxonomy covering different categories of the state-of-the-art methods, including deep learning-based models upon deep neural networks, deep nonnegative matrix factorization and deep sparse filtering. The main category, i.e., deep neural networks, is further divided into convolutional networks, graph attention networks, generative adversarial networks and autoencoders. The survey also summarizes the popular benchmark data sets, model evaluation metrics, and open-source implementations to address experimentation settings. We then discuss the practical applications of community detection in various domains and point to implementation scenarios. Finally, we outline future directions by suggesting challenging topics in this fast-growing deep learning field.

222 - Di Jin , Zhizhi Yu , Pengfei Jiao 2021

Community detection, a fundamental task for network analysis, aims to partition a network into multiple sub-structures to help reveal their latent functions. Community detection has been extensively studied in and broadly applied to many real-world network problems. Classical approaches to community detection typically utilize probabilistic graphical models and adopt a variety of prior knowledge to infer community structures. As the problems that network methods try to solve and the network data to be analyzed become increasingly more sophisticated, new approaches have also been proposed and developed, particularly those that utilize deep learning and convert networked data into low dimensional representation. Despite all the recent advancement, there is still a lack of insightful understanding of the theoretical and methodological underpinning of community detection, which will be critically important for future development of the area of network analysis. In this paper, we develop and present a unified architecture of network community-finding methods to characterize the state-of-the-art of the field of community detection. Specifically, we provide a comprehensive review of the existing community detection methods and introduce a new taxonomy that divides the existing methods into two categories, namely probabilistic graphical model and deep learning. We then discuss in detail the main idea behind each method in the two categories. Furthermore, to promote future development of community detection, we release several benchmark datasets from several problem domains and highlight their applications to various network analysis tasks. We conclude with discussions of the challenges of the field and suggestions of possible directions for future research.

Social and Information Networks, Artificial Intelligence, Machine Learning

A Comprehensive Survey on Graph Anomaly Detection with Deep Learning

88 - Xiaoxiao Ma , Jia Wu , Shan Xue 2021

Anomalies represent rare observations (e.g., data records or events) that deviate significantly from others. Over several decades, the burst of information has attracted more attention on anomalies because of their significance in a wide range of disciplines. Anomaly detection, which aims to identify rare observations, is among the most vital tasks in the world, and has shown its power in preventing detrimental events, such as financial fraud, network intrusion, and social spam. The detection task is typically solved by identifying outlying data points in the feature space and inherently overlooks the relational information in real-world data. Graphs have been prevalently used to represent the structural information, which raises the graph anomaly detection problem - identifying anomalous graph objects (i.e., nodes, edges and sub-graphs) in a single graph, or anomalous graphs in a database/set of graphs. However, conventional anomaly detection techniques cannot tackle this problem well because of the complexity of graph data. With the advent of deep learning, graph anomaly detection with deep learning has received growing attention recently. In this survey, we aim to provide a systematic and comprehensive review of the contemporary deep learning techniques for graph anomaly detection. We compile open-sourced implementations, public datasets, and commonly-used evaluation metrics to provide affluent resources for future studies. More importantly, we highlight twelve extensive future research directions according to our survey results covering unsolved and emerging research problems and real-world applications. With this survey, our goal is to create a one-stop-shop that provides a unified understanding of the problem categories and existing approaches, publicly available hands-on resources, and high-impact open challenges for graph anomaly detection using deep learning.

Machine Learning

Zombie Account Detection Based on Community Detection and Uneven Assignation PageRank

274 - Qiu Yaowen , Li Yin , Lu Yanchang 2021

On social media, there are a large number of potential zombie accounts, which may have a negative impact on public opinion. Traditionally, the PageRank algorithm is used to detect zombie accounts. However, two problems arise: it requires a large amount of RAM to store the adjacency matrix or adjacency list, and the importance values may be approximately zero for large graphs. To solve the first problem, since the structure of social media makes the graph divisible, we applied the Louvain community detection algorithm to decompose the whole graph into 1,002 subgraphs. The modularity of 0.58 shows the result is effective. To solve the second problem, we performed the uneven assignation PageRank algorithm to calculate the importance of each node in each community. Then, a threshold is set to distinguish zombie accounts from normal accounts. The result shows that about 20% of the accounts in the dataset are zombie accounts and that they are concentrated in tier-one cities in China such as Beijing, Shanghai, and Guangzhou. In the future, a classification algorithm with semi-supervised learning can be used to detect zombie accounts.

Social and Information Networks, Artificial Intelligence, Discrete Mathematics
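
The pipeline outlined in the zombie-account abstract above (partition the graph, check modularity, score nodes per community, apply a threshold) might be sketched roughly as follows. This is only an illustration under several assumptions: the partition is taken as given rather than computed with Louvain, plain power-iteration PageRank stands in for the paper's "uneven assignation" variant (whose details the abstract does not give), and flagging nodes that score below the threshold is a guess at the direction of the comparison. The function names `modularity`, `pagerank`, and `low_rank_nodes_per_community` are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of [ L_c / m - (d_c / 2m)^2 ], where L_c is the
    number of edges inside c, d_c the total degree of c, and m the edge count.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Standard power-iteration PageRank on an undirected graph.

    Assumption: this is NOT the paper's uneven assignation variant.
    """
    nodes = list(nodes)
    n = len(nodes)
    neighbors = {node: [] for node in nodes}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[x] for x in nodes if not neighbors[x])
        rank = {
            x: (1 - damping) / n
            + damping * (sum(rank[j] / len(neighbors[j]) for j in neighbors[x])
                         + dangling / n)
            for x in nodes
        }
    return rank


def low_rank_nodes_per_community(edges, community_of, threshold):
    """Run PageRank inside each community and flag nodes below the threshold."""
    members = defaultdict(set)
    for node, c in community_of.items():
        members[c].add(node)
    flagged = set()
    for nodes in members.values():
        # Only edges with both endpoints in the community are kept.
        sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
        scores = pagerank(nodes, sub_edges)
        flagged |= {x for x, score in scores.items() if score < threshold}
    return flagged
```

Scoring each community separately means only one subgraph has to be held in memory at a time, and each score is normalized over a small node set rather than the whole graph, which is how the abstract motivates the decomposition.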

A Comprehensive Survey on Schema-based Event Extraction with Deep Learning

148 - Qian Li , Hao Peng , Jianxin Li 2021

Schema-based event extraction is a critical technique to apprehend the essential content of events promptly. With the rapid development of deep learning technology, event extraction technology based on deep learning has become a research hotspot. Numerous methods, datasets, and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing the state-of-the-art approaches, focusing on deep learning-based models. We summarize the task definition, paradigm, and models of schema-based event extraction and then discuss each of these in detail. We introduce benchmark datasets that support tests of predictions and evaluation metrics. A comprehensive comparison between different techniques is also provided in this survey. Finally, we conclude by summarizing future research directions facing the research area.

Computation and Language

ComHapDet: A Spatial Community Detection Algorithm for Haplotype Assembly

421 - Abishek Sankararaman , Haris Vikalo , François Baccelli 2019

Background: Haplotypes, the ordered lists of single nucleotide variations that distinguish chromosomal sequences from their homologous pairs, may reveal an individual's susceptibility to hereditary and complex diseases and affect how our bodies respond to therapeutic drugs. Reconstructing haplotypes of an individual from short sequencing reads is an NP-hard problem that becomes even more challenging in the case of polyploids. While increasing lengths of sequencing reads and insert sizes helps improve accuracy of reconstruction, it also exacerbates computational complexity of the haplotype assembly task. This has motivated the pursuit of algorithmic frameworks capable of accurate yet efficient assembly of haplotypes from high-throughput sequencing data. Results: We propose a novel graphical representation of sequencing reads and pose the haplotype assembly problem as an instance of community detection on a spatial random graph. To this end, we construct a graph where each read is a node with an unknown community label associating the read with the haplotype it samples. Haplotype reconstruction can then be thought of as a two-step procedure: first, one recovers the community labels on the nodes (i.e., the reads), and then uses the estimated labels to assemble the haplotypes. Based on this observation, we propose ComHapDet - a novel assembly algorithm for diploid and polyploid haplotypes which allows both biallelic and multi-allelic variants. Conclusions: Performance of the proposed algorithm is benchmarked on simulated as well as experimental data obtained by sequencing Chromosome 5 of tetraploid biallelic Solanum-Tuberosum (Potato). The results demonstrate the efficacy of the proposed method and that it compares favorably with the existing techniques.

Social and Information Networks, Information Theory, Machine Learning