
Directed degree corrected mixed membership model and estimating community memberships in directed networks

Author: Huan Qing
Published: 2021
Language: English





This paper considers the problem of modeling and estimating community memberships of nodes in a directed network where every row (column) node is associated with a vector determining its membership in each row (column) community. To model such a directed network, we propose the directed degree corrected mixed membership (DiDCMM) model, which accounts for degree heterogeneity. DiDCMM is identifiable under popular conditions for mixed membership networks when considering degree heterogeneity. Based on the cone structure inherent in the normalized version of the left singular vectors and the simplex structure inherent in the right singular vectors of the population adjacency matrix, we build an efficient algorithm called DiMSC to infer the community membership vectors for both row nodes and column nodes. By taking advantage of DiMSC's equivalence algorithm, which returns the same estimations as DiMSC, and of the recent development on row-wise singular vector deviation, we show that the proposed algorithm is asymptotically consistent under mild conditions by providing error bounds for the inferred membership vectors of each row node and each column node under DiDCMM. The theory is supplemented by a simulation study.
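As a rough illustration of the spectral quantities the abstract refers to, the sketch below computes the leading K left and right singular vectors of a small directed adjacency matrix and row-normalizes the left ones. It is not DiMSC itself: the abstract does not specify how the cone and simplex structures are used afterwards, so those steps are omitted. The toy matrix `A`, the choice `K = 2`, and the helper names `top_k_singular_vectors` and `row_normalize` are assumptions made for this example only.

```python
import numpy as np

def top_k_singular_vectors(A, K):
    """Return the leading K left singular vectors, singular values,
    and right singular vectors of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :K], s[:K], Vt[:K, :].T

def row_normalize(M, eps=1e-12):
    """Scale each row of M to unit Euclidean norm.
    Rows with (near-)zero norm are divided by eps instead of zero."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.maximum(norms, eps)

# Hypothetical toy directed network: 5 row nodes, 4 column nodes.
# A[i, j] = 1 means there is an edge from row node i to column node j.
A = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
], dtype=float)

K = 2  # assumed number of communities for this illustration
U, s, V = top_k_singular_vectors(A, K)

# Row-normalized left singular vectors (the "normalized version" in the abstract).
U_star = row_normalize(U)
```

Row normalization removes the per-row scaling, which is one common way to handle degree heterogeneity on the row side; the right singular vectors `V` are left unnormalized here, matching the abstract's description of a simplex structure on that side.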




Read also

Huan Qing, Jingli Wang (2020)
Community detection has recently become an attractive research area in network analysis. Here, under the degree-corrected mixed membership (DCMM) model, we propose an efficient approach called mixed regularized spectral clustering (Mixed-RSC for short) based on the regularized Laplacian matrix. Mixed-RSC is designed based on an ideal cone structure of the variant for the eigen-decomposition of the population regularized Laplacian matrix. We show that the algorithm is asymptotically consistent under mild conditions by providing error bounds for the inferred membership vector of each node. As a byproduct of our bound, we provide the theoretically optimal choice for the regularization parameter τ. To demonstrate the performance of our method, we apply it alongside previous benchmark methods on both simulated and real-world networks. To our knowledge, this is the first work to design a spectral clustering algorithm for the mixed membership community detection problem under the DCMM model based on the regularized Laplacian matrix.
Huan Qing, Jingli Wang (2020)
Community detection has been well studied in network analysis, and one popular technique is spectral clustering, which is fast and statistically analyzable for detecting clusters in given networks. But the more realistic case of mixed membership community detection remains a challenge. In this paper, we propose a new spectral clustering method, Mixed-SLIM, for mixed membership community detection. Mixed-SLIM is designed based on the symmetrized Laplacian inverse matrix (SLIM) (Jing et al. 2021) under the degree-corrected mixed membership (DCMM) model. We show that this algorithm and its regularized version Mixed-SLIM_τ are asymptotically consistent under mild conditions. Meanwhile, we provide Mixed-SLIM_appro and its regularized version Mixed-SLIM_τappro by approximating the SLIM matrix when dealing with large networks in practice. These four Mixed-SLIM methods outperform state-of-the-art methods in simulations and on substantial empirical datasets for both community detection and mixed membership community detection problems.
Huan Qing, Jingli Wang (2020)
For the community detection problem, spectral clustering is a widely used method for detecting clusters in networks. In this paper, we propose an improved spectral clustering (ISC) approach under the degree corrected stochastic block model (DCSBM). ISC is designed based on the k-means clustering algorithm applied to the weighted leading K + 1 eigenvectors of a regularized Laplacian matrix, where the weights are their corresponding eigenvalues. Theoretical analysis of ISC shows that under mild conditions ISC yields stable consistent community detection. Numerical results show that ISC outperforms classical spectral clustering methods for community detection on both simulated networks and eight empirical networks. In particular, ISC provides a significant improvement on two weak signal networks, Simmons and Caltech, with error rates of 121/1137 and 96/590, respectively.
Huan Qing, Jingli Wang (2021)
The mixed membership problem for undirected networks has been well studied in network analysis in recent years. However, the more general case of mixed membership for directed networks remains a challenge. Here, we propose an interpretable model: the bipartite mixed membership stochastic blockmodel (BiMMSB for short) for directed mixed membership networks. BiMMSB allows the row nodes and column nodes of the adjacency matrix to be different, and these nodes may have distinct community structures in a directed network. We also develop an efficient spectral algorithm called BiMPCA to estimate the mixed memberships for both row nodes and column nodes in a directed network. We show that the approach is asymptotically consistent under BiMMSB. We demonstrate the advantages of BiMMSB with applications to a small-scale simulation study, the directed Political blogs network and the Papers Citations network.
In a graph, a community may be loosely defined as a group of nodes that are more closely connected to one another than to the rest of the graph. While there are a variety of metrics that can be used to specify the quality of a given community, one common theme is that flows tend to stay within communities. Hence, we expect cycles to play an important role in community detection. For undirected graphs, the importance of triangles -- undirected 3-cycles -- has been known for a long time and can be used to improve community detection. In directed graphs, the situation is more nuanced. The smallest cycle is simply two nodes with a reciprocal connection, and using information about reciprocation has been shown to improve community detection. Our new idea is based on the four types of directed triangles that contain cycles. To identify communities in directed networks, then, we propose an undirected edge-weighting scheme based on the type of the directed triangles in which edges are involved. We also propose a new metric for the quality of the communities that is based on the number of 3-cycles that are split across communities. To demonstrate the impact of our new weighting, we use the standard METIS graph partitioning tool to determine communities and show experimentally that the resulting communities lead to fewer 3-cycles being cut. The magnitude of the effect varies between a 10% and 50% reduction, and we also find evidence that this weighting scheme improves a task where plausible ground-truth communities are known.
