ترغب بنشر مسار تعليمي؟ اضغط هنا

Bipartite mixed membership stochastic blockmodel

336   0   0.0 ( 0 )
 نشر من قبل Huan Qing
 تاريخ النشر 2021
والبحث باللغة English




اسأل ChatGPT حول البحث

Mixed membership problem for undirected network has been well studied in network analysis recent years. However, the more general case of mixed membership for directed network remains a challenge. Here, we propose an interpretable model: bipartite mixed membership stochastic blockmodel (BiMMSB for short) for directed mixed membership networks. BiMMSB allows that row nodes and column nodes of the adjacency matrix can be different and these nodes may have distinct community structure in a directed network. We also develop an efficient spectral algorithm called BiMPCA to estimate the mixed memberships for both row nodes and column nodes in a directed network. We show that the approach is asymptotically consistent under BiMMSB. We demonstrate the advantages of BiMMSB with applications to a small-scale simulation study, the directed Political blogs network and the Papers Citations network.



قيم البحث

اقرأ أيضاً

136 - Huan Qing , Jingli Wang 2020
For community detection problem, spectral clustering is a widely used method for detecting clusters in networks. In this paper, we propose an improved spectral clustering (ISC) approach under the degree corrected stochastic block model (DCSBM). ISC i s designed based on the k-means clustering algorithm on the weighted leading K + 1 eigenvectors of a regularized Laplacian matrix where the weights are their corresponding eigenvalues. Theoretical analysis of ISC shows that under mild conditions the ISC yields stable consistent community detection. Numerical results show that ISC outperforms classical spectral clustering methods for community detection on both simulated and eight empirical networks. Especially, ISC provides a significant improvement on two weak signal networks Simmons and Caltech, with error rates of 121/1137 and 96/590, respectively.
In bipartite networks, community structures are restricted to being disassortative, in that nodes of one type are grouped according to common patterns of connection with nodes of the other type. This makes the stochastic block model (SBM), a highly f lexible generative model for networks with block structure, an intuitive choice for bipartite community detection. However, typical formulations of the SBM do not make use of the special structure of bipartite networks. Here we introduce a Bayesian nonparametric formulation of the SBM and a corresponding algorithm to efficiently find communities in bipartite networks which parsimoniously chooses the number of communities. The biSBM improves community detection results over general SBMs when data are noisy, improves the model resolution limit by a factor of $sqrt{2}$, and expands our understanding of the complicated optimization landscape associated with community detection tasks. A direct comparison of certain terms of the prior distributions in the biSBM and a related high-resolution hierarchical SBM also reveals a counterintuitive regime of community detection problems, populated by smaller and sparser networks, where nonhierarchical models outperform their more flexible counterpart.
469 - Huan Qing 2021
This paper considers the problem of modeling and estimating community memberships of nodes in a directed network where every row (column) node is associated with a vector determining its membership in each row (column) community. To model such direct ed network, we propose directed degree corrected mixed membership (DiDCMM) model by considering degree heterogeneity. DiDCMM is identifiable under popular conditions for mixed membership network when considering degree heterogeneity. Based on the cone structure inherent in the normalized version of the left singular vectors and the simplex structure inherent in the right singular vectors of the population adjacency matrix, we build an efficient algorithm called DiMSC to infer the community membership vectors for both row nodes and column nodes. By taking the advantage of DiMSCs equivalence algorithm which returns same estimations as DiMSC and the recent development on row-wise singular vector deviation, we show that the proposed algorithm is asymptotically consistent under mild conditions by providing error bounds for the inferred membership vectors of each row node and each column node under DiDCMM. The theory is supplemented by a simulation study.
Much of the data being created on the web contains interactions between users and items. Stochastic blockmodels, and other methods for community detection and clustering of bipartite graphs, can infer latent user communities and latent item clusters from this interaction data. These methods, however, typically ignore the items contents and the information they provide about item clusters, despite the tendency of items in the same latent cluster to share commonalities in content. We introduce content-augmented stochastic blockmodels (CASB), which use item content together with user-item interaction data to enhance the user communities and item clusters learned. Comparisons to several state-of-the-art benchmark methods, on datasets arising from scientists interacting with scientific articles, show that content-augmented stochastic blockmodels provide highly accurate clusters with respect to metrics representative of the underlying community structure.
With ever-increasing amounts of online information available, modeling and predicting individual preferences-for books or articles, for example-is becoming more and more important. Good predictions enable us to improve advice to users, and obtain a b etter understanding of the socio-psychological processes that determine those preferences. We have developed a collaborative filtering model, with an associated scalable algorithm, that makes accurate predictions of individuals preferences. Our approach is based on the explicit assumption that there are groups of individuals and of items, and that the preferences of an individual for an item are determined only by their group memberships. Importantly, we allow each individual and each item to belong simultaneously to mixtures of different groups and, unlike many popular approaches, such as matrix factorization, we do not assume implicitly or explicitly that individuals in each group prefer items in a single group of items. The resulting overlapping groups and the predicted preferences can be inferred with a expectation-maximization algorithm whose running time scales linearly (per iteration). Our approach enables us to predict individual preferences in large datasets, and is considerably more accurate than the current algorithms for such large datasets.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا