Impact of regularization on spectral clustering under the mixed membership stochastic block model


Abstract in English

Mixed membership community detection is a challenging problem in network analysis. To estimate node memberships and to study the impact of regularization on spectral clustering under the mixed membership stochastic block (MMSB) model, this article proposes two efficient spectral clustering approaches based on the regularized Laplacian matrix: Simplex Regularized Spectral Clustering (SRSC) and Cone Regularized Spectral Clustering (CRSC). SRSC and CRSC are built on the ideal simplex structure and the ideal cone structure, respectively, that appear in variants of the eigendecomposition of the population regularized Laplacian matrix. We show that both approaches are asymptotically consistent under mild conditions by providing error bounds for the inferred membership vector of each node under MMSB. Through the theoretical analysis, we give upper and lower bounds for the regularizer $\tau$. By introducing a parametric convergence probability, we can see directly that when $\tau$ is large the two methods may still attain low error rates, but with a smaller probability. We therefore suggest an empirically optimal choice of $\tau = O(\log(n))$, where $n$ is the number of nodes, for sparse networks. The two proposed approaches are applied successfully to synthetic and empirical networks, with encouraging results compared with several benchmark methods.
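To make the role of the regularizer concrete, the following is a minimal sketch of the first step shared by regularized spectral methods: forming a regularized Laplacian and extracting its leading eigenvectors. It assumes the commonly used definition $L_\tau = D_\tau^{-1/2} A D_\tau^{-1/2}$ with $D_\tau = D + \tau I$, which the abstract does not state explicitly, and it uses $\tau = \log(n)$ as one instance of the $O(\log(n))$ choice. The function names are hypothetical, and the simplex and cone vertex-hunting steps of SRSC and CRSC are not shown.

```python
import numpy as np


def regularized_laplacian(A, tau=None):
    """Return D_tau^{-1/2} A D_tau^{-1/2} with D_tau = D + tau * I.

    Assumption: this is the usual regularized Laplacian; the abstract
    does not spell out the definition.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if tau is None:
        # One instance of the O(log(n)) choice suggested in the abstract.
        tau = np.log(n)
    deg = A.sum(axis=1)                  # node degrees
    inv_sqrt = 1.0 / np.sqrt(deg + tau)  # diagonal of D_tau^{-1/2}
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]


def leading_eigenvectors(L, K):
    """Return the K eigenpairs of symmetric L with largest |eigenvalue|."""
    vals, vecs = np.linalg.eigh(L)
    idx = np.argsort(np.abs(vals))[::-1][:K]
    return vals[idx], vecs[:, idx]
```

With $\tau > 0$, rows for isolated or very low-degree nodes remain finite, which is the practical motivation for regularizing sparse networks in this sketch.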
