Sampling Graphlets of Multi-layer Networks: A Restricted Random Walk Approach


Added by Yuedong Xu

Publication date: 2020

Fields: Informatics Engineering

Language: English

Authors: Simiao Jiao, Zihui Xue, Xiaowei Chen

Social and Information Networks


Graphlets are induced subgraph patterns that are crucial to understanding the structure and function of a large network. Considerable effort has been devoted to calculating graphlet statistics, where random walk based approaches are commonly used to access restricted graphs through the available application programming interfaces (APIs). However, most of them consider only individual networks and overlook the strong coupling between different networks. In this paper, we estimate the graphlet concentration in multi-layer networks with real-world applications. An inter-layer edge connects two nodes in different layers if they belong to the same person. Access to a multi-layer network is restrictive in the sense that the upper layer allows random walk sampling, whereas the nodes of lower layers can be accessed only through the inter-layer edges and only support random node or edge sampling. To cope with this new challenge, we define a suite of two-layer graphlets and propose a novel random walk sampling algorithm to estimate the proportion of all the 3-node graphlets. An analytical bound on the sampling steps is proved to guarantee the convergence of our unbiased estimator. We further generalize our algorithm to explore the tradeoff between the estimation accuracies of different graphlets when the sample size is split across different layers. Experimental evaluation on real-world and synthetic multi-layer networks demonstrates the accuracy and high efficiency of our unbiased estimators.
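To make the quantity being estimated concrete, the following is a minimal sketch of what "graphlet concentration" means for 3-node graphlets, computed by brute force on a tiny single-layer graph. It is not the paper's sampling algorithm and it ignores the multi-layer setting entirely; the function name `three_node_graphlet_concentration` and the toy edge list are assumptions made for illustration.

```python
from itertools import combinations


def three_node_graphlet_concentration(edges):
    """Exact share of wedges and triangles among connected 3-node induced subgraphs.

    Illustrative brute force over all node triples; only practical for tiny graphs.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    wedges = 0
    triangles = 0
    for a, b, c in combinations(sorted(adj), 3):
        # Number of edges present among the three nodes (0 to 3).
        k = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if k == 3:
            triangles += 1
        elif k == 2:
            wedges += 1  # an open path on three nodes

    total = wedges + triangles
    if total == 0:
        return {"wedge": 0.0, "triangle": 0.0}
    return {"wedge": wedges / total, "triangle": triangles / total}


# Hypothetical toy graph: one triangle (1, 2, 3) with a pendant node 4.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
concentration = three_node_graphlet_concentration(toy_edges)
```

A sampling-based estimator such as the one described in the abstract aims to approximate these proportions without enumerating every triple, which is infeasible when the graph can only be reached through a restricted API.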


Sampling Online Social Networks by Random Walk with Indirect Jumps

Junzhou Zhao, Pinghui Wang, John C.S. Lui (2017)

Random walk-based sampling methods are gaining popularity and importance in characterizing large networks. While powerful, they suffer from the slow mixing problem when the graph is loosely connected, which results in poor estimation accuracy. Random walk with jumps (RWwJ) can address the slow mixing problem, but it is inapplicable if the graph does not support uniform vertex sampling (UNI). In this work, we develop methods that can efficiently sample a graph without requiring UNI while still enjoying benefits similar to those of RWwJ. We observe that many graphs under study, called target graphs, do not exist in isolation. In many situations, a target graph is related to an auxiliary graph and a bipartite graph, and together they form a better connected two-layered network structure. This new viewpoint brings extra benefits to graph sampling: if directly sampling a target graph is difficult, we can sample it indirectly with the assistance of the other two graphs. We propose a series of new graph sampling techniques that exploit such a two-layered network structure to estimate target graph characteristics. Experiments conducted on both synthetic and real-world networks demonstrate the effectiveness and usefulness of these new techniques.

Social and Information Networks; Physics and Society

Walk, Not Wait: Faster Sampling Over Online Social Networks

Azade Nazi, Zhuojie Zhou, Saravanan Thirumuruganathan (2014)

In this paper, we introduce a novel, general-purpose technique for faster sampling of nodes over an online social network. Specifically, unlike traditional random walks, which wait for the sampling distribution to converge to a predetermined target distribution (a waiting process that incurs a high query cost), we develop WALK-ESTIMATE, which starts with a much shorter random walk and then proactively estimates the sampling probability of the node taken before using acceptance-rejection sampling to adjust the sampling probability to the predetermined target distribution. We present a novel backward random walk technique that provides provably unbiased estimates of the sampling probability, and we demonstrate the superiority of WALK-ESTIMATE over traditional random walks through theoretical analysis and extensive experiments over real-world online social networks.
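The acceptance-rejection step mentioned above can be sketched as follows. This is only an illustration of the generic correction, assuming the sampling probability of every node is already known exactly; WALK-ESTIMATE instead estimates it with a backward random walk, which is not reproduced here. The function name `acceptance_probabilities` and the three-node distributions are hypothetical.

```python
def acceptance_probabilities(sampling_prob, target_prob):
    """Per-node acceptance probabilities for acceptance-rejection sampling.

    A node v drawn with probability sampling_prob[v] is kept with probability
    proportional to target_prob[v] / sampling_prob[v], scaled so that the
    largest acceptance probability equals 1.
    """
    ratios = {v: target_prob[v] / sampling_prob[v] for v in sampling_prob}
    scale = max(ratios.values())
    return {v: r / scale for v, r in ratios.items()}


# Hypothetical distribution after a short walk, skewed toward node "a".
sampling_prob = {"a": 0.5, "b": 0.3, "c": 0.2}
# Target distribution: uniform over the three nodes.
target_prob = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}

accept = acceptance_probabilities(sampling_prob, target_prob)

# Probability of drawing a node and then accepting it; after normalization
# this is the distribution of the accepted samples.
corrected = {v: sampling_prob[v] * accept[v] for v in sampling_prob}
```

In this toy case every entry of `corrected` comes out equal, so the accepted samples follow the uniform target even though the raw draws do not. The price is that some draws are rejected, which is why the quality of the sampling-probability estimate matters for query cost.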

Social and Information Networks; Physics and Society

Estimating Properties of Social Networks via Random Walk considering Private Nodes

Kazuki Nakajima, Kazuyuki Shudo (2020)

Accurately analyzing graph properties of social networks is a challenging task because of access limitations to the graph data. To address this challenge, several algorithms that obtain unbiased estimates of properties from a small number of samples via a random walk have been studied. However, existing algorithms do not consider private nodes, which hide their neighbors in real social networks, leading to some practical problems. Here we design random walk-based algorithms that accurately estimate properties without the problems caused by private nodes. First, we design a random walk-based sampling algorithm that comprises neighbor selection, to obtain samples having the Markov property, and the calculation of a weight for each sample, to correct the sampling bias. Further, for two graph property estimators, we propose weighting methods that reduce not only the sampling bias but also the estimation errors due to private nodes. The proposed algorithms improve the estimation accuracy of the existing algorithms by up to 92.6% on real-world datasets.
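The idea of weighting each sample to correct the sampling bias can be illustrated with the standard re-weighted random walk estimator of average degree. This is a minimal sketch on a graph with no private nodes, so it is not the authors' algorithm; the function names and the star graph are assumptions chosen for illustration. A simple random walk visits nodes with probability proportional to their degree, so weighting each visited node by the inverse of its degree compensates for the over-representation of high-degree nodes.

```python
import random


def random_walk(adj, start, steps, rng):
    """Simple random walk: move to a uniformly chosen neighbor at every step."""
    node = start
    visited = []
    for _ in range(steps):
        node = rng.choice(adj[node])
        visited.append(node)
    return visited


def naive_average_degree(adj, walk):
    """Plain mean of sampled degrees; biased toward high-degree nodes."""
    return sum(len(adj[v]) for v in walk) / len(walk)


def reweighted_average_degree(adj, walk):
    """Weight each sample by 1/degree, which yields a harmonic-mean estimator."""
    return len(walk) / sum(1.0 / len(adj[v]) for v in walk)


# Hypothetical star graph: hub 0 connected to leaves 1..4 (true average degree 1.6).
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
walk = random_walk(star, start=0, steps=200, rng=random.Random(0))

naive = naive_average_degree(star, walk)
reweighted = reweighted_average_degree(star, walk)
```

On this star graph the walk alternates between the hub and a leaf, so the naive mean overstates the average degree while the re-weighted estimate matches it. Handling private nodes, as the abstract describes, requires additional changes to both the neighbor selection and the weights.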

Social and Information Networks

Synwalk -- Community Detection via Random Walk Modelling

Christian Toth, Denis Helic, Bernhard C. Geiger (2021)

Complex systems, abstractly represented as networks, are ubiquitous in everyday life. Analyzing and understanding these systems requires, among other things, tools for community detection. As no single best community detection algorithm can exist, robustness across a wide variety of problem settings is desirable. In this work, we present Synwalk, a random walk-based community detection method. Synwalk builds upon a solid theoretical basis and detects communities by synthesizing the random walk induced by the given network from a class of candidate random walks. We thoroughly validate the effectiveness of our approach on synthetic and empirical networks, and compare Synwalk's performance with that of Infomap and Walktrap. Our results indicate that Synwalk performs robustly on networks with varying mixing parameters and degree distributions. We outperform Infomap on networks with a high mixing parameter, and both Infomap and Walktrap on networks with many small communities and low average degree. Our work has the potential to inspire further development of community detection via synthesis of random walks, and we provide concrete ideas for future research.

Social and Information Networks; Machine Learning; Physics and Society

Louvain-like Methods for Community Detection in Multi-Layer Networks

Sara Venturini, Andrea Cristofari, Francesco Rinaldi (2021)

In many complex systems, entities interact with each other through complicated patterns that embed different relationships, thus generating networks with multiple levels and/or multiple types of edges. When trying to improve our understanding of those complex networks, it is of paramount importance to explicitly take the multiple layers of connectivity into account in the analysis. In this paper, we focus on detecting community structures in multi-layer networks, i.e., detecting groups of well-connected nodes shared among the layers, a very popular task that poses many interesting questions and challenges. Most of the available algorithms in this context either reduce multi-layer networks to a single-layer network or try to extend algorithms for single-layer networks by using consensus clustering. These approaches have recently been criticized, however, because they ignore the connections among the different layers and hence give low accuracy. To overcome these issues, we propose new community detection methods based on tailored Louvain-like strategies that handle the multiple layers simultaneously. We consider the informative case, where all layers show a community structure, and the noisy case, where some layers only add noise to the system. We report experiments on both artificial and real-world networks showing the effectiveness of the proposed strategies.

Social and Information Networks