Network Essence: PageRank Completion and Centrality-Conforming Markov Chains

Published by: Shanghua Teng
Publication date: 2017
Research field: Informatics Engineering
Language: English
Author: Shang-Hua Teng





Jiří Matoušek (1963-2015) had many breakthrough contributions in mathematics and algorithm design. His milestone results are not only profound but also elegant. By going beyond the original objects, such as Euclidean spaces or linear programs, Jirka found the essence of the challenging mathematical/algorithmic problems as well as beautiful solutions that were natural to him, but were surprising discoveries to the field. In this short exploration article, I will first share with readers my initial encounter with Jirka and discuss one of his fundamental geometric results from the early 1990s. In the age of social and information networks, I will then turn the discussion from geometric structures to network structures, attempting to take a humble step towards the holy grail of network science, that is, to understand the network essence that underlies the observed sparse-and-multifaceted network data. I will discuss a simple result which summarizes some basic algebraic properties of personalized PageRank matrices. Unlike the traditional transitive closure of binary relations, the personalized PageRank matrices take accumulated Markovian closure of network data. Some of these algebraic properties are known in various contexts. But I hope featuring them together in a broader context will help to illustrate the desirable properties of this Markovian completion of networks, and motivate systematic developments of a network theory for understanding vast and ubiquitous multifaceted network data.




Read also

Given a graph $G$, a source node $s$ and a target node $t$, the personalized PageRank (PPR) of $t$ with respect to $s$ is the probability that a random walk starting from $s$ terminates at $t$. An important variant of the PPR query is single-source PPR (SSPPR), which enumerates all nodes in $G$, and returns the top-$k$ nodes with the highest PPR values with respect to a given source $s$. PPR in general and SSPPR in particular have important applications in web search and social networks, e.g., in Twitter's Who-To-Follow recommendation service. However, PPR computation is known to be expensive on large graphs, and resistant to indexing. Consequently, previous solutions either use heuristics, which do not guarantee result quality, or rely on the strong computing power of modern data centers, which is costly. Motivated by this, we propose effective index-free and index-based algorithms for approximate PPR processing, with rigorous guarantees on result quality. We first present FORA, an approximate SSPPR solution that combines two existing methods, Forward Push (which is fast but does not guarantee quality) and Monte Carlo Random Walk (accurate but slow), in a simple and yet non-trivial way, leading to both high accuracy and efficiency. Further, FORA includes a simple and effective indexing scheme, as well as a module for top-$k$ selection with high pruning power. Extensive experiments demonstrate that the proposed solutions are orders of magnitude more efficient than their respective competitors. Notably, on a billion-edge Twitter dataset, FORA answers a top-500 approximate SSPPR query within 1 second, using a single commodity server.
Wei Chen, Shang-Hua Teng (2016)
We study network centrality based on dynamic influence propagation models in social networks. To illustrate our integrated mathematical-algorithmic approach for understanding the fundamental interplay between dynamic influence processes and static network structures, we focus on two basic centrality measures: (a) Single Node Influence (SNI) centrality, which measures each node's significance by its influence spread; and (b) Shapley Centrality, which uses the Shapley value of the influence spread function (formulated based on a fundamental cooperative-game-theoretical concept) to measure the significance of nodes. We present a comprehensive comparative study of these two centrality measures. Mathematically, we present axiomatic characterizations, which precisely capture the essence of these two centrality measures and their fundamental differences. Algorithmically, we provide scalable algorithms for approximating them for a large family of social-influence instances. Empirically, we demonstrate their similarity and differences in a number of real-world social networks, as well as the efficiency of our scalable algorithms. Our results shed light on their applicability: SNI centrality is suitable for assessing individual influence in isolation while Shapley centrality assesses individuals' performance in group influence settings.
Most network data are collected from partially observable networks with both missing nodes and missing edges, for example, due to limited resources and privacy settings specified by users on social media. Thus, it stands to reason that inferring the missing parts of the networks by performing network completion should precede downstream applications. However, despite this need, the recovery of missing nodes and edges in such incomplete networks is an insufficiently explored problem due to the modeling difficulty, which is much more challenging than link prediction that only infers missing edges. In this paper, we present DeepNC, a novel method for inferring the missing parts of a network based on a deep generative model of graphs. Specifically, our method first learns a likelihood over edges via an autoregressive generative model, and then identifies the graph that maximizes the learned likelihood conditioned on the observable graph topology. Moreover, we propose a computationally efficient DeepNC algorithm that consecutively finds individual nodes that maximize the probability in each node generation step, as well as an enhanced version using the expectation-maximization algorithm. The runtime complexities of both algorithms are shown to be almost linear in the number of nodes in the network. We empirically demonstrate the superiority of DeepNC over state-of-the-art network completion approaches.
We study the lobby index (l-index for short) as a local node centrality measure for complex networks. The l-index is compared with degree (a local measure), betweenness and Eigenvector centralities (two global measures) in the case of a biological network (Yeast interaction protein-protein network) and a linguistic network (Moby Thesaurus II). In both networks, the l-index has poor correlation with betweenness but correlates with degree and Eigenvector. Being a local measure, the l-index has two advantages: it carries more information about a node's neighbors than degree centrality, and it requires less time to compute than Eigenvector centrality. Results suggest that the l-index produces better results than degree and Eigenvector measures for ranking purposes, making it a suitable tool for this task.
In this paper, we present a framework for studying the following fundamental question in network analysis: How should one assess the centralities of nodes in an information/influence propagation process over a social network? Our framework systematically extends a family of classical graph-theoretical centrality formulations, including degree centrality, harmonic centrality, and their sphere-of-influence generalizations, to influence-based network centralities. We further extend natural group centralities from graph models to influence models, since group cooperation is essential in social influences. This in turn enables us to assess individuals' centralities in group influence settings by applying the concept of Shapley value from cooperative game theory. Mathematically, using the property that these centrality formulations are Bayesian, we prove the following characterization theorem: Every influence-based centrality formulation in this family is the unique Bayesian centrality that conforms with its corresponding graph-theoretical centrality formulation. Moreover, the uniqueness is fully determined by the centrality formulation on the class of layered graphs, which is derived from a beautiful algebraic structure of influence instances modeled by cascading sequences. Our main mathematical result that layered graphs in fact form a basis for the space of influence-cascading-sequence profiles could also be useful in other studies of network influences. We further provide an algorithmic framework for efficient approximation of these influence-based centrality measures. Our study provides a systematic road map for comparative analyses of different influence-based centrality formulations, as well as for transferring graph-theoretical concepts to influence models.