Recent progress towards unraveling the hidden geometric organization of real multiplexes revealed significant correlations across the hyperbolic node coordinates in different network layers, which facilitated applications like trans-layer link prediction and mutual navigation. But are geometric correlations alone sufficient to explain the topological relation between the layers of real systems? Here we provide a negative answer to this question. We show that connections in real systems tend to persist from one layer to another irrespective of their hyperbolic distances. This suggests that, in addition to purely geometric aspects, the explicit link formation process in one layer impacts the topology of other layers. Based on this finding, we present a simple modification to the recently developed Geometric Multiplex Model to account for this effect, and show that the extended model can reproduce the behavior observed in real systems. We also find that link persistence is significant in all considered multiplexes and can explain their layers' high edge overlap, which cannot be explained by coordinate correlations alone. Furthermore, by taking both link persistence and hyperbolic distance correlations into account, we can improve trans-layer link prediction. These findings guide the development of multiplex embedding methods, suggesting that such methods should account for both coordinate correlations and link persistence across layers.
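The edge overlap between two layers mentioned above can be illustrated with a minimal sketch. It assumes undirected layers given as edge lists and uses the Jaccard index of the two edge sets as the overlap measure; the abstract does not state which normalization the authors use, so the function names and the normalization here are illustrative assumptions only.

```python
def normalize_edges(edges):
    """Return undirected edges as a set of frozensets, dropping self-loops."""
    return {frozenset((u, v)) for u, v in edges if u != v}


def edge_overlap(layer_a, layer_b):
    """Jaccard index of two edge sets (an assumed overlap definition).

    Returns 0.0 when both layers are empty.
    """
    a = normalize_edges(layer_a)
    b = normalize_edges(layer_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Toy two-layer multiplex on a shared node set (hypothetical data).
layer_1 = [(1, 2), (2, 3), (3, 4), (1, 4)]
layer_2 = [(2, 1), (2, 3), (4, 5)]

# Edges {1,2} and {2,3} persist across layers; five distinct edges in total.
print(edge_overlap(layer_1, layer_2))
```

A high value of such a measure is what the abstract attributes to link persistence; comparing it against the value obtained from a null model with correlated coordinates but no persistence would be the natural next step.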
As a fundamental challenge across many disciplines, link prediction aims to identify potential links in a network from incomplete observed information, with applications ranging from uncovering missing protein-protein interactions to predicting the evolution of networks. One of the most influential approaches relies on similarity indices based on common neighbors or their variations. We construct a hidden space by mapping a network into Euclidean space based solely on the network's connection structure. The reconstructed node locations agree well with the nodes' real geographical locations. The distances between nodes in our hidden space can serve as a novel similarity metric for link prediction. In addition, we combine our hidden space method with other state-of-the-art similarity methods, and the resulting hybrid substantially outperforms the existing methods in prediction accuracy. Hence, our hidden space reconstruction model provides a fresh perspective on network structure and, in particular, sheds new light on link prediction.
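For readers unfamiliar with the common-neighbors baseline referred to above, the following is a minimal sketch of that similarity index: every unconnected node pair is scored by the number of neighbors it shares, and higher-scoring pairs are ranked as more likely missing links. It is not the hidden space method of the abstract; the function names and the toy edge list are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbor_scores(edges):
    """Score each unconnected node pair by its number of shared neighbors."""
    adj = build_adjacency(edges)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, so not a prediction candidate
        scores[(u, v)] = len(adj[u] & adj[v])
    return scores


def rank_candidates(edges):
    """Return candidate pairs sorted by descending score, ties by pair name."""
    scores = common_neighbor_scores(edges)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical observed network.
observed = [("a", "b"), ("a", "c"), ("b", "c"),
            ("b", "d"), ("c", "d"), ("d", "e")]

for pair, score in rank_candidates(observed):
    print(pair, score)
```

A distance-based metric such as the one proposed in the abstract would replace the shared-neighbor count with a (negated) distance between node positions, leaving the ranking step unchanged.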
Across many scientific domains, there is a common need to automatically extract a simplified view or coarse-graining of how a complex system's components interact. This general task is called community detection in networks and is analogous to searching for clusters in independent vector data. It is common to evaluate the performance of community detection algorithms by their ability to find so-called ground truth communities. This works well in synthetic networks with planted communities because such networks' links are formed explicitly based on those known communities. However, there are no planted communities in real-world networks. Instead, it is standard practice to treat some observed discrete-valued node attributes, or metadata, as ground truth. Here, we show that metadata are not the same as ground truth, and that treating them as such induces severe theoretical and practical problems. We prove that no algorithm can uniquely solve community detection, and we prove a general No Free Lunch theorem for community detection, which implies that there can be no algorithm that is optimal for all possible community detection tasks. However, community detection remains a powerful tool and node metadata still have value, so a careful exploration of their relationship with network structure can yield insights of genuine worth. We illustrate this point by introducing two statistical techniques that can quantify the relationship between metadata and community structure for a broad class of models. We demonstrate these techniques using both synthetic and real-world networks, and for multiple types of metadata and community structure.
Bipartite networks are a common type of network data in which there are two types of vertices, and only vertices of different types can be connected. While bipartite networks exhibit community structure like their unipartite counterparts, existing approaches to bipartite community detection have drawbacks, including implicit parameter choices, loss of information through one-mode projections, and lack of interpretability. Here we solve the community detection problem for bipartite networks by formulating a bipartite stochastic block model, which explicitly includes vertex type information and may be trivially extended to $k$-partite networks. This bipartite stochastic block model yields a projection-free and statistically principled method for community detection that makes clear assumptions and parameter choices and yields interpretable results. We demonstrate this model's ability to efficiently and accurately find community structure in synthetic bipartite networks with known structure and in real-world bipartite networks with unknown structure, and we characterize its performance in practical contexts.
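The loss of information through one-mode projections noted above can be made concrete with a small sketch. It assumes an unweighted projection onto one vertex type, in which two vertices are linked whenever they share at least one neighbor of the other type. The two toy networks and all names below are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def one_mode_projection(bipartite_edges):
    """Project (top, bottom) edges onto the top vertices.

    Two top vertices are linked if they share at least one bottom neighbor.
    Edge weights are deliberately ignored (unweighted projection).
    """
    tops_by_bottom = {}
    for top, bottom in bipartite_edges:
        tops_by_bottom.setdefault(bottom, set()).add(top)
    projected = set()
    for tops in tops_by_bottom.values():
        for u, v in combinations(sorted(tops), 2):
            projected.add((u, v))
    return projected


# One bottom vertex shared by all three top vertices.
shared_group = [("u1", "x"), ("u2", "x"), ("u3", "x")]

# Three bottom vertices, each shared by a different pair of top vertices.
pairwise_groups = [("u1", "x"), ("u2", "x"),
                   ("u2", "y"), ("u3", "y"),
                   ("u1", "z"), ("u3", "z")]

# Different bipartite structures collapse to the same triangle.
print(one_mode_projection(shared_group) == one_mode_projection(pairwise_groups))
```

Because both inputs yield the same projected triangle, any method that operates only on the projection cannot distinguish them, which is the motivation for a projection-free formulation.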
Network similarity measures quantify how and when two networks are symmetrically related. They include measures of statistical association, such as pairwise distances or other correlation measures between networks or between the layers of a multiplex network. However, such measures can neither directly unveil whether there are hidden confounding network factors nor estimate when a correlation is underpinned by a causal relation. In this work we extend this pairwise conceptual framework to triplets of networks and quantify how and when a network is related to a second network directly or via the indirect mediation of, or interaction with, a third network. Accordingly, we develop a simple and intuitive set-theoretic approach to quantify mediation and suppression between networks. We validate our theory with synthetic models and further apply it to triplets of real-world networks, unveiling mediation and suppression effects which emerge when considering different modes of interaction in online social networks and different routes of information processing in the brain.
Complex network theory aims to model and analyze complex systems that consist of multiple, interdependent components. Among all studies on complex networks, topological structure analysis is of the most fundamental importance, as it represents a natural route to understanding the dynamics of networks, as well as to synthesizing or optimizing their functions. A broad spectrum of network structural patterns has been reported in the past decade, such as communities, multipartites, hubs, authorities, outliers, bow ties, and others. Here, we show that most individual real-world networks demonstrate multiplex structures. That is, a multitude of known or even unknown (hidden) patterns can coexist in the same network, and moreover they may overlap and be nested within each other to collaboratively form a heterogeneous, nested or hierarchical organization, in which different connective phenomena can be observed at different levels of granularity. In addition, we show that the multiplex structures hidden in exploratory networks can be well defined as well as effectively recognized within a unified framework consisting of a set of proposed concepts, models, and algorithms. Our findings provide strong evidence that most real-world complex systems are driven by a combination of heterogeneous mechanisms that may collaboratively shape the ubiquitous multiplex structures we currently observe. This work also contributes a mathematical tool for analyzing networks from different sources from the new perspective of unveiling multiplex structures, which will be beneficial to multiple disciplines including sociology, economics and computer science.