Many systems exhibit complex temporal dynamics due to the presence of different processes taking place simultaneously. Temporal networks provide a framework to describe the time-resolved interactions between components of a system. An important task when investigating such systems is to extract a simplified view of the temporal network, which can be done via dynamic community detection or clustering. Several works have generalized existing community detection methods for static networks to temporal networks, but they usually rely on temporal aggregation over time windows, the assumption of an underlying stationary process, or sequences of different stationary epochs. Here, we derive a method based on a dynamical process evolving on the temporal network and restricted by its activation pattern, which allows us to consider the full temporal information of the system. Our method allows for dynamics that do not necessarily reach a steady state or follow a sequence of stationary states. Our framework encompasses several well-known heuristics as special cases. We show that our method provides a natural way to disentangle the different dynamical scales present in a system. We demonstrate our method's abilities on synthetic and real-world examples.
Research into the detection of dense communities has recently attracted increasing attention within network science, and various metrics for detecting such communities have been proposed. The most popular metric, modularity, is based on the rule that links within communities are denser than links between communities, and it has become the default. However, this default metric suffers from ambiguity and, worse, all augmentations of modularity are based on a narrow intuition of what it means to form a community. We argue that in specific, but quite common, systems, links within a community are not necessarily more common than links between communities. Instead, we propose that the defining characteristic of a community is that links are more predictable within a community than between communities. In this paper, based on the effect of communities on link prediction, we propose a novel metric for community detection based directly on this feature. We find that our metric is more robust than traditional modularity. Consequently, we can evaluate the stability of the same detection algorithm across different networks. Our metric can also directly uncover false community detections and infer more statistical characteristics of detection algorithms.
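For readers unfamiliar with the baseline that the abstract above argues against, the following is a minimal sketch of the standard Newman-Girvan modularity for an undirected, unweighted graph. It is not the link-prediction metric proposed in that abstract, whose definition is not given here. The function name `modularity` and the edge-list input format are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # summed degree of the nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under degree-preserving rewiring).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (an assumption for illustration): two triangles joined by a bridge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

On this toy graph the two-triangle partition scores Q = 5/14, while placing every node in a single community scores Q = 0. The score rewards only internal link density, which is the intuition the abstract describes as too narrow.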
Time-stamped data are increasingly available for many social, economic, and information systems that can be represented as networks growing with time. The World Wide Web, social contact networks, and citation networks of scientific papers and online news articles, for example, are of this kind. Static methods can be inadequate for the analysis of growing networks as they miss essential information on the system's dynamics. At the same time, time-aware methods require the choice of an observation timescale, yet we lack principled ways to determine it. We focus on the popular community detection problem, which aims to partition a network's nodes into meaningful groups. We use a multi-layer quality function to show, on both synthetic and real datasets, that the observation timescale that leads to optimal communities is tightly related to the system's intrinsic aging timescale, which can be inferred from the time-stamped network data. The use of temporal information leads to drastically different conclusions on the community structure of real information networks, which challenges the current understanding of the large-scale organization of growing networks. Our findings indicate that before attempting to assess structural patterns of evolving networks, it is vital to uncover the timescales of the dynamical processes that generated them.
Embedding a network in hyperbolic space can reveal interesting features of the network structure, especially in terms of self-similar characteristics. The hidden metric space, which can be thought of as the underlying structure of the network, is able to preserve some interesting features generally observed in real-world networks, such as heterogeneity in the degree distribution, a high clustering coefficient, and the small-world effect. Moreover, the angular distribution of the nodes in the hyperbolic plane reveals a community structure of the embedded network. It is worth noting that, while a large body of literature compares well-known community detection algorithms, there is still no consensus on what defines an ideal community partition of a network. Moreover, heuristics for communities found on networks embedded in hyperbolic space are investigated here for the first time. We compare the partitions found on embedded networks to the partitions obtained before the embedding step, both for a synthetic network and for two real-world networks. The second part of this paper presents the application of our pipeline to a network of retweets in the context of the Italian elections. Our results uncover a community structure reflective of the political spectrum, encouraging further research on the application of community detection heuristics to graphs mapped onto hyperbolic planes.
Community detection is expensive, and the cost generally depends at least linearly on the number of vertices in the graph. We propose working with a reduced graph that has many fewer nodes but nonetheless captures key community structure. The K-core of a graph is the largest subgraph within which each node has at least K connections. We propose a framework that accelerates community detection by applying an expensive algorithm (modularity optimization, the Louvain method, spectral clustering, etc.) to the K-core and then using an inexpensive heuristic (such as local modularity maximization) to infer community labels for the remaining nodes. Our experiments demonstrate that the proposed framework can reduce the running time by more than 80% while preserving the quality of the solutions. Recent theoretical investigations provide support for using the K-core as a reduced representation.
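The two-stage framework described in the abstract above can be sketched as follows, assuming an undirected graph given as an edge list. The K-core is found by repeatedly peeling nodes of degree below K. The `detect_on_core` callable is a hypothetical placeholder for whichever expensive algorithm is chosen. The label-extension step uses a simple neighbor majority vote, which is swapped in here for illustration and is not the local modularity maximization named in the abstract. All function names are assumptions.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list (self-loops ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def k_core(adj, k):
    """Node set of the k-core, found by peeling nodes of degree < k."""
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    alive = set(adj)
    stack = [u for u in alive if degree[u] < k]
    while stack:
        u = stack.pop()
        if u not in alive:
            continue  # already peeled via an earlier stack entry
        alive.remove(u)
        for v in adj[u]:
            if v in alive:
                degree[v] -= 1
                if degree[v] < k:
                    stack.append(v)
    return alive


def extend_labels(adj, core_labels):
    """Cheap stand-in heuristic: each unlabeled node adopts the most common
    label among its already-labeled neighbors, round by round."""
    labels = dict(core_labels)
    pending = set(adj) - set(labels)
    while pending:
        newly = {}
        for u in pending:
            votes = Counter(labels[v] for v in adj[u] if v in labels)
            if votes:
                newly[u] = votes.most_common(1)[0][0]
        if not newly:
            break  # remaining nodes are not connected to any labeled node
        labels.update(newly)
        pending -= set(newly)
    return labels


def accelerated_detection(edges, k, detect_on_core):
    """Run detect_on_core (hypothetical expensive algorithm) on the k-core
    only, then extend its labels to the peeled nodes."""
    adj = build_adjacency(edges)
    core = k_core(adj, k)
    core_adj = {u: adj[u] & core for u in core}
    return extend_labels(adj, detect_on_core(core_adj))


# Toy graph (an assumption): two 4-cliques joined by a bridge, plus a few
# low-degree nodes hanging off them.
clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
clique_b = [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
edges = clique_a + clique_b + [(3, 4), (0, 8), (8, 10), (7, 9)]


def toy_detector(core_adj):
    # Stand-in for an expensive method: labels hard-coded for the toy graph.
    return {u: ("A" if u < 4 else "B") for u in core_adj}


labels = accelerated_detection(edges, k=3, detect_on_core=toy_detector)
print(sorted(labels.items()))
```

In the toy graph the 3-core keeps only the eight clique nodes, so the placeholder detector sees 8 of the 11 nodes; the three peeled nodes then inherit labels from their neighbors. The majority vote breaks ties arbitrarily, so a real application would likely want a more careful extension rule.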
Grouping objects into clusters based on similarities or weights between them is one of the most important problems in science and engineering. In this work, by extending message passing algorithms and spectral algorithms proposed for the unweighted community detection problem, we develop a non-parametric method based on statistical physics, mapping the problem to a Potts model at the critical temperature of the spin glass transition and applying belief propagation to solve for the marginals corresponding to the Boltzmann distribution. Our algorithm is robust to over-fitting and gives a principled way to determine whether there are significant clusters in the data and how many clusters there are. We apply our method to different clustering tasks and use extensive numerical experiments to illustrate the advantage of our method over existing algorithms. In the community detection problem in weighted and directed networks, we show that our algorithm significantly outperforms existing algorithms. In the clustering problem where the data are generated by mixture models in the sparse regime, we show that our method works all the way to the theoretical limit of detectability and gives accuracy very close to that of the optimal Bayesian inference. In the semi-supervised clustering problem, our method needs only a few labels to work perfectly on classic datasets. Finally, we further develop Thouless-Anderson-Palmer equations, which heavily reduce the computational complexity in dense networks but give almost the same performance as belief propagation.