ترغب بنشر مسار تعليمي؟ اضغط هنا

Collective Influence Algorithm to find influencers via optimal percolation in massively large social media

59   0   0.0 ( 0 )
 نشر من قبل Hernan A. Makse
 تاريخ النشر 2016
  مجال البحث فيزياء
والبحث باللغة English




اسأل ChatGPT حول البحث

We elaborate on a linear time implementation of the Collective Influence (CI) algorithm introduced by Morone, Makse, Nature 524, 65 (2015) to find the minimal set of influencers in a network via optimal percolation. We show that the computational complexity of CI is O(N log N) when removing nodes one-by-one, with N the number of nodes. This is made possible by using an appropriate data structure to process the CI values, and by the finite radius l of the CI sphere. Furthermore, we introduce a simple extension of CI when l is infinite, the CI propagation (CI_P) algorithm, that considers the global optimization of influence via message passing in the whole network and identifies a slightly smaller fraction of influencers than CI. Remarkably, CI_P is able to reproduce the exact analytical optimal percolation threshold obtained by Bau, Wormald, Random Struct. Alg. 21, 397 (2002) for cubic random regular graphs, leaving little improvement left for random graphs. We also introduce the Collective Immunization Belief Propagation algorithm (CI_BP), a belief-propagation (BP) variant of CI based on optimal immunization, which has the same performance as CI_P. However, this small augmented performance of the order of 1-2 % in the low influencers tail comes at the expense of increasing the computational complexity from O(N log N) to O(N^2 log N), rendering both, CI_P and CI_BP, prohibitive for finding influencers in modern-day big-data. The same nonlinear running time drawback pertains to a recently introduced BP-decimation (BPD) algorithm by Mugisha, Zhou, arXiv:1603.05781. For instance, we show that for big-data social networks of typically 200 million users (eg, active Twitter users sending 500 million tweets per day), CI finds the influencers in less than 3 hours running on a single CPU, while the BP algorithms (CI_P, CI_BP and BDP) would take more than 3,000 years to accomplish the same task.



قيم البحث

اقرأ أيضاً

The whole frame of interconnections in complex networks hinges on a specific set of structural nodes, much smaller than the total size, which, if activated, would cause the spread of information to the whole network [1]; or, if immunized, would preve nt the diffusion of a large scale epidemic [2,3]. Localizing this optimal, i.e. minimal, set of structural nodes, called influencers, is one of the most important problems in network science [4,5]. Despite the vast use of heuristic strategies to identify influential spreaders [6-14], the problem remains unsolved. Here, we map the problem onto optimal percolation in random networks to identify the minimal set of influencers, which arises by minimizing the energy of a many-body system, where the form of the interactions is fixed by the non-backtracking matrix [15] of the network. Big data analyses reveal that the set of optimal influencers is much smaller than the one predicted by previous heuristic centralities. Remarkably, a large number of previously neglected weakly-connected nodes emerges among the optimal influencers. These are topologically tagged as low-degree nodes surrounded by hierarchical coronas of hubs, and are uncovered only through the optimal collective interplay of all the influencers in the network. Eventually, the present theoretical framework may hold a larger degree of universality, being applicable to other hard optimization problems exhibiting a continuous transition from a known phase [16].
We apply statistical physics to study the task of resource allocation in random sparse networks with limited bandwidths for the transportation of resources along the links. Useful algorithms are obtained from recursive relations. Bottlenecks emerge w hen the bandwidths are small, causing an increase in the fraction of idle links. For a given total bandwidth per node, the efficiency of allocation increases with the network connectivity. In the high connectivity limit, we find a phase transition at a critical bandwidth, above which clusters of balanced nodes appear, characterised by a profile of homogenized resource allocation similar to the Maxwells construction.
Simulations of systems with quenched disorder are extremely demanding, suffering from the combined effect of slow relaxation and the need of performing the disorder average. As a consequence, new algorithms, improved implementations, and alternative and even purpose-built hardware are often instrumental for conducting meaningful studies of such systems. The ensuing demands regarding hardware availability and code complexity are substantial and sometimes prohibitive. We demonstrate how with a moderate coding effort leaving the overall structure of the simulation code unaltered as compared to a CPU implementation, very significant speed-ups can be achieved from a parallel code on GPU by mainly exploiting the trivial parallelism of the disorder samples and the near-trivial parallelism of the parallel tempering replicas. A combination of this massively parallel implementation with a careful choice of the temperature protocol for parallel tempering as well as efficient cluster updates allows us to equilibrate comparatively large systems with moderate computational resources.
We present a model that takes into account the coupling between evolutionary game dynamics and social influence. Importantly, social influence and game dynamics take place in different domains, which we model as different layers of a multiplex networ k. We show that the coupling between these dynamical processes can lead to cooperation in scenarios where the pure game dynamics predicts defection. In addition, we show that the structure of the network layers and the relation between them can further increase cooperation. Remarkably, if the layers are related in a certain way, the system can reach a polarized metastable state.These findings could explain the prevalence of polarization observed in many social dilemmas.
A Random Geometric Graph (RGG) ensemble is defined by the disordered distribution of its node locations. We investigate how this randomness drives sample-to-sample fluctuations in the dynamical properties of these graphs. We study the distributional properties of the algebraic connectivity which is informative of diffusion and synchronization timescales in graphs. We use numerical simulations to provide the first characterisation of the algebraic connectivity distribution for RGG ensembles. We find that the algebraic connectivity can show fluctuations relative to its mean on the order of $30 %$, even for relatively large RGG ensembles ($N=10^5$). We explore the factors driving these fluctuations for RGG ensembles with different choices of dimensionality, boundary conditions and node distributions. Within a given ensemble, the algebraic connectivity can covary with the minimum degree and can also be affected by the presence of density inhomogeneities in the nodal distribution. We also derive a closed-form expression for the expected algebraic connectivity for RGGs with periodic boundary conditions for general dimension.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا