ترغب بنشر مسار تعليمي؟ اضغط هنا

The art of community detection

160   0   0.0 ( 0 )
 نشر من قبل Natali Gulbahce
 تاريخ النشر 2008
  مجال البحث فيزياء
والبحث باللغة English




اسأل ChatGPT حول البحث

Networks in nature possess a remarkable amount of structure. Via a series of data-driven discoveries, the cutting edge of network science has recently progressed from positing that the random graphs of mathematical graph theory might accurately describe real networks to the current viewpoint that networks in nature are highly complex and structured entities. The identification of high order structures in networks unveils insights into their functional organization. Recently, Clauset, Moore, and Newman, introduced a new algorithm that identifies such heterogeneities in complex networks by utilizing the hierarchy that necessarily organizes the many levels of structure. Here, we anchor their algorithm in a general community detection framework and discuss the future of community detection.



قيم البحث

اقرأ أيضاً

Research into detection of dense communities has recently attracted increasing attention within network science, various metrics for detection of such communities have been proposed. The most popular metric -- Modularity -- is based on the so-called rule that the links within communities are denser than external links among communities, has become the default. However, this default metric suffers from ambiguity, and worse, all augmentations of modularity and based on a narrow intuition of what it means to form a community. We argue that in specific, but quite common systems, links within a community are not necessarily more common than links between communities. Instead we propose that the defining characteristic of a community is that links are more predictable within a community rather than between communities. In this paper, based on the effect of communities on link prediction, we propose a novel metric for the community detection based directly on this feature. We find that our metric is more robustness than traditional modularity. Consequently, we can achieve an evaluation of algorithm stability for the same detection algorithm in different networks. Our metric also can directly uncover the false community detection, and infer more statistical characteristics for detection algorithms.
Time-stamped data are increasingly available for many social, economic, and information systems that can be represented as networks growing with time. The World Wide Web, social contact networks, and citation networks of scientific papers and online news articles, for example, are of this kind. Static methods can be inadequate for the analysis of growing networks as they miss essential information on the systems dynamics. At the same time, time-aware methods require the choice of an observation timescale, yet we lack principled ways to determine it. We focus on the popular community detection problem which aims to partition a networks nodes into meaningful groups. We use a multi-layer quality function to show, on both synthetic and real datasets, that the observation timescale that leads to optimal communities is tightly related to the systems intrinsic aging timescale that can be inferred from the time-stamped network data. The use of temporal information leads to drastically different conclusions on the community structure of real information networks, which challenges the current understanding of the large-scale organization of growing networks. Our findings indicate that before attempting to assess structural patterns of evolving networks, it is vital to uncover the timescales of the dynamical processes that generated them.
380 - Hua-Wei Shen , Xue-Qi Cheng 2010
Spectral analysis has been successfully applied at the detection of community structure of networks, respectively being based on the adjacency matrix, the standard Laplacian matrix, the normalized Laplacian matrix, the modularity matrix, the correlat ion matrix and several other variants of these matrices. However, the comparison between these spectral methods is less reported. More importantly, it is still unclear which matrix is more appropriate for the detection of community structure. This paper answers the question through evaluating the effectiveness of these five matrices against the benchmark networks with heterogeneous distributions of node degree and community size. Test results demonstrate that the normalized Laplacian matrix and the correlation matrix significantly outperform the other three matrices at identifying the community structure of networks. This indicates that it is crucial to take into account the heterogeneous distribution of node degree when using spectral analysis for the detection of community structure. In addition, to our surprise, the modularity matrix exhibits very similar performance to the adjacency matrix, which indicates that the modularity matrix does not gain desired benefits from using the configuration model as reference network with the consideration of the node degree heterogeneity.
146 - Nicola Scafetta 2012
Probability distributions of human displacements has been fit with exponentially truncated Levy flights or fat tailed Pareto inverse power law probability distributions. Thus, people usually stay within a given location (for example, the city of resi dence), but with a non-vanishing frequency they visit nearby or far locations too. Herein, we show that an important empirical distribution of human displacements (range: from 1 to 1000 km) can be well fit by three consecutive Pareto distributions with simple integer exponents equal to 1, 2 and ($gtrapprox$) 3. These three exponents correspond to three displacement range zones of about 1 km $lesssim Delta r lesssim$ 10 km, 10 km $lesssim Delta r lesssim$ 300 km and 300 km $lesssim Delta r lesssim $ 1000 km, respectively. These three zones can be geographically and physically well determined as displacements within a city, visits to nearby cities that may occur within just one-day trips, and visit to far locations that may require multi-days trips. The incremental integer values of the three exponents can be easily explained with a three-scale mobility cost/benefit model for human displacements based on simple geometrical constrains. Essentially, people would divide the space into three major regions (close, medium and far distances) and would assume that the travel benefits are randomly/uniformly distributed mostly only within specific urban-like areas.
We have two main aims in this paper. First we use theories of disease spreading on networks to look at the COVID-19 epidemic on the basis of individual contacts -- these give rise to predictions which are often rather different from the homogeneous m ixing approaches usually used. Our second aim is to look at the role of social deprivation, again using networks as our basis, in the spread of this epidemic. We choose the city of Kolkata as a case study, but assert that the insights so obtained are applicable to a wide variety of urban environments which are densely populated and where social inequalities are rampant. Our predictions of hotspots are found to be in good agreement with those currently being identifed empirically as containment zones and provide a useful guide for identifying potential areas of concern.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا