ترغب بنشر مسار تعليمي؟ اضغط هنا

The Infinity Mirror Test for Graph Models

101   0   0.0 ( 0 )
 نشر من قبل Satyaki Sikdar
 تاريخ النشر 2020
والبحث باللغة English




اسأل ChatGPT حول البحث

Graph models, like other machine learning models, have implicit and explicit biases built-in, which often impact performance in nontrivial ways. The models faithfulness is often measured by comparing the newly generated graph against the source graph using any number or combination of graph properties. Differences in the size or topology of the generated graph therefore indicate a loss in the model. Yet, in many systems, errors encoded in loss functions are subtle and not well understood. In the present work, we introduce the Infinity Mirror test for analyzing the robustness of graph models. This straightforward stress test works by repeatedly fitting a model to its own outputs. A hypothetically perfect graph model would have no deviation from the source graph; however, the models implicit biases and assumptions are exaggerated by the Infinity Mirror test, exposing potential issues that were previously obscured. Through an analysis of thousands of experiments on synthetic and real-world graphs, we show that several conventional graph models degenerate in exciting and informative ways. We believe that the observed degenerative patterns are clues to the future development of better graph models.



قيم البحث

اقرأ أيضاً

Outliers are samples that are generated by different mechanisms from other normal data samples. Graphs, in particular social network graphs, may contain nodes and edges that are made by scammers, malicious programs or mistakenly by normal users. Dete cting outlier nodes and edges is important for data mining and graph analytics. However, previous research in the field has merely focused on detecting outlier nodes. In this article, we study the properties of edges and propose outlier edge detection algorithms using two random graph generation models. We found that the edge-ego-network, which can be defined as the induced graph that contains two end nodes of an edge, their neighboring nodes and the edges that link these nodes, contains critical information to detect outlier edges. We evaluated the proposed algorithms by injecting outlier edges into some real-world graph data. Experiment results show that the proposed algorithms can effectively detect outlier edges. In particular, the algorithm based on the Preferential Attachment Random Graph Generation model consistently gives good performance regardless of the test graph data. Further more, the proposed algorithms are not limited in the area of outlier edge detection. We demonstrate three different applications that benefit from the proposed algorithms: 1) a preprocessing tool that improves the performance of graph clustering algorithms; 2) an outlier node detection algorithm; and 3) a novel noisy data clustering algorithm. These applications show the great potential of the proposed outlier edge detection techniques.
Do algorithms for drawing graphs pass the Turing Test? That is, are their outputs indistinguishable from graphs drawn by humans? We address this question through a human-centred experiment, focusing on `small graphs, of a size for which it would be r easonable for someone to choose to draw the graph manually. Overall, we find that hand-drawn layouts can be distinguished from those generated by graph drawing algorithms, although this is not always the case for graphs drawn by force-directed or multi-dimensional scaling algorithms, making these good candidates for Turing Test success. We show that, in general, hand-drawn graphs are judged to be of higher quality than automatically generated ones, although this result varies with graph size and algorithm.
Graph models have long been used in lieu of real data which can be expensive and hard to come by. A common class of models constructs a matrix of probabilities, and samples an adjacency matrix by flipping a weighted coin for each entry. Examples incl ude the ErdH{o}s-R{e}nyi model, Chung-Lu model, and the Kronecker model. Here we present the HyperKron Graph model: an extension of the Kronecker Model, but with a distribution over hyperedges. We prove that we can efficiently generate graphs from this model in order proportional to the number of edges times a small log-factor, and find that in practice the runtime is linear with respect to the number of edges. We illustrate a number of useful features of the HyperKron model including non-trivial clustering and highly skewed degree distributions. Finally, we fit the HyperKron model to real-world networks, and demonstrate the models flexibility with a complex application of the HyperKron model to networks with coherent feed-forward loops.
The topological structure of complex networks has fascinated researchers for several decades, resulting in the discovery of many universal properties and reoccurring characteristics of different kinds of networks. However, much less is known today ab out the network dynamics: indeed, complex networks in reality are not static, but rather dynamically evolve over time. Our paper is motivated by the empirical observation that network evolution patterns seem far from random, but exhibit structure. Moreover, the specific patterns appear to depend on the network type, contradicting the existence of a one fits it all model. However, we still lack observables to quantify these intuitions, as well as metrics to compare graph evolutions. Such observables and metrics are needed for extrapolating or predicting evolutions, as well as for interpolating graph evolutions. To explore the many faces of graph dynamics and to quantify temporal changes, this paper suggests to build upon the concept of centrality, a measure of node importance in a network. In particular, we introduce the notion of centrality distance, a natural similarity measure for two graphs which depends on a given centrality, characterizing the graph type. Intuitively, centrality distances reflect the extent to which (non-anonymous) node roles are different or, in case of dynamic graphs, have changed over time, between two graphs. We evaluate the centrality distance approach for five evolutionary models and seven real-world social and physical networks. Our results empirically show the usefulness of centrality distances for characterizing graph dynamics compared to a null-model of random evolution, and highlight the differences between the considered scenarios. Interestingly, our approach allows us to compare the dynamics of very different networks, in terms of scale and evolution speed.
Graph clustering is an important technique to understand the relationships between the vertices in a big graph. In this paper, we propose a novel random-walk-based graph clustering method. The proposed method restricts the reach of the walking agent using an inflation function and a normalization function. We analyze the behavior of the limited random walk procedure and propose a novel algorithm for both global and local graph clustering problems. Previous random-walk-based algorithms depend on the chosen fitness function to find the clusters around a seed vertex. The proposed algorithm tackles the problem in an entirely different manner. We use the limited random walk procedure to find attracting vertices in a graph and use them as features to cluster the vertices. According to the experimental results on the simulated graph data and the real-world big graph data, the proposed method is superior to the state-of-the-art methods in solving graph clustering problems. Since the proposed method uses the embarrassingly parallel paradigm, it can be efficiently implemented and embedded in any parallel computing environment such as a MapReduce framework. Given enough computing resources, we are capable of clustering graphs with millions of vertices and hundreds millions of edges in a reasonable time.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا