
Predicting Graph Categories from Structural Properties

Published by: Karl Schmitt
Publication date: 2018
Research field: Informatics Engineering
Paper language: English





This paper has been withdrawn from arXiv.org due to a disagreement among the authors over several peer-review comments received before submission to arXiv.org. Although the current version of the paper is withdrawn, the authors did not disagree about the novel work it contains. One specific issue was the discussion of related work by Ikehara & Clauset (page 8 of the previously posted version). Peer-review comments on a similar version made ALL authors aware, before submission to arXiv.org, that the discussion misrepresented that work. However, some authors chose to post a minimally updated version to arXiv without the consent of all authors and without properly addressing this attribution issue.

Original paper abstract:

Complex networks are often categorized according to the underlying phenomena they represent, such as molecular interactions, re-tweets, and brain activity. In this work, we investigate the problem of predicting the category (domain) of arbitrary networks. This includes complex networks from different domains as well as synthetically generated graphs from five different network models. A classification accuracy of $96.6\%$ is achieved using a random forest classifier with both real and synthetic networks. This work makes two important findings. First, our results indicate that complex networks from various domains have distinct structural properties that allow us to predict, with high accuracy, the category of a new, previously unseen network. Second, synthetic graphs are trivial to classify, as the classification model can predict with near-certainty the network model used to generate each graph. Overall, the results demonstrate that networks drawn from different domains (and network models) are trivial to distinguish using only a handful of simple structural properties.
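To make the idea of classifying graphs from simple structural properties concrete, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the abstract does not list the features or the five network models used, so the two generators (Erdős–Rényi and Watts–Strogatz), the three features (mean degree, degree variance, average local clustering), and all parameter values are assumptions chosen for illustration. A nearest-centroid rule stands in for the random forest classifier named in the abstract, purely to keep the example dependency-free.

```python
import random
from statistics import mean, pvariance


def erdos_renyi(n, p, rng):
    """G(n, p): every possible edge is present independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def watts_strogatz(n, k, beta, rng):
    """Ring lattice (k nearest neighbours); each edge is rewired with probability beta."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)
    for j in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + j) % n
            if rng.random() < beta and w in adj[v]:
                candidates = [x for x in range(n) if x != v and x not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj


def structural_features(adj):
    """Return [mean degree, degree variance, average local clustering coefficient]."""
    degrees = [len(nb) for nb in adj.values()]
    clustering = []
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            clustering.append(0.0)  # convention: undefined clustering counts as 0
            continue
        nb_list = list(nb)
        links = sum(1 for i, a in enumerate(nb_list)
                    for b in nb_list[i + 1:] if b in adj[a])
        clustering.append(2 * links / (k * (k - 1)))
    return [mean(degrees), pvariance(degrees), mean(clustering)]


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def predict(centroids, x):
    """Nearest-centroid rule (an assumed stand-in for a random forest)."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], x)))


rng = random.Random(0)  # fixed seed so the run is reproducible
generators = {
    "erdos_renyi": lambda: erdos_renyi(100, 0.04, rng),
    "watts_strogatz": lambda: watts_strogatz(100, 4, 0.1, rng),
}

# Fit one centroid per network model from 20 training graphs each.
centroids = {lab: centroid([structural_features(gen()) for _ in range(20)])
             for lab, gen in generators.items()}

# Evaluate on 10 fresh graphs per model.
held_out = [(lab, structural_features(gen()))
            for lab, gen in generators.items() for _ in range(10)]
accuracy = sum(predict(centroids, x) == lab for lab, x in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

Both models are tuned to the same expected mean degree, so the separation comes from degree variance and clustering alone, which mirrors the abstract's point that a handful of simple structural properties can be enough to tell generative models apart.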


Read also

En-Yu Yu, Yan Fu, Jun-Lin Zhou (2021)
Many real-world systems can be expressed as temporal networks, with nodes playing very different roles in structure and function and edges representing the relationships between nodes. Identifying critical nodes can help us control the spread of public opinion or epidemics, predict leading figures in academia, target advertisements for various commodities, and so on. However, identifying critical nodes is difficult because the structure of a temporal network changes over time. In this paper, considering the sequential topological information of temporal networks, a novel and effective learning framework based on a combination of special GCNs and RNNs is proposed to identify the nodes with the best spreading ability. The effectiveness of the approach is evaluated with a weighted Susceptible-Infected-Recovered (SIR) model. Experimental results on four real-world temporal networks demonstrate that the proposed method outperforms both traditional and deep learning benchmark methods in terms of the Kendall $\tau$ coefficient and the top-$k$ hit rate.
Recently, there has been considerable research interest in graph clustering, which partitions data using graph information. However, one limitation of most graph-based methods is that they assume the graph structure they operate on is fixed and reliable. Inevitably, some edges in the graph are not conducive to graph clustering; we call these spurious edges. This paper is the first attempt to employ a graph pooling technique for node clustering. We propose a novel dual graph embedding network (DGEN), designed as a two-step graph encoder connected by a graph pooling layer, to learn the graph embedding. Our model assumes that if a node and its nearest neighboring node are close to the same cluster center, the node is informative and the edge between them can be considered a cluster-friendly edge. Based on this assumption, neighbor cluster pooling (NCPool) is devised to select the most informative subset of nodes and the corresponding edges, based on the distance of nodes and their nearest neighbors to the cluster centers. This can effectively alleviate the impact of spurious edges on the clustering. Finally, to obtain the cluster assignment of all nodes, a classifier is trained using the clustering results of the selected nodes. Experiments on five benchmark graph datasets demonstrate the superiority of the proposed method over state-of-the-art algorithms.
Increased availability of epidemiological data, novel digital data streams, and the rise of powerful machine learning approaches have generated a surge of research activity on real-time epidemic forecast systems. In this paper, we propose the use of a novel data source, namely retail market data, to improve seasonal influenza forecasting. Specifically, we consider supermarket retail data as a proxy signal for influenza, through the identification of sentinel baskets, i.e., products bought together by a population of selected customers. We develop a nowcasting and forecasting framework that provides estimates of influenza incidence in Italy up to 4 weeks ahead. We use a Support Vector Regression (SVR) model to produce the predictions of seasonal flu incidence. Our predictions outperform both a baseline autoregressive model and a second baseline based on product purchases. The results show quantitatively the value of incorporating retail market data in forecasting models, where it acts as a proxy that can be used for the real-time analysis of epidemics.
Software development is becoming increasingly open and collaborative with the advent of platforms such as GitHub. Given its crucial role, there is a need to better understand and model the dynamics of GitHub as a social platform. Previous work has mostly considered the dynamics of traditional social networking sites like Twitter and Facebook. We propose GitEvolve, a system to predict the evolution of GitHub repositories and the different ways in which users interact with them. To this end, we develop an end-to-end multi-task sequential deep neural network that, given some seed events, simultaneously predicts which user group will next interact with a given repository, what the type of the interaction is, and when it happens. To facilitate learning, we use graph-based representation learning to encode the relationships between repositories. We map users to groups by modelling common interests, to better predict popularity and to generalize to unseen users during inference. We introduce an artificial event type to better model the varying levels of activity of repositories in the dataset. The proposed multi-task architecture is generic and can be extended to model information diffusion in other social networks. In a series of experiments, we demonstrate the effectiveness of the proposed model using multiple metrics and baselines. Qualitative analysis of the model's ability to predict popularity and forecast trends demonstrates its applicability.
The problem of predicting people's participation in real-world events has received considerable attention, as it offers valuable insights for human behavior analysis and event-related advertisement. Today, social networks (e.g. Twitter) widely reflect large popular events, where people discuss their interests with friends. Event participants usually encourage friends to join the event, which propagates social influence through the network. In this paper, we propose to model the social influence of friends on event attendance. We consider non-geotagged posts alongside the structures of social groups to infer users' attendance. To leverage the information on network topology, we apply some recent graph embedding techniques such as node2vec, HARP and Poincaré. We describe the approach followed to design the feature space and feed it to a neural network. The performance evaluation is conducted using two large music festival datasets, namely VFestival and Creamfields. The experimental results show that our classifier outperforms the state-of-the-art baseline, with 89% accuracy observed for the VFestival dataset.
