ترغب بنشر مسار تعليمي؟ اضغط هنا

Grain: Improving Data Efficiency of Graph Neural Networks via Diversified Influence Maximization

200   0   0.0 ( 0 )
 نشر من قبل Wentao Zhang
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Data selection methods, such as active learning and core-set selection, are useful tools for improving the data efficiency of deep learning models on large-scale datasets. However, recent deep learning models have moved forward from independent and identically distributed data to graph-structured data, such as social networks, e-commerce user-item graphs, and knowledge graphs. This evolution has led to the emergence of Graph Neural Networks (GNNs) that go beyond the models existing data selection methods are designed for. Therefore, we present Grain, an efficient framework that opens up a new perspective through connecting data selection in GNNs with social influence maximization. By exploiting the common patterns of GNNs, Grain introduces a novel feature propagation concept, a diversified influence maximization objective with novel influence and diversity functions, and a greedy algorithm with an approximation guarantee into a unified framework. Empirical studies on public datasets demonstrate that Grain significantly improves both the performance and efficiency of data selection (including active learning and core-set selection) for GNNs. To the best of our knowledge, this is the first attempt to bridge two largely parallel threads of research, data selection, and social influence maximization, in the setting of GNNs, paving new ways for improving data efficiency.



قيم البحث

اقرأ أيضاً

Graph Neural Networks (GNNs) achieve an impressive performance on structured graphs by recursively updating the representation vector of each node based on its neighbors, during which parameterized transformation matrices should be learned for the no de feature updating. However, existing propagation schemes are far from being optimal since they do not fully utilize the relational information between nodes. We propose the information maximizing graph neural networks (IGNN), which maximizes the mutual information between edge states and transform parameters. We reformulate the mutual information as a differentiable objective via a variational approach. We compare our model against several recent variants of GNNs and show that our model achieves the state-of-the-art performance on multiple tasks including quantum chemistry regression on QM9 dataset, generalization capability from QM9 to larger molecular graphs, and prediction of molecular bioactivities relevant for drug discovery. The IGNN model is based on an elegant and fundamental idea in information theory as explained in the main text, and it could be easily generalized beyond the contexts of molecular graphs considered in this work. To encourage more future work in this area, all datasets and codes used in this paper will be released for public access.
83 - Xinhan Di , Pengqian Yu , Rui Bu 2019
A variety of graph neural networks (GNNs) frameworks for representation learning on graphs have been recently developed. These frameworks rely on aggregation and iteration scheme to learn the representation of nodes. However, information between node s is inevitably lost in the scheme during learning. In order to reduce the loss, we extend the GNNs frameworks by exploring the aggregation and iteration scheme in the methodology of mutual information. We propose a new approach of enlarging the normal neighborhood in the aggregation of GNNs, which aims at maximizing mutual information. Based on a series of experiments conducted on several benchmark datasets, we show that the proposed approach improves the state-of-the-art performance for four types of graph tasks, including supervised and semi-supervised graph classification, graph link prediction and graph edge generation and classification.
We propose a novel learning framework using neural mean-field (NMF) dynamics for inference and estimation problems on heterogeneous diffusion networks. Our new framework leverages the Mori-Zwanzig formalism to obtain an exact evolution equation of th e individual node infection probabilities, which renders a delay differential equation with memory integral approximated by learnable time convolution operators. Directly using information diffusion cascade data, our framework can simultaneously learn the structure of the diffusion network and the evolution of node infection probabilities. Connections between parameter learning and optimal control are also established, leading to a rigorous and implementable algorithm for training NMF. Moreover, we show that the projected gradient descent method can be employed to solve the challenging influence maximization problem, where the gradient is computed extremely fast by integrating NMF forward in time just once in each iteration. Extensive empirical studies show that our approach is versatile and robust to variations of the underlying diffusion network models, and significantly outperform existing approaches in accuracy and efficiency on both synthetic and real-world data.
158 - Qi Zhu , Yidan Xu , Haonan Wang 2020
Graph neural networks (GNNs) have been shown with superior performance in various applications, but training dedicated GNNs can be costly for large-scale graphs. Some recent work started to study the pre-training of GNNs. However, none of them provid e theoretical insights into the design of their frameworks, or clear requirements and guarantees towards the transferability of GNNs. In this work, we establish a theoretically grounded and practically useful framework for the transfer learning of GNNs. Firstly, we propose a novel view towards the essential graph information and advocate the capturing of it as the goal of transferable GNN training, which motivates the design of Ours, a novel GNN framework based on ego-graph information maximization to analytically achieve this goal. Secondly, we specify the requirement of structure-respecting node features as the GNN input, and derive a rigorous bound of GNN transferability based on the difference between the local graph Laplacians of the source and target graphs. Finally, we conduct controlled synthetic experiments to directly justify our theoretical conclusions. Extensive experiments on real-world networks towards role identification show consistent results in the rigorously analyzed setting of direct-transfering, while those towards large-scale relation prediction show promising results in the more generalized and practical setting of transfering with fine-tuning.
119 - Fan Wu , Yunhui Long , Ce Zhang 2021
Graph structured data have enabled several successful applications such as recommendation systems and traffic prediction, given the rich node features and edges information. However, these high-dimensional features and high-order adjacency informatio n are usually heterogeneous and held by different data holders in practice. Given such vertical data partition (e.g., one data holder will only own either the node features or edge information), different data holders have to develop efficient joint training protocols rather than directly transfer data to each other due to privacy concerns. In this paper, we focus on the edge privacy, and consider a training scenario where Bob with node features will first send training node features to Alice who owns the adjacency information. Alice will then train a graph neural network (GNN) with the joint information and release an inference API. During inference, Bob is able to provide test node features and query the API to obtain the predictions for test nodes. Under this setting, we first propose a privacy attack LinkTeller via influence analysis to infer the private edge information held by Alice via designing adversarial queries for Bob. We then empirically show that LinkTeller is able to recover a significant amount of private edges, outperforming existing baselines. To further evaluate the privacy leakage, we adapt an existing algorithm for differentially private graph convolutional network (DP GCN) training and propose a new DP GCN mechanism LapGraph. We show that these DP GCN mechanisms are not always resilient against LinkTeller empirically under mild privacy guarantees ($varepsilon>5$). Our studies will shed light on future research towards designing more resilient privacy-preserving GCN models; in the meantime, provide an in-depth understanding of the tradeoff between GCN model utility and robustness against potential privacy attacks.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا