Research papers and master's and doctoral theses published by Wei Shen

Reinforcement Learning for Load-balanced Parallel Particle Tracing

110 - Jiayi Xu, Hanqi Guo, Han-Wei Shen 2021

We explore an online reinforcement learning (RL) paradigm for optimizing parallel particle tracing performance in distributed-memory systems. Our method combines three novel components: (1) a workload donation model, (2) a high-order workload estimation model, and (3) a communication cost model, to optimize the performance of data-parallel particle tracing dynamically. First, we design an RL-based workload donation model. It monitors the workload of processes and creates RL agents that donate particles and data blocks from high-workload processes to low-workload processes to minimize execution time. The agents learn the donation strategy on the fly based on reward and cost functions, which are designed to account for the processes' workload change and the data transfer cost of every donation action. Second, we propose an online workload estimation model to help our RL model estimate the workload distribution of processes in future computations. Third, we design a communication cost model that considers both block and particle data exchange costs, helping the agents make effective decisions with minimized communication cost. We demonstrate that our algorithm adapts to different flow behaviors in large-scale fluid dynamics, ocean, and weather simulation data. Our algorithm improves parallel particle tracing performance in terms of parallel efficiency, load balance, and costs of I/O and communication for evaluations up to 16,384 processors.

Computer Graphics; Artificial Intelligence; Distributed, Parallel, and Cluster Computing

Anomalous Hall effect in ferrimagnetic metal RMn6Sn6 (R = Tb, Dy, Ho) with clean Mn kagome lattice

372 - Lingling Gao, Shiwei Shen, Qi Wang 2021

The kagome lattice, made of corner-sharing triangles, provides an excellent platform for hosting exotic topological quantum states. Here we systematically study the magnetic and transport properties of RMn6Sn6 (R = Tb, Dy, Ho) with a clean Mn kagome lattice. All the compounds have a collinear ferrimagnetic structure, with different easy axes at low temperature. The low-temperature magnetoresistance (MR) is positive and shows no tendency to saturate below 7 T, while the MR gradually declines and becomes negative with increasing temperature. A large intrinsic anomalous Hall conductivity of about 250 Ω⁻¹ cm⁻¹, 40 Ω⁻¹ cm⁻¹, and 95 Ω⁻¹ cm⁻¹ is observed for TbMn6Sn6, DyMn6Sn6, and HoMn6Sn6, respectively. Our results imply that the RMn6Sn6 system is an excellent platform for discovering other intimately related topological or quantum phenomena and for tuning the electronic and magnetic properties in future studies.

Superconductivity; Materials Science

Single Node Injection Attack against Graph Neural Networks

148 - Shuchang Tao, Qi Cao, Huawei Shen 2021

Node injection attack on Graph Neural Networks (GNNs) is an emerging and practical attack scenario in which the attacker injects malicious nodes, rather than modifying original nodes or edges, to degrade the performance of GNNs. However, existing node injection attacks ignore extremely limited scenarios: the injected nodes may be so numerous that they become perceptible to the target GNN. In this paper, we focus on an extremely limited scenario, the single node injection evasion attack, in which the attacker is only allowed to inject one single node during the test phase to hurt the GNN's performance. The discreteness of network structure and the coupling effect between network structure and node features bring great challenges to this extremely limited scenario. We first propose an optimization-based method to explore the performance upper bound of the single node injection evasion attack. Experimental results show that 100%, 98.60%, and 94.98% of nodes on three public datasets are successfully attacked even when injecting only one node with one edge, confirming the feasibility of the single node injection evasion attack. However, such an optimization-based method needs to be re-optimized for each attack, which is computationally prohibitive. To resolve this dilemma, we further propose a Generalizable Node Injection Attack model, namely G-NIA, to improve attack efficiency while preserving attack performance. Experiments are conducted across three well-known GNNs. Our proposed G-NIA significantly outperforms state-of-the-art baselines and is 500 times faster than the optimization-based method at inference time.

Machine Learning; Cryptography and Security

Inductive Matrix Completion Using Graph Autoencoder

240 - Wei Shen, Chuheng Zhang, Yun Tian 2021

Recently, the graph neural network (GNN) has shown great power in matrix completion by formulating a rating matrix as a bipartite graph and then predicting the link between the corresponding user and item nodes. The majority of GNN-based matrix completion methods are based on the Graph Autoencoder (GAE), which takes the one-hot index as input, maps a user (or item) index to a learnable embedding, applies a GNN to learn node-specific representations based on these learnable embeddings, and finally aggregates the representations of the target user and its corresponding item nodes to predict missing links. However, without node content (i.e., side information) for training, the user-specific (or item-specific) representation cannot be learned in the inductive setting; that is, a model trained on one group of users (or items) cannot adapt to new users (or items). To this end, we propose an inductive matrix completion method using GAE (IMC-GAE), which utilizes the GAE to learn both the user-specific (or item-specific) representation for personalized recommendation and local graph patterns for inductive matrix completion. Specifically, we design two informative node features and employ a layer-wise node dropout scheme in GAE to learn local graph patterns that can be generalized to unseen data. The main contribution of our paper is the capability to efficiently learn local graph patterns in GAE, with good scalability and superior expressiveness compared to previous GNN-based matrix completion methods. Furthermore, extensive experiments demonstrate that our model achieves state-of-the-art performance on several matrix completion benchmarks. Our official code is publicly available.

Machine Learning; Artificial Intelligence
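
The first step described in the IMC-GAE abstract above, treating a rating matrix as a bipartite graph whose missing links are the prediction targets, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `ratings_to_bipartite` and `missing_links`, the dictionary layout, and the toy ratings are all assumptions made for illustration, and no GNN or autoencoder is involved.

```python
from collections import defaultdict


def ratings_to_bipartite(ratings):
    """Build user-side and item-side adjacency maps from observed ratings.

    `ratings` maps (user, item) -> rating; pairs that are absent are
    treated as unobserved entries of the rating matrix (hypothetical layout).
    """
    user_adj = defaultdict(dict)
    item_adj = defaultdict(dict)
    for (user, item), rating in ratings.items():
        user_adj[user][item] = rating  # edge user -> item, labelled by rating
        item_adj[item][user] = rating  # same edge seen from the item side
    return dict(user_adj), dict(item_adj)


def missing_links(ratings, users, items):
    """List the user-item pairs with no observed rating (prediction targets)."""
    return [(u, i) for u in users for i in items if (u, i) not in ratings]


# Toy data, invented for this sketch.
ratings = {("u1", "i1"): 5, ("u1", "i2"): 3, ("u2", "i1"): 4}
user_adj, item_adj = ratings_to_bipartite(ratings)
candidates = missing_links(ratings, ["u1", "u2"], ["i1", "i2"])
```

In this toy case the only unobserved pair is `("u2", "i2")`, which is the link a matrix completion model would be asked to predict.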

Signed Bipartite Graph Neural Networks

91 - Junjie Huang, Huawei Shen, Qi Cao 2021

Signed networks are social networks that have both positive and negative links. Many theories and algorithms have been developed to model such networks (e.g., balance theory). However, previous work mainly focuses on unipartite signed networks, where all nodes have the same type. Signed bipartite networks differ from classical signed networks in that they contain two different node sets and signed links between the two sets. Signed bipartite networks are commonly found in many fields, including business, politics, and academia, but have been less studied. In this work, we first define the signed relationship between nodes of the same set and provide a new perspective for analyzing signed bipartite networks. We then conduct a comprehensive analysis of balance theory from two perspectives on several real-world datasets. Specifically, in the peer review dataset, we find that the ratio of balanced isomorphism in signed bipartite networks increased after the rebuttal phase. Guided by these two perspectives, we propose novel Signed Bipartite Graph Neural Networks (SBGNNs) to learn node embeddings for signed bipartite networks. SBGNNs follow the message-passing scheme of most GNNs, but we design new message functions, aggregation functions, and update functions for signed bipartite networks. We validate the effectiveness of our model on four real-world datasets on the Link Sign Prediction task, which is the main machine learning task for signed networks. Experimental results show that our SBGNN model achieves significant improvement compared with strong baseline methods, including feature-based methods and network embedding methods.

Social and Information Networks; Artificial Intelligence
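
The SBGNN abstract above mentions defining a signed relationship between nodes of the same set. The abstract does not give the definition, so the Python sketch below assumes one plausible reading: two same-set nodes agree (+1) on a shared neighbour when they link to it with the same sign, and disagree (-1) otherwise. The function name `same_set_signs` and the reviewer/paper toy data are hypothetical.

```python
from itertools import combinations


def same_set_signs(edges):
    """Derive signed relations between nodes of the same set.

    `edges` maps (u, v) -> +1 or -1, with u in one node set and v in the
    other. For every pair of u-nodes sharing a neighbour v, record the
    product of their signs towards v: +1 if they agree, -1 if they differ.
    (Assumed definition, not taken from the paper.)
    """
    by_neighbour = {}
    for (u, v), sign in edges.items():
        by_neighbour.setdefault(v, {})[u] = sign

    relations = {}
    for signed_nodes in by_neighbour.values():
        for a, b in combinations(sorted(signed_nodes), 2):
            relations.setdefault((a, b), []).append(signed_nodes[a] * signed_nodes[b])
    return relations


# Toy peer-review data: reviewers r1..r3 accept (+1) or reject (-1) papers.
edges = {
    ("r1", "p1"): 1, ("r2", "p1"): 1, ("r3", "p1"): -1,
    ("r1", "p2"): -1, ("r2", "p2"): -1,
}
relations = same_set_signs(edges)
```

Here `r1` and `r2` agree on both shared papers, while `r3` disagrees with each of them on `p1`.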

Pre-training of Temporal Convolutional Neural Networks for Popularity Prediction

67 - Qi Cao, Huawei Shen, Yuanhao Liu 2021

Predicting the popularity of online content is a fundamental problem in various application areas. One practical challenge for popularity prediction is rooted in the differing settings of prediction tasks across situations, e.g., the varying lengths of the observation time window or prediction horizon. In other words, a good model for popularity prediction should handle various tasks with different settings. However, the conventional paradigm is to train a separate prediction model for each prediction task, so the model obtained for one task is difficult to generalize to other tasks, causing a great waste of training time and computational resources. To solve this issue, we propose a novel pre-training framework for popularity prediction, aiming to pre-train a general deep representation model by learning intrinsic knowledge about popularity dynamics from readily available diffusion cascades. We design a novel pretext task for pre-training, i.e., temporal context prediction for two randomly sampled time slices of popularity dynamics, impelling the deep prediction model to effectively capture the characteristics of popularity dynamics. Taking a state-of-the-art deep model, the temporal convolutional neural network, as an instantiation of our proposed framework, experimental results on both Sina Weibo and Twitter datasets demonstrate the effectiveness and efficiency of the proposed pre-training framework for multiple popularity prediction tasks.

Social and Information Networks

TrUMAn: Trope Understanding in Movies and Animations

86 - Hung-Ting Su, Po-Wei Shen, Bing-Chen Tsai 2021

Understanding and comprehending video content is crucial for many real-world applications such as search and recommendation systems. While recent progress in deep learning has boosted performance on various tasks using visual cues, deep cognition to reason about intentions, motivation, or causality remains challenging. Existing datasets that aim to examine video reasoning capability focus on visual signals such as actions, objects, and relations, or can be answered by exploiting text bias. Observing this, we propose a novel task, along with a new dataset: Trope Understanding in Movies and Animations (TrUMAn), with 2423 videos associated with 132 tropes, intended to evaluate and develop learning systems beyond visual signals. Tropes are frequently used storytelling devices for creative works. By tackling the trope understanding task and enabling the deep cognition skills of machines, data mining applications and algorithms could be taken to the next level. To tackle the challenging TrUMAn dataset, we present a Trope Understanding and Storytelling (TrUSt) model with a new Conceptual Storyteller module, which guides the video encoder by performing video storytelling in a latent space. Experimental results demonstrate that state-of-the-art learning systems for existing tasks reach only 12.01% accuracy with raw input signals. Moreover, even in the oracle case with human-annotated descriptions, BERT contextual embedding achieves at most 28% accuracy. Our proposed TrUSt boosts model performance and reaches 13.94% accuracy. We also provide detailed analysis to pave the way for future research. TrUMAn is publicly available at: https://www.cmlab.csie.ntu.edu.tw/project/trope

Artificial Intelligence; Computer Vision and Pattern Recognition

Inductive Representation Based Graph Convolution Network for Collaborative Filtering

117 - Yunfan Wu, Qi Cao, Huawei Shen 2021

In recent years, graph neural networks (GNNs) have shown powerful ability in collaborative filtering, which is a widely adopted recommendation scenario. Without any side information, existing GNN-based methods generally learn a one-hot embedding for each user or item as the initial input representation of GNNs. However, such one-hot embedding is intrinsically transductive, leaving these methods with no inductive ability, i.e., they fail to deal with new users or new items that are unseen during training. Besides, the number of model parameters depends on the number of users and items, which is expensive and not scalable. In this paper, we give a formal definition of inductive recommendation and solve the above problems by proposing the Inductive representation based Graph Convolutional Network (IGCN) for collaborative filtering. Specifically, we design an inductive representation layer, which utilizes the interaction behavior with core users or items as the initial representation, improving general recommendation performance while providing inductive ability. Note that the number of parameters of IGCN depends only on the number of core users or items, which is adjustable and scalable. Extensive experiments on three public benchmarks demonstrate the state-of-the-art performance of IGCN in both transductive and inductive recommendation scenarios, with remarkably fewer model parameters. Our implementations, written in PyTorch, are available.

Information Retrieval

CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction

325 - Jiawei Sheng, Shu Guo, Bowen Yu 2021

Event extraction (EE) is a crucial information extraction task that aims to extract event information from texts. Most existing methods assume that events appear in sentences without overlaps, which makes them inapplicable to the complicated overlapping event extraction. This work systematically studies the realistic event overlapping problem, where a word may serve as a trigger with several types or as an argument with different roles. To tackle this problem, we propose a novel joint learning framework with cascade decoding for overlapping event extraction, termed CasEE. In particular, CasEE sequentially performs type detection, trigger extraction, and argument extraction, where the overlapped targets are extracted separately, conditioned on the specific former prediction. All the subtasks are jointly learned in one framework to capture dependencies among them. The evaluation on the public event extraction benchmark FewFC demonstrates that CasEE achieves significant improvements on overlapping event extraction over previous competitive methods.

Computation and Language

Self-supervised GANs with Label Augmentation

109 - Liang Hou, Huawei Shen, Qi Cao 2021

Recently, transformation-based self-supervised learning has been applied to generative adversarial networks (GANs) to mitigate the catastrophic forgetting problem of the discriminator by learning stable representations. However, the separate self-supervised tasks in existing self-supervised GANs cause a goal inconsistent with generative modeling, because the generator learns from classifiers that are agnostic to the generator distribution. To address this issue, we propose a novel self-supervised GAN framework with label augmentation, i.e., augmenting the GAN labels (real or fake) with the self-supervised pseudo-labels. In particular, the discriminator and the self-supervised classifier are unified to learn a single task that predicts the augmented label, so that the discriminator/classifier is aware of the generator distribution, while the generator tries to confuse the discriminator/classifier by optimizing the discrepancy between the transformed real and generated distributions. Theoretically, we prove that the generator, at the equilibrium point, converges to replicate the data distribution. Empirically, we demonstrate that the proposed method significantly outperforms competitive baselines on both generative modeling and representation learning across benchmark datasets.

Machine Learning; Computer Vision and Pattern Recognition
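
The label augmentation idea in the abstract above, combining the GAN label (real or fake) with a self-supervised pseudo-label into one classification target, can be sketched as a simple index encoding. This is an illustrative assumption rather than the paper's implementation: the function names, the choice of four transformations (for example, four image rotations), and the ordering of the combined classes are all hypothetical.

```python
def augment_label(is_real, transform_id, num_transforms):
    """Combine a real/fake flag and a transformation index into one class.

    With K transformations there are 2K augmented classes: indices
    0..K-1 stand for fake samples, K..2K-1 for real samples.
    (The ordering is an assumption made for this sketch.)
    """
    if not 0 <= transform_id < num_transforms:
        raise ValueError("transform_id must be in [0, num_transforms)")
    return int(is_real) * num_transforms + transform_id


def split_label(label, num_transforms):
    """Recover (is_real, transform_id) from an augmented class index."""
    return bool(label // num_transforms), label % num_transforms


K = 4  # assumed number of transformations, e.g. four rotations
real_rot2 = augment_label(True, 2, K)
fake_rot3 = augment_label(False, 3, K)
roundtrip = split_label(real_rot2, K)
```

A single classifier over these 2K classes plays both the discriminator and the self-supervised role, which is the unification the abstract describes.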
