ترغب بنشر مسار تعليمي؟ اضغط هنا

92 - Tao Zhang 2021
We construct double cross biproduct and bi-cycle bicrossproduct Lie bialgebras from braided Lie bialgebras. The main result generalizes Majids matched pairs of Lie algebras, Drinfelds quantum double, and Masuokas cross product Lie bialgebras.
107 - Tao Zhang 2021
We introduce the conception of matched pairs of $(H, beta)$-Lie algebras, construct an $(H, beta)$-Lie algebra through them. We prove that the cocycle twist of a matched pair of $(H, beta)$-Lie algebras can also be matched.
Graph neural networks (GNNs) have recently achieved state-of-the-art performance in many graph-based applications. Despite the high expressive power, they typically need to perform an expensive recursive neighborhood expansion in multiple training ep ochs and face a scalability issue. Moreover, most of them are inflexible since they are restricted to fixed-hop neighborhoods and insensitive to actual receptive field demands for different nodes. We circumvent these limitations by introducing a scalable and flexible Graph Attention Multilayer Perceptron (GAMLP). With the separation of the non-linear transformation and feature propagation, GAMLP significantly improves the scalability and efficiency by performing the propagation procedure in a pre-compute manner. With three principled receptive field attention, each node in GAMLP is flexible and adaptive in leveraging the propagated features over the different sizes of reception field. We conduct extensive evaluations on the three large open graph benchmarks (e.g., ogbn-papers100M, ogbn-products and ogbn-mag), demonstrating that GAMLP not only achieves the state-of-art performance, but also additionally provide high scalability and efficiency.
308 - Tao Zhang , Ling Zhang , Ruyi Xie 2021
The extending structures and unified products for Malcev algebras are developed. Some special cases of unified products such as crossed products and matched pair of Malcev algebras are studied. It is proved that the extending structures can be classi fied by some non-abelian cohomology theory. One dimensional flag extending structures of Malcev algebras are also investigated.
With the continuous improvement of the performance of object detectors via advanced model architectures, imbalance problems in the training process have received more attention. It is a common paradigm in object detection frameworks to perform multi- scale detection. However, each scale is treated equally during training. In this paper, we carefully study the objective imbalance of multi-scale detector training. We argue that the loss in each scale level is neither equally important nor independent. Different from the existing solutions of setting multi-task weights, we dynamically optimize the loss weight of each scale level in the training process. Specifically, we propose an Adaptive Variance Weighting (AVW) to balance multi-scale loss according to the statistical variance. Then we develop a novel Reinforcement Learning Optimization (RLO) to decide the weighting scheme probabilistically during training. The proposed dynamic methods make better utilization of multi-scale training loss without extra computational complexity and learnable parameters for backpropagation. Experiments show that our approaches can consistently boost the performance over various baseline detectors on Pascal VOC and MS COCO benchmark.
140 - Wentao Zhang 2021
This paper studies coordination problem for time-varying networks suffering from antagonistic information, quantified by scaling parameters. By such a manner, interacting property of the participating individuals and antagonistic information can be q uantified in a fully decoupled perspective, thus benefiting from merely directed spanning tree hypothesis is needed, in the sense of usual algebraic graph theory. We start with rigorous argument on the existence of weighted gain, and then derive relation among weighted gain, scaling parameter and Laplacian matrix guaranteeing antagonistic information cannot diverge system state. Based on these arguments, we devise coordination algorithm constrained by topology-dependent average time condition, thus relaxing the examination of directed spanning tree requirement for the union graph that is usually intractable. Moreover, the induced theoretical results are applied to time-varying networks with several mutually uninfluenced agents, in accompanying with some discussions and comparisons with respect to the existing developments.
Modern processors have suffered a deluge of danger- ous side channel and speculative execution attacks that exploit vulnerabilities rooted in branch predictor units (BPU). Many such attacks exploit the shared use of the BPU between un- related proces ses, which allows malicious processes to retrieve sensitive data or enable speculative execution attacks. Attacks that exploit collisions between different branch instructions inside the BPU are among the most dangerous. Various protections and mitigations are proposed such as CPU microcode updates, secured cache designs, fencing mechanisms, invisible speculations. While some effectively mitigate speculative execution attacks, they overlook BPU as an attack vector, leaving BPU prone to malicious collisions and resulting critical penalty such as advanced micro-op cache attacks. Furthermore, some mitigations severely hamper the accuracy of the BPU resulting in increased CPU performance overhead. To address these, we present the secret token branch predictor unit (STBPU), a branch predictor design that mitigates collision-based speculative execution attacks and BPU side channel whilst incurring little to no performance overhead. STBPU achieves this by customizing inside data representations for each software entity requiring isolation. To prevent more advanced attacks, STBPU monitors hardware events and preemptively changes how STBPU data is stored and interpreted.
Graph Neural Networks (GNNs) have already been widely applied in various graph mining tasks. However, they suffer from the shallow architecture issue, which is the key impediment that hinders the model performance improvement. Although several releva nt approaches have been proposed, none of the existing studies provides an in-depth understanding of the root causes of performance degradation in deep GNNs. In this paper, we conduct the first systematic experimental evaluation to present the fundamental limitations of shallow architectures. Based on the experimental results, we answer the following two essential questions: (1) what actually leads to the compromised performance of deep GNNs; (2) when we need and how to build deep GNNs. The answers to the above questions provide empirical insights and guidelines for researchers to design deep and well-performed GNNs. To show the effectiveness of our proposed guidelines, we present Deep Graph Multi-Layer Perceptron (DGMLP), a powerful approach (a paradigm in its own right) that helps guide deep GNN designs. Experimental results demonstrate three advantages of DGMLP: 1) high accuracy -- it achieves state-of-the-art node classification performance on various datasets; 2) high flexibility -- it can flexibly choose different propagation and transformation depths according to graph size and sparsity; 3) high scalability and efficiency -- it supports fast training on large-scale graphs. Our code is available in https://github.com/zwt233/DGMLP.
Data selection methods, such as active learning and core-set selection, are useful tools for improving the data efficiency of deep learning models on large-scale datasets. However, recent deep learning models have moved forward from independent and i dentically distributed data to graph-structured data, such as social networks, e-commerce user-item graphs, and knowledge graphs. This evolution has led to the emergence of Graph Neural Networks (GNNs) that go beyond the models existing data selection methods are designed for. Therefore, we present Grain, an efficient framework that opens up a new perspective through connecting data selection in GNNs with social influence maximization. By exploiting the common patterns of GNNs, Grain introduces a novel feature propagation concept, a diversified influence maximization objective with novel influence and diversity functions, and a greedy algorithm with an approximation guarantee into a unified framework. Empirical studies on public datasets demonstrate that Grain significantly improves both the performance and efficiency of data selection (including active learning and core-set selection) for GNNs. To the best of our knowledge, this is the first attempt to bridge two largely parallel threads of research, data selection, and social influence maximization, in the setting of GNNs, paving new ways for improving data efficiency.
Graph neural networks (GNNs) have been widely used in many graph-based tasks such as node classification, link prediction, and node clustering. However, GNNs gain their performance benefits mainly from performing the feature propagation and smoothing across the edges of the graph, thus requiring sufficient connectivity and label information for effective propagation. Unfortunately, many real-world networks are sparse in terms of both edges and labels, leading to sub-optimal performance of GNNs. Recent interest in this sparse problem has focused on the self-training approach, which expands supervised signals with pseudo labels. Nevertheless, the self-training approach inherently cannot realize the full potential of refining the learning performance on sparse graphs due to the unsatisfactory quality and quantity of pseudo labels. In this paper, we propose ROD, a novel reception-aware online knowledge distillation approach for sparse graph learning. We design three supervision signals for ROD: multi-scale reception-aware graph knowledge, task-based supervision, and rich distilled knowledge, allowing online knowledge transfer in a peer-teaching manner. To extract knowledge concealed in the multi-scale reception fields, ROD explicitly requires individual student models to preserve different levels of locality information. For a given task, each student would predict based on its reception-scale knowledge, while simultaneously a strong teacher is established on-the-fly by combining multi-scale knowledge. Our approach has been extensively evaluated on 9 datasets and a variety of graph-based tasks, including node classification, link prediction, and node clustering. The result demonstrates that ROD achieves state-of-art performance and is more robust for the graph sparsity.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا