Research papers and master's and doctoral theses published by Yukun Yang

Backpropagated Neighborhood Aggregation for Accurate Training of Spiking Neural Networks

148 - Yukun Yang, Wenrui Zhang, Peng Li 2021

While backpropagation (BP) has been applied to spiking neural networks (SNNs) with encouraging results, a key challenge is to backpropagate a continuous-valued loss over layers of spiking neurons that exhibit discontinuous, all-or-none firing activities. Existing methods deal with this difficulty by introducing compromises that come with their own limitations, leading to potential performance degradation. We propose a novel BP-like method, called neighborhood aggregation (NA), which computes accurate error gradients guiding weight updates that may lead to discontinuous modifications of firing activities. NA achieves this goal by aggregating finite differences of the loss over multiple perturbed membrane potential waveforms in the neighborhood of the present membrane potential of each neuron, while utilizing a new membrane potential distance function. Our experiments show that the proposed NA algorithm delivers state-of-the-art performance for SNN training on several datasets.

Neural and Evolutionary Computing

Temporal Surrogate Back-propagation for Spiking Neural Networks

152 - Yukun Yang 2020

Spiking neural networks (SNNs) are usually more energy-efficient than artificial neural networks (ANNs), and the way they work closely resembles the brain. Back-propagation (BP) has proven powerful for training ANNs in recent years. However, since spike behavior is non-differentiable, BP cannot be applied to SNNs directly. Although prior works demonstrated several ways to approximate the BP gradient in both the spatial and temporal directions, either through surrogate gradients or randomness, they omitted the temporal dependency introduced by the reset mechanism between time steps. In this article, we aim for theoretical completeness and investigate the effect of the missing term thoroughly. With the temporal dependency of the reset mechanism added, the new algorithm is more robust to learning-rate adjustments on a toy dataset but does not show much improvement on larger learning tasks such as CIFAR-10. Empirically, the benefits of the missing term are not worth the additional computational overhead. In many cases, the missing term can be ignored.

Neural and Evolutionary Computing, Machine Learning
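
The reset-induced temporal dependency described in the abstract above can be illustrated with a minimal sketch. It assumes a hard-reset leaky integrate-and-fire neuron and a rectangular surrogate gradient; the function names, the `decay` and `threshold` values, and the surrogate shape are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def surrogate_grad(v, threshold=1.0, width=1.0):
    """Rectangular surrogate for d(spike)/d(v) (an assumed choice):
    1/width inside a window centred on the threshold, 0 elsewhere."""
    return (np.abs(v - threshold) < width / 2).astype(float) / width

def lif_forward(inputs, decay=0.9, threshold=1.0):
    """Hard-reset LIF dynamics (assumed model):
    v[t] = decay * v[t-1] * (1 - s[t-1]) + x[t],  s[t] = 1 if v[t] >= threshold."""
    v_prev, s_prev = 0.0, 0.0
    potentials, spikes = [], []
    for x in inputs:
        v = decay * v_prev * (1.0 - s_prev) + x
        s = float(v >= threshold)
        potentials.append(v)
        spikes.append(s)
        v_prev, s_prev = v, s
    return np.array(potentials), np.array(spikes)

def temporal_jacobian(v_prev, s_prev, decay=0.9, threshold=1.0, include_reset=True):
    """d v[t] / d v[t-1] for the model above.
    The leak term is the part commonly kept; the reset term is the extra
    dependency through s[t-1], approximated with the surrogate gradient."""
    leak_term = decay * (1.0 - s_prev)
    if not include_reset:
        return leak_term
    reset_term = -decay * v_prev * surrogate_grad(v_prev, threshold)
    return leak_term + reset_term

potentials, spikes = lif_forward([0.6, 0.6, 0.6])
without_reset = temporal_jacobian(0.6, 0.0, include_reset=False)
with_reset = temporal_jacobian(0.6, 0.0, include_reset=True)
```

Under these assumptions the two Jacobians differ whenever the previous membrane potential lies inside the surrogate window, which is the kind of extra term the abstract reports as having little practical benefit on larger tasks.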

The Lyman-alpha Emission in Solar Flares. I. A Statistical Study on Its Relationship with the 1–8 Å Soft X-ray Emission

46 - Zhichen Jing, Wuqi Pan, Yukun Yang 2020

We statistically study the relationship between the Lyman-alpha (Lyα) and 1–8 Å soft X-ray (SXR) emissions from 658 M- and X-class solar flares observed by the Geostationary Operational Environmental Satellite during 2006–2016. Based on the peak times of the two waveband emissions, we divide the flares into three types. Type I (III) has an earlier (a later) peak time in the Lyα emission than in the SXR emission, while type II has nearly the same peak time (within the time resolution of 10 s) in the Lyα and SXR emissions. Among these 658 flares, we find 505 (76.8%) type I flares, 10 (1.5%) type II flares, and 143 (21.7%) type III flares, and the three types appear to have no dependence on the flare duration, flare location, or solar cycle. Besides the main peak, the Lyα emission of the three flare types also shows sub-peaks, which can appear in the impulsive or gradual phase of the flare. We find that the main-peak (for type I) and sub-peak (for type III) Lyα emissions that appear in the impulsive phase generally follow the Neupert effect. This indicates that such Lyα emissions are related to nonthermal electron beam heating. By contrast, the main-peak (for type III) and sub-peak (for type I) Lyα emissions that appear in the gradual phase are thought to be contributed primarily by the thermal plasma as it cools down.

Solar and Stellar Astrophysics
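
The three-way classification by peak time described in the abstract above can be written as a short sketch. The function name `classify_flare`, its argument names, and the choice to treat a difference of exactly 10 s as "within the time resolution" are assumptions made for illustration; the abstract does not specify how the boundary case is handled.

```python
def classify_flare(lya_peak_s, sxr_peak_s, resolution_s=10.0):
    """Return "I", "II", or "III" from the two peak times (in seconds).

    Type I:   Lyman-alpha peaks earlier than the soft X-ray emission.
    Type II:  the two peaks coincide within the time resolution.
    Type III: Lyman-alpha peaks later than the soft X-ray emission.
    """
    diff = lya_peak_s - sxr_peak_s
    if abs(diff) <= resolution_s:  # boundary handling is an assumption
        return "II"
    return "I" if diff < 0 else "III"

early = classify_flare(100.0, 160.0)       # Lyman-alpha peaks 60 s before SXR
coincident = classify_flare(100.0, 105.0)  # peaks 5 s apart
late = classify_flare(200.0, 100.0)        # Lyman-alpha peaks 100 s after SXR
```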

Defending Neural Backdoors via Generative Distribution Modeling

69 - Ximing Qiao, Yukun Yang, Hai Li 2019

Neural backdoor attacks are emerging as a severe security threat to deep learning, while the capability of existing defense methods is limited, especially for complex backdoor triggers. In this work, we explore the space formed by the pixel values of all possible backdoor triggers. An original trigger used by an attacker to build the backdoored model represents only a point in this space. It is then generalized into a distribution of valid triggers, all of which can influence the backdoored model. Thus, previous methods that model only one point of the trigger distribution are not sufficient. Obtaining the entire trigger distribution, e.g., via generative modeling, is key to effective defense. However, existing generative modeling techniques for image generation are not applicable to the backdoor scenario because the trigger distribution is completely unknown. In this work, we propose the max-entropy staircase approximator (MESA), an algorithm for high-dimensional sampling-free generative modeling, and use it to recover the trigger distribution. We also develop a defense technique to remove the triggers from the backdoored model. Our experiments on the CIFAR-10/100 datasets demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method.

Machine Learning

SwiftNet: Using Graph Propagation as Meta-knowledge to Search Highly Representative Neural Architectures

81 - Hsin-Pai Cheng, Tunhou Zhang, Yukun Yang 2019

Designing neural architectures for edge devices is subject to constraints of accuracy, inference latency, and computational cost. Traditionally, researchers manually craft deep neural networks to meet the needs of mobile devices. Neural Architecture Search (NAS) was proposed to automate neural architecture design without requiring extensive domain expertise or significant manual effort. Recent works used NAS to design mobile models by taking hardware constraints into account and achieved state-of-the-art accuracy with fewer parameters and less computational cost, measured in multiply-accumulates (MACs). To find highly compact neural architectures, existing works rely on predefined cells and directly apply a width multiplier, which may limit model flexibility, reduce the useful feature map information, and cause an accuracy drop. To address this issue, we propose GRAM (GRAph propagation as Meta-knowledge), which adopts a fine-grained (node-wise) search method and accumulates the knowledge learned in updates into a meta-graph. As a result, GRAM enables a more flexible search space and achieves higher search efficiency. Without the constraints of predefined cells or blocks, we propose a new structure-level pruning method to remove redundant operations in neural architectures. SwiftNet, a set of models discovered by GRAM, outperforms MobileNet-V2 with 2.15x higher accuracy density and 2.42x faster inference at similar accuracy. Compared with FBNet, SwiftNet reduces the search cost by 26x and achieves 2.35x higher accuracy density and a 1.47x speedup while preserving similar accuracy. SwiftNet can obtain 63.28% top-1 accuracy on ImageNet-1K with only 53M MACs and 2.07M parameters. The corresponding inference latency is only 19.09 ms on Google Pixel 1.

Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
