Previous work has shown that networks of linear neurons combined with nonlinear activation functions (e.g., ReLU) can approximate nonlinear functions. However, simply widening or deepening the network structure introduces training problems. In this work, we aim to build a comprehensive second-order CNN implementation framework that covers neuron and network design as well as system deployment optimization.
This paper presents GPU performance optimization and scaling results for inference models of the Sparse Deep Neural Network Challenge 2020. Demands for network quality have increased rapidly, pushing the size and thus the memory requirements of many neural networks beyond the capacity of available accelerators. Sparse deep neural networks (SpDNN) have shown promise for reining in the memory footprint of large neural networks. However, there is room for improvement in implementing SpDNN operations on GPUs. This work presents optimized sparse matrix multiplication kernels fused with the ReLU function. The optimized kernels reuse input feature maps from shared memory and sparse weights from registers. For multi-GPU parallelism, our SpDNN implementation duplicates the weights and statically partitions the feature maps across GPUs. Results for the challenge benchmarks show that the proposed kernel design and multi-GPU parallelization achieve up to 180 tera-edges per second inference throughput. These results are up to 4.3$\times$ faster for a single GPU and an order of magnitude faster at full scale than those of the champion of the 2019 Sparse Deep Neural Network Graph Challenge for the same generation of NVIDIA V100 GPUs. Using the same implementation, we also show that single-GPU throughput on the NVIDIA A100 is 2.37$\times$ faster than on the V100.
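The fused kernel and the partitioning scheme described above can be illustrated with a minimal CPU sketch in pure Python. It is not the authors' GPU kernel: it assumes the sparse weights are stored in CSR form and that the feature maps are split along the batch (column) dimension, and the names `spmm_relu`, `partition_columns`, and `multi_device_inference` are hypothetical.

```python
def spmm_relu(indptr, indices, data, features):
    """Compute ReLU(W @ X) for CSR weights W and dense features X (list of rows).

    ReLU is applied as soon as each output row is accumulated, so the
    un-activated product is never stored (the "fused" part of the sketch).
    """
    batch = len(features[0]) if features else 0
    out = []
    for row in range(len(indptr) - 1):
        acc = [0.0] * batch
        for k in range(indptr[row], indptr[row + 1]):
            w = data[k]                    # one nonzero weight
            x_row = features[indices[k]]   # matching input feature row
            for j in range(batch):
                acc[j] += w * x_row[j]
        out.append([v if v > 0.0 else 0.0 for v in acc])
    return out


def partition_columns(features, n_parts):
    """Statically split the feature maps into n_parts contiguous column blocks."""
    batch = len(features[0])
    bounds = [batch * p // n_parts for p in range(n_parts + 1)]
    return [[row[bounds[p]:bounds[p + 1]] for row in features]
            for p in range(n_parts)]


def multi_device_inference(indptr, indices, data, features, n_parts):
    """Assumed scheme: every partition uses the same (duplicated) weights."""
    parts = partition_columns(features, n_parts)
    results = [spmm_relu(indptr, indices, data, part) for part in parts]
    n_out = len(indptr) - 1
    # Stitch the per-partition column blocks back together row by row.
    return [[v for res in results for v in res[r]] for r in range(n_out)]


# W = [[1, 0, -2], [0, 3, 0]] in CSR form, applied to a 3 x 4 feature map.
indptr, indices, data = [0, 2, 3], [0, 2, 1], [1.0, -2.0, 3.0]
features = [[1.0, 2.0, 3.0, 4.0], [0.0, -1.0, 1.0, 2.0], [1.0, 0.0, 2.0, 1.0]]
single = spmm_relu(indptr, indices, data, features)
split = multi_device_inference(indptr, indices, data, features, n_parts=2)
```

Because the partitions share no data apart from the read-only weights, each one can be processed independently, which is why a static split requires no communication between devices during a layer.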
Recent work in neural network verification shows that cheap incomplete verifiers such as CROWN, which are based on bound propagation, can be used effectively in Branch-and-Bound (BaB) methods to accelerate complete verification, achieving significant speedups over expensive linear programming (LP) based techniques. However, they cannot fully handle the per-neuron split constraints introduced by BaB as LP verifiers do, which leads to looser bounds and hurts verification efficiency. In this work, we develop $\beta$-CROWN, a new bound propagation based method that can fully encode per-neuron splits via optimizable parameters $\beta$. When the optimizable parameters are jointly optimized in intermediate layers, $\beta$-CROWN has the potential to produce better bounds than typical LP verifiers with neuron split constraints, while being efficiently parallelizable on GPUs. Applied to the complete verification setting, $\beta$-CROWN is close to three orders of magnitude faster than LP-based BaB methods for robustness verification, and more than twice as fast as state-of-the-art GPU-based complete verifiers with similar timeout rates. By terminating BaB early, our method can also be used for incomplete verification. Compared to the state-of-the-art semidefinite-programming (SDP) based verifier, we show a substantial leap forward by greatly reducing the gap between verified accuracy and empirical adversarial attack accuracy, from 35% (SDP) to 12% on an adversarially trained MNIST network ($\epsilon=0.3$), while being 47 times faster. Our code is available at https://github.com/KaidiXu/Beta-CROWN
This paper presents the design and implementation of a signaling splitting scheme for hyper-cellular networks on a software defined radio platform. The hyper-cellular network is a novel architecture for future mobile communication systems in which signaling and data are decoupled at the air interface to mitigate signaling overhead and allow energy-efficient operation of base stations. On an open source software defined radio platform, OpenBTS, we investigate the feasibility of signaling splitting for the GSM protocol and implement a novel system that demonstrates the proposed concept. Standard GSM handsets can camp on the network with the help of a signaling base station, and a data base station is assigned to handle phone calls on demand. Our work initiates a systematic approach to studying the hyper-cellular concept in a real wireless environment with both software and hardware implementations.
Network Function Virtualization (NFV) and Service Function Chaining (SFC) have been widely used to enable flexible and agile network management. To enhance reliability, prior research has proposed deploying backup function instances for prompt recovery when a primary instance fails. While most recent studies focus on speeding up recovery, less attention has been paid to minimizing the state update cost. In this work, we present PiggyBackup (Piggyback-based Backup), an efficient backup instance deployment and update protocol. Our key idea is to reuse the existing service chains that traverse servers in a network to piggyback the update information. By doing this, we eliminate the header overhead and significantly reduce the amount of update traffic. To realize such piggyback-based updates more efficiently, we investigate the backup instance deployment and chain selection problems to increase piggybacking opportunities and reduce forwarding hop counts, with explicit consideration of the distribution of service chains. Our simulation results show that, in a fat-tree topology, PiggyBackup reduces the average overall update overhead by 47.65% and 39.56% compared to random deployment and shortest-path-based deployment, respectively.
Sensors used in applications such as agriculture and weather monitoring, which measure physical parameters like soil moisture, temperature, and humidity, must sustain their battery power over long intervals of time. To accomplish this, the parameters that help reduce battery power consumption need to be addressed. One of the factors affecting energy consumption is transmit and receive power. This energy consumption can be reduced by avoiding unnecessary transmission and reception. Efficient routing techniques, combined with aggregation whenever possible, can save a considerable amount of energy. Aggregation reduces repeated transmission of related values and also reduces much of the computation at the base station. In this paper, the benefits of aggregation over direct transmission in terms of energy saved are discussed. Routing techniques that assist aggregation are incorporated. Aspects such as transmission of the average value of sensed data around an area of the network, transmission of the minimum value in the whole network, and triggering of an event when the battery is low are also incorporated.
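The aggregation idea above can be sketched in a few lines of Python. This is an illustrative model, not the paper's protocol: it assumes a routing tree rooted at a node next to the base station, uses the number of packet transmissions as a stand-in for energy, and compares aggregation against hop-by-hop relaying of every raw reading. The tree, readings, battery levels, and function names are all hypothetical.

```python
def aggregate(tree, readings, node):
    """Return (sum, count, minimum, packets_sent) for the subtree rooted at node.

    Each child sends exactly one summary packet to its parent, regardless of
    how many sensors lie below it.
    """
    total, count, minimum, packets = readings[node], 1, readings[node], 0
    for child in tree.get(node, []):
        c_sum, c_count, c_min, c_packets = aggregate(tree, readings, child)
        total += c_sum
        count += c_count
        minimum = min(minimum, c_min)
        packets += c_packets + 1  # one summary packet from child to this node
    return total, count, minimum, packets


def relayed_packets(tree, node, depth=0):
    """Packets sent when every raw reading is relayed hop by hop to the root."""
    return depth + sum(relayed_packets(tree, child, depth + 1)
                       for child in tree.get(node, []))


def low_battery_events(battery, threshold):
    """Nodes whose battery level has dropped below the threshold."""
    return sorted(node for node, level in battery.items() if level < threshold)


# Hypothetical five-node routing tree: node 0 is adjacent to the base station.
tree = {0: [1, 2], 1: [3, 4]}
readings = {0: 20.0, 1: 22.0, 2: 18.0, 3: 25.0, 4: 15.0}
battery = {0: 0.9, 1: 0.15, 2: 0.6, 3: 0.05, 4: 0.4}

total, count, minimum, with_aggregation = aggregate(tree, readings, 0)
average = total / count                      # area average from one summary
without_aggregation = relayed_packets(tree, 0)
alerts = low_battery_events(battery, threshold=0.2)
```

In this toy tree, aggregation needs one packet per non-root node, while relaying raw readings costs one packet per hop for every reading, so the gap widens as the tree gets deeper.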