ﻻ يوجد ملخص باللغة العربية
In this work, we propose an effective scheme (called DP-Net) for compressing the deep neural networks (DNNs). It includes a novel dynamic programming (DP) based algorithm to obtain the optimal solution of weight quantization and an optimization process to train a clustering-friendly DNN. Experiments showed that the DP-Net allows larger compression than the state-of-the-art counterparts while preserving accuracy. The largest 77X compression ratio on Wide ResNet is achieved by combining DP-Net with other compression techniques. Furthermore, the DP-Net is extended for compressing a robust DNN model with negligible accuracy loss. At last, a custom accelerator is designed on FPGA to speed up the inference computation with DP-Net.
Routing problems are a class of combinatorial problems with many practical applications. Recently, end-to-end deep learning methods have been proposed to learn approximate solution heuristics for such problems. In contrast, classical dynamic programm
Wavelets are well known for data compression, yet have rarely been applied to the compression of neural networks. This paper shows how the fast wavelet transform can be used to compress linear layers in neural networks. Linear layers still occupy a s
One of the biggest issues in deep learning theory is the generalization ability of networks with huge model size. The classical learning theory suggests that overparameterized models cause overfitting. However, practically used large deep models avoi
Many DNN-enabled vision applications constantly operate under severe energy constraints such as unmanned aerial vehicles, Augmented Reality headsets, and smartphones. Designing DNNs that can meet a stringent energy budget is becoming increasingly imp
Deep neural networks (DNNs) have become the state-of-the-art technique for machine learning tasks in various applications. However, due to their size and the computational complexity, large DNNs are not readily deployable on edge devices in real-time