Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

MWQ: Multiscale Wavelet Quantized Neural Networks

233 0 0.0 ( 0 )

Download Cite

Added by Qigong Sun

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Qigong Sun - Yan Ren - Licheng Jiao

Computer Vision and Pattern Recognition Hardware Architecture

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Model quantization can reduce the model size and computational latency, it has become an essential technique for the deployment of deep neural networks on resourceconstrained hardware (e.g., mobile phones and embedded devices). The existing quantization methods mainly consider the numerical elements of the weights and activation values, ignoring the relationship between elements. The decline of representation ability and information loss usually lead to the performance degradation. Inspired by the characteristics of images in the frequency domain, we propose a novel multiscale wavelet quantization (MWQ) method. This method decomposes original data into multiscale frequency components by wavelet transform, and then quantizes the components of different scales, respectively. It exploits the multiscale frequency and spatial information to alleviate the information loss caused by quantization in the spatial domain. Because of the flexibility of MWQ, we demonstrate three applications (e.g., model compression, quantized network optimization, and information enhancement) on the ImageNet and COCO datasets. Experimental results show that our method has stronger representation ability and can play an effective role in quantized neural networks.

rate research

Wavelet based edge feature enhancement for convolutional neural networks

55 - D. D. N. De Silva , S. Fernando , I. T. S. Piyatilake 2018

Convolutional neural networks are able to perform a hierarchical learning process starting with local features. However, a limited attention is paid to enhancing such elementary level features like edges. We propose and evaluate two wavelet-based edge feature enhancement methods to preprocess the input images to convolutional neural networks. The first method develops feature enhanced representations by decomposing the input images using wavelet transform and limited reconstructing subsequently. The second method develops such feature enhanced inputs to the network using local modulus maxima of wavelet coefficients. For each method, we have developed a new preprocessing layer by implementing each purposed method and have appended to the network architecture. Our empirical evaluations demonstrate that the proposed methods are outperforming the baselines and previously published work with significant accuracy gains.

Computer Vision and Pattern Recognition

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

237 - Aojun Zhou , Yukun Ma , Junnan Zhu 2021

Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate the models on resource-constrained environments. It can be generally categorized into unstructured fine-grained sparsity that zeroes out multiple individual weights distributed across the neural network, and structured coarse-grained sparsity which prunes blocks of sub-networks of a neural network. Fine-grained sparsity can achieve a high compression ratio but is not hardware friendly and hence receives limited speed gains. On the other hand, coarse-grained sparsity cannot concurrently achieve both apparent acceleration on modern GPUs and decent performance. In this paper, we are the first to study training from scratch an N:M fine-grained structured sparse network, which can maintain the advantages of both unstructured fine-grained sparsity and structured coarse-grained sparsity simultaneously on specifically designed GPUs. Specifically, a 2:4 sparse network could achieve 2x speed-up without performance drop on Nvidia A100 GPUs. Furthermore, we propose a novel and effective ingredient, sparse-refined straight-through estimator (SR-STE), to alleviate the negative influence of the approximated gradients computed by vanilla STE during optimization. We also define a metric, Sparse Architecture Divergence (SAD), to measure the sparse networks topology change during the training process. Finally, We justify SR-STEs advantages with SAD and demonstrate the effectiveness of SR-STE by performing comprehensive experiments on various tasks. Source codes and models are available at https://github.com/NM-sparsity/NM-sparsity.

Computer Vision and Pattern Recognition Hardware Architecture

SparseTrain: Exploiting Dataflow Sparsity for Efficient Convolutional Neural Networks Training

98 - Pengcheng Dai , Jianlei Yang , Xucheng Ye 2020

Training Convolutional Neural Networks (CNNs) usually requires a large number of computational resources. In this paper, textit{SparseTrain} is proposed to accelerate CNN training by fully exploiting the sparsity. It mainly involves three levels of innovations: activation gradients pruning algorithm, sparse training dataflow, and accelerator architecture. By applying a stochastic pruning algorithm on each layer, the sparsity of back-propagation gradients can be increased dramatically without degrading training accuracy and convergence rate. Moreover, to utilize both textit{natural sparsity} (resulted from ReLU or Pooling layers) and textit{artificial sparsity} (brought by pruning algorithm), a sparse-aware architecture is proposed for training acceleration. This architecture supports forward and back-propagation of CNN by adopting 1-Dimensional convolution dataflow. We have built %a simple compiler to map CNNs topology onto textit{SparseTrain}, and a cycle-accurate architecture simulator to evaluate the performance and efficiency based on the synthesized design with $14nm$ FinFET technologies. Evaluation results on AlexNet/ResNet show that textit{SparseTrain} could achieve about $2.7 times$ speedup and $2.2 times$ energy efficiency improvement on average compared with the original training process.

Computer Vision and Pattern Recognition Hardware Architecture

Searching for Low-Bit Weights in Quantized Neural Networks

88 - Zhaohui Yang , Yunhe Wang , Kai Han 2020

Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the optimization difficulty of quantized networks. Compared with full-precision parameters (i.e., 32-bit floating numbers), low-bit values are selected from a much smaller set. For example, there are only 16 possibilities in 4-bit space. Thus, we present to regard the discrete weights in an arbitrary quantized neural network as searchable variables, and utilize a differential method to search them accurately. In particular, each weight is represented as a probability distribution over the discrete value set. The probabilities are optimized during training and the values with the highest probability are selected to establish the desired quantized network. Experimental results on benchmarks demonstrate that the proposed method is able to produce quantized neural networks with higher performance over the state-of-the-art methods on both image classification and super-resolution tasks.

Computer Vision and Pattern Recognition

3DQ: Compact Quantized Neural Networks for Volumetric Whole Brain Segmentation

137 - Magdalini Paschali , Stefano Gasperini , Abhijit Guha Roy 2019

Model architectures have been dramatically increasing in size, improving performance at the cost of resource requirements. In this paper we propose 3DQ, a ternary quantization method, applied for the first time to 3D Fully Convolutional Neural Networks (F-CNNs), enabling 16x model compression while maintaining performance on par with full precision models. We extensively evaluate 3DQ on two datasets for the challenging task of whole brain segmentation. Additionally, we showcase our methods ability to generalize on two common 3D architectures, namely 3D U-Net and V-Net. Outperforming a variety of baselines, the proposed method is capable of compressing large 3D models to a few MBytes, alleviating the storage needs in space critical applications.

Computer Vision and Pattern Recognition

comments

Fetching comments

Oran 1 University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

MWQ: Multiscale Wavelet Quantized Neural Networks

Ask ChatGPT about the research

No Arabic abstract

Read More