We propose GrateTile, an efficient, hardware-friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized subtensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompress subtensors on the fly in a tiled processing manner. GrateTile is suitable for architectures that favor aligned, coalesced data access, and it requires only minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average of 55% DRAM bandwidth reduction while using only 0.6% of the feature map size for indexing storage.
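To make the storage scheme concrete, the sketch below compresses a feature-map tile into unevenly sized, independently addressable subtensors. The 5/3 split, the per-subtensor bitmap, and the function names are illustrative assumptions, not the paper's exact format.

    import numpy as np

    def grate_compress(fmap, split=5):
        """Compress a 2-D feature-map tile into uneven subtensors.

        Each subtensor is stored as a zero/nonzero bitmap plus its packed
        nonzero values; the index records where each payload starts so a
        subtensor can be fetched and decompressed on its own.
        """
        h, w = fmap.shape
        row_cuts = [(0, split), (split, h)]   # uneven split, e.g. 5 + 3
        col_cuts = [(0, split), (split, w)]
        payload, index = [], []
        for r0, r1 in row_cuts:
            for c0, c1 in col_cuts:
                sub = fmap[r0:r1, c0:c1]
                mask = sub != 0
                index.append((len(payload), np.packbits(mask)))
                payload.extend(sub[mask].tolist())   # only nonzeros stored
        return np.asarray(payload, dtype=fmap.dtype), index

Because each index entry holds a payload offset and a bitmap, an accelerator can fetch and decompress any single subtensor without touching its neighbors, which is what permits random access within a compressed feature map.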
Convolutional neural network (CNN) inference on mobile devices demands efficient hardware acceleration of low-precision (INT8) general matrix multiplication (GEMM). Exploiting data sparsity is a common approach to further accelerate GEMM for CNN inference.
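As a software analogue of that sparsity exploitation, a zero-skipping GEMM inner loop might look like the sketch below; the function name is hypothetical, and real accelerators skip compressed blocks in hardware rather than individual scalars.

    import numpy as np

    def zero_skipping_gemm(A, B):
        """C = A @ B with INT8 inputs and INT32 accumulation.

        Multiplications where A[i, k] == 0 contribute nothing, so they
        are skipped entirely; the more zeros, the larger the saving.
        """
        M, K = A.shape
        _, N = B.shape
        C = np.zeros((M, N), dtype=np.int32)
        for i in range(M):
            for k in np.nonzero(A[i])[0]:    # visit only nonzero activations
                C[i] += np.int32(A[i, k]) * B[k].astype(np.int32)
        return C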
Sparse neural networks can greatly facilitate the deployment of neural networks on resource-constrained platforms, as they offer compact model sizes while retaining inference accuracy. Because of the sparsity in their parameter matrices, sparse neural networks …
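One common way to exploit that parameter sparsity is a compressed format such as CSR, in which only nonzero weights are stored and multiplied. The following minimal sparse matrix-vector product illustrates the saving; CSR is an assumed, standard choice, not necessarily the format used here.

    import numpy as np

    def csr_matvec(data, indices, indptr, x):
        """y = A @ x with A in CSR form (data, indices, indptr).

        Storage and work both scale with the number of nonzeros, which
        is why sparse models suit resource-constrained platforms.
        """
        y = np.zeros(len(indptr) - 1, dtype=data.dtype)
        for row in range(len(y)):
            for p in range(indptr[row], indptr[row + 1]):
                y[row] += data[p] * x[indices[p]]
        return y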
Training deep learning networks involves continuous weight updates across the various layers of the network using the backpropagation (BP) algorithm. This results in expensive computational overhead during training. Consequently, most deep learning …
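For reference, one layer's BP step is sketched below, assuming a linear layer y = W x and plain SGD; it shows the per-layer gradient and update work that accumulates across all layers at every training iteration.

    import numpy as np

    def backprop_layer(x, w, grad_out, lr=0.01):
        """Backward pass and weight update for y = W x (illustrative).

        Computing grad_w costs a full outer product per layer per step,
        which is where much of the training overhead comes from.
        """
        grad_w = np.outer(grad_out, x)   # dL/dW
        grad_x = w.T @ grad_out          # gradient for the previous layer
        w -= lr * grad_w                 # in-place SGD update
        return grad_x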
Tensor completion refers to the task of estimating missing data from an incomplete measurement or observation, a core problem that frequently arises in big data analysis, computer vision, and network engineering. …
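To make the problem statement concrete, here is a minimal completion sketch on a matrix (the two-way special case of a tensor), assuming a low-rank model fit by gradient descent on the observed entries only; the rank, step size, and names are illustrative.

    import numpy as np

    def complete_lowrank(M, mask, rank=2, lr=0.01, steps=500):
        """Estimate missing entries of M (where mask is False) as U @ V.T.

        Minimizes ||mask * (U V^T - M)||_F^2; unobserved entries never
        enter the loss and are filled in by the low-rank structure.
        Missing entries of M may hold any finite placeholder value.
        """
        m, n = M.shape
        rng = np.random.default_rng(0)
        U = 0.1 * rng.standard_normal((m, rank))
        V = 0.1 * rng.standard_normal((n, rank))
        for _ in range(steps):
            R = (U @ V.T - M) * mask     # residual on observed entries
            gU, gV = R @ V, R.T @ U      # gradients of the masked loss
            U -= lr * gU
            V -= lr * gV
        return U @ V.T                   # completed estimate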
Accurate capacitance extraction is becoming more important for designing integrated circuits under advanced process technology. The pattern-matching-based full-chip extraction methodology delivers fast computational speed, but suffers from large errors …