NN2CAM: Automated Neural Network Mapping for Multi-Precision Edge Processing on FPGA-Based Cameras

Added by Petar Jokic

Publication date 2021

Fields Electronic Engineering

Research language English

Authors Petar Jokic - Stephane Emery - Luca Benini

Image and Video Processing

The record-breaking achievements of deep neural networks (DNNs) in image classification and detection tasks have led to a surge of new computer vision applications in recent years. However, their computational complexity restricts their deployment to powerful stationary or complex dedicated processing hardware, limiting their use in smart edge processing applications. We propose an automated deployment framework for DNN acceleration at the edge on field-programmable gate array (FPGA)-based cameras. The framework automatically converts an arbitrary-sized, quantized, trained network into an efficient streaming-processing IP block that is instantiated within a generic adapter block in the FPGA. In contrast to prior work, the accelerator is implemented purely in logic and thus supports end-to-end processing on FPGAs without on-chip microprocessors. Our mapping tool features automatic translation from a trained Caffe network, arbitrary layer-wise fixed-point precision for both weights and activations, an efficient XNOR implementation for fully binary layers, and a balancing mechanism for effective allocation of computational resources in the streaming dataflow. To demonstrate the performance of the system, we use this tool to implement two CNN edge processing networks on an FPGA-based high-speed camera with various precision settings, showing computational throughputs of up to 337 GOPS in low-latency streaming mode (no batching), running entirely on the camera.
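The XNOR implementation for fully binary layers mentioned in the abstract is not detailed here. As a rough illustration only, the following Python sketch shows the commonly used XNOR-popcount formulation of a binary dot product, assuming weights and activations take values in {+1, -1} and are packed one bit per value. The function names `pack_bits` and `xnor_dot` are hypothetical and are not taken from the paper or its tool.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an integer (bit i is 1 for +1, 0 for -1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("values must be +1 or -1")
    return word


def xnor_dot(a_bits, w_bits, n):
    """Dot product of two packed {+1, -1} vectors of length n.

    XNOR marks the positions where both operands agree (product +1).
    With m agreeing positions, the dot product is m - (n - m) = 2*m - n.
    """
    mask = (1 << n) - 1                 # keep only the n valid bit positions
    agree = ~(a_bits ^ w_bits) & mask   # XNOR restricted to n bits
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n


activations = [1, -1, 1, 1]
weights = [1, 1, -1, 1]
result = xnor_dot(pack_bits(activations), pack_bits(weights), len(weights))
reference = sum(a * w for a, w in zip(activations, weights))
print(result, reference)
```

In hardware, the same idea replaces multipliers with XNOR gates and an adder tree for the popcount, which is why fully binary layers are attractive on FPGA logic; how the paper's tool realizes this is not specified in the abstract.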

Attention Based Image Compression Post-Processing Convolutional Neural Network

106 - Yuyang Xue , Jiannan Su 2019

Traditional image compressors, e.g., BPG and H.266, achieve high image and video compression quality. Recently, convolutional neural networks have been widely used in image compression. We propose an attention-based convolutional neural network for low bit-rate compression that post-processes the output of a traditional image compression decoder. In experiments on validation sets, the post-processing module trained with MAE and MS-SSIM losses yields the highest PSNR, 32.10 on average, at a bit-rate of 0.15.

Image and Video Processing Computer Vision and Pattern Recognition Machine Learning

Fixed-Point Convolutional Neural Network for Real-Time Video Processing in FPGA

71 - Roman Solovyev , Alexander Kustov , Dmitry Telpukhov 2018

Modern mobile neural networks with a reduced number of weights and parameters perform well on image classification tasks, but even they may be too complex to implement in an FPGA for video processing tasks. The article proposes a neural network architecture for the practical task of recognizing images from a camera, which has several advantages in terms of speed. This is achieved by reducing the number of weights, moving from floating-point to fixed-point arithmetic, and applying a number of hardware-level optimizations associated with storing weights in blocks, a shift register, and an adjustable number of convolutional blocks that work in parallel. The article also proposes methods for adapting an existing data set to solve a different task. As the experiments show, the proposed neural network copes well with real-time video processing even on cheap FPGAs.
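The move from floating-point to fixed-point arithmetic can be illustrated with a minimal Python sketch. It assumes a signed two's-complement format with a chosen total width and number of fractional bits, and saturates out-of-range values. The helper names `to_fixed` and `from_fixed` and the 8-bit format with 4 fractional bits are illustrative assumptions, not the scheme used in the article.

```python
def to_fixed(x, total_bits, frac_bits):
    """Quantize a float to a signed fixed-point integer code, saturating on overflow.

    Assumed format: two's complement, total_bits wide, frac_bits fractional bits.
    Python's round() rounds exact halves to the nearest even integer.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = round(x * scale)
    return max(lo, min(hi, q))


def from_fixed(q, frac_bits):
    """Convert a fixed-point integer code back to a float."""
    return q / (1 << frac_bits)


weight = 0.3
code = to_fixed(weight, total_bits=8, frac_bits=4)
approx = from_fixed(code, frac_bits=4)
print(code, approx, abs(weight - approx))
```

With 4 fractional bits the representable step is 1/16, so the quantization error for in-range values is at most 1/32; values beyond the representable range are clipped rather than wrapped.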

Computer Vision and Pattern Recognition

Automated Multi-Channel Segmentation for the 4D Myocardial Velocity Mapping Cardiac MR

90 - Yinzhe Wu , Suzan Hatipoglu , Diego Alonso-Alvarez 2020

Four-dimensional (4D) left ventricular myocardial velocity mapping (MVM) is a cardiac magnetic resonance (CMR) technique that allows assessment of cardiac motion in three orthogonal directions. Accurate and reproducible delineation of the myocardium is crucial for accurate analysis of peak systolic and diastolic myocardial velocities. In addition to the conventionally available magnitude CMR data, 4D MVM also acquires three velocity-encoded phase datasets, which are used to generate velocity maps. These can be used to facilitate and improve myocardial delineation. Based on the success of deep learning in medical image processing, we propose a novel automated framework that improves on standard U-Net based methods for these CMR multi-channel data (magnitude and phase) through cross-channel fusion with an attention module and shape-information-based post-processing, achieving accurate delineation of both epicardium and endocardium contours. To evaluate the results, we employ the widely used Dice scores and the quantification of myocardial longitudinal peak velocities. Our proposed network trained with multi-channel data shows enhanced performance compared to standard U-Net based networks trained with single-channel data. Based on the results, our method provides compelling evidence for the design and application of multi-channel image analysis of 4D MVM CMR data.

Image and Video Processing Computer Vision and Pattern Recognition

Binary Complex Neural Network Acceleration on FPGA

113 - Hongwu Peng , Shanglin Zhou , Scott Weitze 2021

Being able to learn from complex data with phase information is imperative for many signal processing applications. Today's real-valued deep neural networks (DNNs) have shown efficiency in latent information analysis but fall short when applied to the complex domain. Deep complex networks (DCN), in contrast, can learn from complex data, but have high computational costs; therefore, they cannot satisfy the instant decision-making requirements of many deployable systems dealing with short observations or short signal bursts. Recently, the Binarized Complex Neural Network (BCNN), which integrates DCNs with binarized neural networks (BNN), has shown great potential in classifying complex data in real time. In this paper, we propose a structural-pruning-based accelerator for BCNN, which is able to provide more than 5000 frames/s inference throughput on edge devices. The high performance comes from both the algorithm and hardware sides. On the algorithm side, we apply structural pruning to the original BCNN models and obtain 20$\times$ pruning rates with negligible accuracy loss; on the hardware side, we propose a novel 2D convolution operation accelerator for the binary complex neural network. Experimental results show that the proposed design works with over 90% utilization and achieves inference throughputs of 5882 frames/s and 4938 frames/s for complex NIN-Net and ResNet-18, respectively, using the CIFAR-10 dataset and an Alveo U280 board.

Machine Learning

Quantitative Susceptibility Mapping using Deep Neural Network: QSMnet

85 - Jaeyeon Yoon , Enhao Gong , Itthi Chatnuntawech 2018

Deep neural networks have demonstrated promising potential for the field of medical image reconstruction. In this work, an MRI reconstruction algorithm, which is referred to as quantitative susceptibility mapping (QSM), has been developed using a deep neural network in order to perform dipole deconvolution, which restores the magnetic susceptibility source from an MRI field map. Previous approaches to QSM require multiple orientation data (e.g. Calculation of Susceptibility through Multiple Orientation Sampling or COSMOS) or regularization terms (e.g. Truncated K-space Division or TKD; Morphology Enabled Dipole Inversion or MEDI) to solve the ill-conditioned deconvolution problem. Unfortunately, they either require long multiple orientation scans or suffer from artifacts. To overcome these shortcomings, a deep neural network, QSMnet, is constructed to generate a high-quality susceptibility map from single orientation data. The network has a modified U-net structure and is trained using gold-standard COSMOS QSM maps. 25 datasets from 5 subjects (5 orientations each) were used for patch-wise training after doubling the data using augmentation. Two additional datasets of 5 orientations each were used for validation and test (one dataset each). The QSMnet maps of the test dataset were compared with those from TKD and MEDI for image quality and consistency across multiple head orientations. Quantitative and qualitative image quality comparisons demonstrate that the QSMnet results have superior image quality to those of TKD or MEDI and comparable image quality to those of COSMOS. Additionally, QSMnet maps reveal substantially better consistency across the multiple orientations than those from TKD or MEDI. As a preliminary application, the network was tested on two patients. The QSMnet maps showed lesion contrasts similar to those from MEDI, demonstrating potential for future applications.

Image and Video Processing