
MVC-Net: A Convolutional Neural Network Architecture for Manifold-Valued Images With Applications

Published by Jose Bouza
Publication date: 2020
Research field: Informatics Engineering
Research language: English





Geometric deep learning has attracted significant attention in recent years, in part due to the availability of exotic data types for which traditional neural network architectures are not well suited. Our goal in this paper is to generalize convolutional neural networks (CNNs) to the manifold-valued image case, which arises commonly in medical imaging and computer vision applications. Explicitly, the input data to the network is an image where each pixel value is a sample from a Riemannian manifold. To achieve this goal, we must generalize the basic building block of traditional CNN architectures, namely, the weighted combination operation. To this end, we develop a tangent space combination operation, which is used to define a convolution operation on manifold-valued images that we call the Manifold-Valued Convolution (MVC). We prove theoretical properties of the MVC operation, including equivariance to the action of the isometry group admitted by the manifold and a characterization of when compositions of MVC layers collapse to a single layer. We present a detailed description of how to use MVC layers to build full, multi-layer neural networks that operate on manifold-valued images, which we call MVC-nets. Further, we empirically demonstrate superior performance of MVC-nets in medical imaging and computer vision tasks.
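The tangent space combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition, so the following is only one plausible reading under stated assumptions: the manifold is taken to be the unit sphere, the base point is supplied by the caller, and the function names `sphere_log`, `sphere_exp`, and `tangent_combination` are hypothetical, not taken from the paper.

```python
import numpy as np

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q.
    Returns zero when q equals p (and, as a fallback, when q is antipodal
    to p, where the log map is not uniquely defined)."""
    cos_t = np.clip(np.dot(p, q), -1.0, 1.0)
    v = q - cos_t * p                      # component of q orthogonal to p
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_t) * v / norm_v

def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p along v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def tangent_combination(points, weights, base):
    """Assumed form of a tangent space combination: map each point to the
    tangent space at `base`, take the weighted sum there, and map back."""
    tangent = sum(w * sphere_log(base, x) for w, x in zip(weights, points))
    return sphere_exp(base, tangent)

# Usage: combine three sphere-valued "pixels" with example weights.
points = np.eye(3)                         # e1, e2, e3 lie on the unit sphere
weights = [0.2, 0.3, 0.5]
out = tangent_combination(points, weights, base=points[0])
```

Because the log and exp maps commute with rotations, rotating every input and the base point rotates the output by the same rotation, which is the kind of isometry equivariance the abstract refers to.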


Read also

Deep neural networks have become the main workhorse for many tasks involving learning from data in a variety of applications in Science and Engineering. Traditionally, the inputs to these networks lie in a vector space and the operations employed within the network are well defined on vector spaces. In the recent past, due to technological advances in sensing, it has become possible to acquire manifold-valued data sets either directly or indirectly. Examples include but are not limited to data from omnidirectional cameras on automobiles, drones etc., synthetic aperture radar imaging, diffusion magnetic resonance imaging, elastography and conductance imaging in the Medical Imaging domain and others. Thus, there is a need to generalize deep neural networks to cope with input data that reside on curved manifolds where vector space operations are not naturally admissible. In this paper, we present a novel theoretical framework to generalize the widely popular convolutional neural networks (CNNs) to high dimensional manifold-valued data inputs. We call these networks ManifoldNets. In ManifoldNets, the convolution operation on data residing on Riemannian manifolds is achieved via a provably convergent recursive computation of the weighted Fréchet Mean (wFM) of the given data, where the weights make up the convolution mask to be learned. Further, we prove that the proposed wFM layer achieves a contraction mapping and hence ManifoldNet does not need the non-linear ReLU unit used in standard CNNs. We present experiments, using the ManifoldNet framework, to achieve dimensionality reduction by computing the principal linear subspaces that naturally reside on a Grassmannian. The experimental results demonstrate the efficacy of ManifoldNets in the context of classification and reconstruction accuracy.
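A recursive weighted Fréchet mean of the kind described above can be sketched as follows. This is an illustrative reading, not the ManifoldNet implementation: it assumes the unit sphere as the manifold and an incremental update that moves the running mean toward each new point along the geodesic by the fraction of total weight that point carries. The names `geodesic` and `recursive_wfm` are hypothetical.

```python
import numpy as np

def geodesic(p, q, t):
    """Point at fraction t along the great-circle arc from p to q.
    Falls back to p when the arc is degenerate (p == q) or not unique
    (p and q antipodal)."""
    theta = np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))
    sin_theta = np.sin(theta)
    if sin_theta < 1e-12:
        return p.copy()
    return (np.sin((1.0 - t) * theta) * p + np.sin(t * theta) * q) / sin_theta

def recursive_wfm(points, weights):
    """Assumed recursive estimator: start at the first point, then step
    toward each subsequent point by w_n / (w_1 + ... + w_n)."""
    mean = np.asarray(points[0], dtype=float)
    total = float(weights[0])
    for x, w in zip(points[1:], weights[1:]):
        total += w
        mean = geodesic(mean, np.asarray(x, dtype=float), w / total)
    return mean

# Usage: the equally weighted mean of e1 and e2 is the arc midpoint.
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
mid = recursive_wfm([e1, e2], [1.0, 1.0])
```

In a network, the weights would be the learned convolution mask; here they are fixed constants for illustration only.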
Recent studies have used deep residual convolutional neural networks (CNNs) for JPEG compression artifact reduction. This study proposes a scalable CNN called S-Net. Our approach effectively adjusts the network scale dynamically in a multitask system for real-time operation with little performance loss. It offers a simple and direct technique to evaluate the performance gains obtained with increasing network depth, and it is helpful for removing redundant network layers to maximize the network efficiency. We implement our architecture using the Keras framework with the TensorFlow backend on an NVIDIA K80 GPU server. We train our models on the DIV2K dataset and evaluate their performance on public benchmark datasets. To validate the generality and universality of the proposed method, we created and utilized a new dataset, called WIN143, for evaluating over-processed images. Experimental results indicate that our proposed approach outperforms other CNN-based methods and achieves state-of-the-art performance.
The novel Coronavirus Disease 2019 (COVID-19) is a global pandemic disease spreading rapidly around the world. A robust and automatic early recognition of COVID-19, via auxiliary computer-aided diagnostic tools, is essential for disease cure and control. Chest radiography images, such as Computed Tomography (CT) and X-ray, together with deep Convolutional Neural Networks (CNNs), can be useful material for designing such tools. However, designing such an automated tool is challenging because large manually annotated datasets, the core requirement of supervised learning systems, are not yet publicly available. In this article, we propose a robust CNN-based network, called CVR-Net (Coronavirus Recognition Network), for the automatic recognition of the coronavirus from CT or X-ray images. The proposed end-to-end CVR-Net is a multi-scale-multi-encoder ensemble model, where we have aggregated the outputs from two different encoders and their different scales to obtain the final prediction probability. We train and test the proposed CVR-Net on three different datasets, where the images were collected from different open-source repositories. We compare our proposed CVR-Net with state-of-the-art methods, which are trained and tested on the same datasets. We split the three datasets into five different tasks, where each task has a different number of classes, to evaluate the multi-tasking CVR-Net. Our model achieves an overall F1-score & accuracy of 0.997 & 0.998; 0.963 & 0.964; 0.816 & 0.820; 0.961 & 0.961; and 0.780 & 0.780, respectively, for task-1 to task-5. As the CVR-Net provides promising results on the small datasets, it can be an auspicious computer-aided diagnostic tool for the diagnosis of coronavirus to assist clinical practitioners and radiologists. Our source code and model are publicly available at https://github.com/kamruleee51/CVR-Net.
A new convolutional neural network (CNN) architecture for 2D driver/passenger pose estimation and seat belt detection is proposed in this paper. The new architecture is more nimble and thus more suitable for in-vehicle monitoring tasks compared to other generic pose estimation algorithms. The new architecture, named NADS-Net, utilizes the feature pyramid network (FPN) backbone with multiple detection heads to achieve the optimal performance for driver/passenger state detection tasks. The new architecture is validated on a new data set containing video clips of 100 drivers in 50 driving sessions that are collected for this study. The detection performance is analyzed under different demographic, appearance, and illumination conditions. The results presented in this paper may provide meaningful insights for the autonomous driving research community and automotive industry for future algorithm development and data collection.
Deep neural network (DNN) accelerators with improved energy and delay are desirable for meeting the requirements of hardware targeted for IoT and edge computing systems. Convolutional neural networks (CoNNs) are among the most popular types of DNN architectures. This paper presents the design and evaluation of an accelerator for CoNNs. The system-level architecture is based on mixed-signal, cellular neural networks (CeNNs). Specifically, we present (i) the implementation of different layers, including convolution, ReLU, and pooling, in a CoNN using CeNN, (ii) modified CoNN structures with CeNN-friendly layers to reduce computational overheads typically associated with a CoNN, (iii) a mixed-signal CeNN architecture that performs CoNN computations in the analog and mixed signal domain, and (iv) design space exploration that identifies what CeNN-based algorithm and architectural features fare best compared to existing algorithms and architectures when evaluated over common datasets -- MNIST and CIFAR-10. Notably, the proposed approach can lead to 8.7$\times$ improvements in energy-delay product (EDP) per digit classification for the MNIST dataset at iso-accuracy when compared with the state-of-the-art DNN engine, while our approach could offer 4.3$\times$ improvements in EDP when compared to other network implementations for the CIFAR-10 dataset.