
A Tour of Convolutional Networks Guided by Linear Interpreters

Published by Pablo Navarrete Michelini
Publication date: 2019
Research field: Informatics Engineering
Paper language: English





Convolutional networks are large linear systems divided into layers and connected by non-linear units. These units are the articulations that allow the network to adapt to the input. To understand how a network manages to solve a problem we must look at the articulated decisions in their entirety. If we could capture the actions of non-linear units for a particular input, we would be able to replay the whole system back and forth as if it were always linear. It would also reveal the actions of non-linearities because the resulting linear system, a Linear Interpreter, depends on the input image. We introduce a hooking layer, called a LinearScope, which allows us to run the network and the linear interpreter in parallel. Its implementation is simple, flexible and efficient. From here we can make many curious inquiries: what do these linear systems look like? When the rows and columns of the transformation matrix are images, what do they look like? What type of basis do these linear transformations rely on? The answers depend on the problems presented, through which we take a tour of some popular architectures used for classification, super-resolution (SR) and image-to-image translation (I2I). For classification we observe that popular networks use a pixel-wise vote-per-class strategy and heavily rely on bias parameters. For SR and I2I we find that CNNs use wavelet-type bases similar to the human visual system. For I2I we reveal copy-move and template-creation strategies to generate outputs.
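The idea of freezing the non-linear decisions for one input and replaying the network as a linear system can be illustrated with a minimal sketch. This is not the LinearScope layer from the paper: it assumes a toy 1D network (convolution, ReLU, convolution) written in plain Python, and the names `conv1d`, `forward`, `linear_interpreter`, `F` and `r` are hypothetical, chosen only for this illustration. The sketch records the ReLU gates for an input `x`, then probes the gated network with unit impulses to recover a matrix `F` and a residual `r` such that the output equals `F x + r` for that input.

```python
import random

def conv1d(x, kernel, bias):
    """'Same'-size 1D correlation with zero padding: a linear map plus a bias."""
    half = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        s = bias
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:
                s += w * x[idx]
        out.append(s)
    return out

def forward(x, params, masks=None):
    """Run conv -> ReLU -> conv.

    With masks=None the ReLU picks its gates from the data and records them.
    With masks given, the recorded gates are replayed, so the map is affine in x.
    """
    (k1, b1), (k2, b2) = params
    h = conv1d(x, k1, b1)
    if masks is None:
        masks = [1.0 if v > 0 else 0.0 for v in h]
    h = [m * v for m, v in zip(masks, h)]
    return conv1d(h, k2, b2), masks

def linear_interpreter(x, params):
    """Return (F, r) such that the network output at x equals F x + r."""
    n = len(x)
    _, masks = forward(x, params)              # capture the gate decisions for this input
    r, _ = forward([0.0] * n, params, masks)   # residual: zero input under the same gates
    cols = []
    for j in range(n):
        e = [0.0] * n
        e[j] = 1.0                             # unit impulse at position j
        y, _ = forward(e, params, masks)
        cols.append([yi - ri for yi, ri in zip(y, r)])  # column j of F
    F = [[cols[j][i] for j in range(n)] for i in range(n)]
    return F, r

random.seed(0)
# Arbitrary illustrative weights, not taken from any trained model.
params = [([0.5, -1.0, 0.25], 0.1), ([1.0, 2.0, -0.5], -0.2)]
x = [random.uniform(-1, 1) for _ in range(6)]

y, _ = forward(x, params)
F, r = linear_interpreter(x, params)
y_lin = [sum(F[i][j] * x[j] for j in range(6)) + r[i] for i in range(6)]
max_err = max(abs(a - b) for a, b in zip(y, y_lin))
print("max |network - (F x + r)| =", max_err)
```

In this sketch each column of `F` is the gated network's response to an impulse at one input position, and `r` collects the contribution of the bias parameters. Because the gates depend on `x`, a different input generally yields a different `F` and `r`.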




Read also

210 - Orr Shalit 2020
Dilation theory is a paradigm for studying operators by way of exhibiting an operator as a compression of another operator which is in some sense well behaved. For example, every contraction can be dilated to (i.e., is a compression of) a unitary operator, and on this simple fact a penetrating theory of non-normal operators has been developed. In the first part of this survey, I will leisurely review key classical results on dilation theory for a single operator or for several commuting operators, and sample applications of dilation theory in operator theory and in function theory. Then, in the second part, I will give a rapid account of a plethora of variants of dilation theory and their applications. In particular, I will discuss dilation theory of completely positive maps and semigroups, as well as the operator algebraic approach to dilation theory. In the last part, I will present relatively new dilation problems in the noncommutative setting which are related to the study of matrix convex sets and operator systems, and are motivated by applications in control theory. These problems include dilating tuples of noncommuting operators to tuples of commuting normal operators with a specified joint spectrum. I will also describe the recently studied problem of determining the optimal constant $c = c_{\theta,\theta'}$, such that every pair of unitaries $U,V$ satisfying $VU = e^{i\theta} UV$ can be dilated to a pair of $cU', cV'$, where $U',V'$ are unitaries that satisfy the commutation relation $V'U' = e^{i\theta'} U'V'$. The solution of this problem gives rise to a new and surprising application of dilation theory to the continuity of the spectrum of the almost Mathieu operator from mathematical physics.
We introduce an approach to training a given compact network. To this end, we leverage over-parameterization, which typically improves both neural network optimization and generalization. Specifically, we propose to expand each linear layer of the compact network into multiple consecutive linear layers, without adding any nonlinearity. As such, the resulting expanded network, or ExpandNet, can be contracted back to the compact one algebraically at inference. In particular, we introduce two convolutional expansion strategies and demonstrate their benefits on several tasks, including image classification, object detection, and semantic segmentation. As evidenced by our experiments, our approach outperforms both training the compact network from scratch and performing knowledge distillation from a teacher. Furthermore, our linear over-parameterization empirically reduces gradient confusion during training and improves the network generalization.
Pedestrian trajectory prediction is a critical yet challenging task, especially for crowded scenes. We suggest that introducing an attention mechanism to infer the importance of different neighbors is critical for accurate trajectory prediction in scenes with varying crowd size. In this work, we propose a novel method, AVGCN, for trajectory prediction utilizing graph convolutional networks (GCN) based on human attention (A denotes attention, V denotes visual field constraints). First, we train an attention network that estimates the importance of neighboring pedestrians, using gaze data collected as subjects perform a bird's-eye view crowd navigation task. Then, we incorporate the learned attention weights modulated by constraints on the pedestrians' visual field into a trajectory prediction network that uses a GCN to aggregate information from neighbors efficiently. AVGCN also considers the stochastic nature of pedestrian trajectories by taking advantage of variational trajectory prediction. Our approach achieves state-of-the-art performance on several trajectory prediction benchmarks, and the lowest average prediction error over all considered benchmarks.
Image registration, and in particular deformable registration methods, are pillars of medical imaging. Inspired by the recent advances in deep learning, we propose in this paper a novel convolutional neural network architecture that couples linear and deformable registration within a unified architecture endowed with near real-time performance. Our framework is modular with respect to the global transformation component, as well as with respect to the similarity function, while it guarantees smooth displacement fields. We evaluate the performance of our network on the challenging problem of MRI lung registration, and demonstrate superior performance with respect to state-of-the-art elastic registration methods. The proposed deformation (between inspiration and expiration) was considered within a clinically relevant task of interstitial lung disease (ILD) classification and showed promising results.
228 - Chi Li, Yuchen Liu, Chenyang Xu 2018
This is a survey of the recent theory on minimizing the normalized volume function attached to any klt singularity.
