Do you want to publish a course? Click here

Analysis and Design of Convolutional Networks via Hierarchical Tensor Decompositions

111   0   0.0 ( 0 )
 Added by Nadav Cohen
 Publication date 2017
and research's language is English




Ask ChatGPT about the research

The driving force behind convolutional networks - the most successful deep learning architecture to date, is their expressive power. Despite its wide acceptance and vast empirical evidence, formal analyses supporting this belief are scarce. The primary notions for formally reasoning about expressiveness are efficiency and inductive bias. Expressive efficiency refers to the ability of a network architecture to realize functions that require an alternative architecture to be much larger. Inductive bias refers to the prioritization of some functions over others given prior knowledge regarding a task at hand. In this paper we overview a series of works written by the authors, that through an equivalence to hierarchical tensor decompositions, analyze the expressive efficiency and inductive bias of various convolutional network architectural features (depth, width, strides and more). The results presented shed light on the demonstrated effectiveness of convolutional networks, and in addition, provide new tools for network design.



rate research

Read More

The driving force behind deep networks is their ability to compactly represent rich classes of functions. The primary notion for formally reasoning about this phenomenon is expressive efficiency, which refers to a situation where one network must grow unfeasibly large in order to realize (or approximate) functions of another. To date, expressive efficiency analyses focused on the architectural feature of depth, showing that deep networks are representationally superior to shallow ones. In this paper we study the expressive efficiency brought forth by connectivity, motivated by the observation that modern networks interconnect their layers in elaborate ways. We focus on dilated convolutional networks, a family of deep models delivering state of the art performance in sequence processing tasks. By introducing and analyzing the concept of mixed tensor decompositions, we prove that interconnecting dilated convolutional networks can lead to expressive efficiency. In particular, we show that even a single connection between intermediate layers can already lead to an almost quadratic gap, which in large-scale settings typically makes the difference between a model that is practical and one that is not. Empirical evaluation demonstrates how the expressive efficiency of connectivity, similarly to that of depth, translates into gains in accuracy. This leads us to believe that expressive efficiency may serve a key role in the development of new tools for deep network design.
In this work, a dense recurrent convolutional neural network (DRCNN) was constructed to detect sleep disorders including arousal, apnea and hypopnea using Polysomnography (PSG) measurement channels provided in the 2018 Physionet challenge database. Our model structure is composed of multiple dense convolutional units (DCU) followed by a bidirectional long-short term memory (LSTM) layer followed by a softmax output layer. The sleep events including sleep stages, arousal regions and multiple types of apnea and hypopnea are manually annotated by experts which enables us to train our proposed network using a multi-task learning mechanism. Three binary cross-entropy loss functions corresponding to sleep/wake, target arousal and apnea-hypopnea/normal detection tasks are summed up to generate our overall network loss function that is optimized using the Adam method. Our model performance was evaluated using two metrics: the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC). To measure our model generalization, 4-fold cross-validation was also performed. For training, our model was applied to full night recording data. Finally, the average AUPRC and AUROC values associated with the arousal detection task were 0.505 and 0.922, respectively on our testing dataset. An ensemble of four models trained on different data folds improved the AUPRC and AUROC to 0.543 and 0.931, respectively. Our proposed algorithm achieved the first place in the official stage of the 2018 Physionet challenge for detecting sleep arousals with AUPRC of 0.54 on the blind testing dataset.
67 - Jinmian Ye , Guangxi Li , Di Chen 2020
Deep neural networks (DNNs) have achieved outstanding performance in a wide range of applications, e.g., image classification, natural language processing, etc. Despite the good performance, the huge number of parameters in DNNs brings challenges to efficient training of DNNs and also their deployment in low-end devices with limited computing resources. In this paper, we explore the correlations in the weight matrices, and approximate the weight matrices with the low-rank block-term tensors. We name the new corresponding structure as block-term tensor layers (BT-layers), which can be easily adapted to neural network models, such as CNNs and RNNs. In particular, the inputs and the outputs in BT-layers are reshaped into low-dimensional high-order tensors with a similar or improved representation power. Sufficient experiments have demonstrated that BT-layers in CNNs and RNNs can achieve a very large compression ratio on the number of parameters while preserving or improving the representation power of the original DNNs.
The conventional classification schemes -- notably multinomial logistic regression -- used in conjunction with convolutional networks (convnets) are classical in statistics, designed without consideration for the usual coupling with convnets, stochastic gradient descent, and backpropagation. In the specific application to supervised learning for convnets, a simple scale-invariant classification stage turns out to be more robust than multinomial logistic regression, appears to result in slightly lower errors on several standard test sets, has similar computational costs, and features precise control over the actual rate of learning. Scale-invariant means that multiplying the input values by any nonzero scalar leaves the output unchanged.
We consider the problem of recovering a low-rank tensor from its noisy observation. Previous work has shown a recovery guarantee with signal to noise ratio $O(n^{lceil K/2 rceil /2})$ for recovering a $K$th order rank one tensor of size $ntimes cdots times n$ by recursive unfolding. In this paper, we first improve this bound to $O(n^{K/4})$ by a much simpler approach, but with a more careful analysis. Then we propose a new norm called the subspace norm, which is based on the Kronecker products of factors obtained by the proposed simple estimator. The imposed Kronecker structure allows us to show a nearly ideal $O(sqrt{n}+sqrt{H^{K-1}})$ bound, in which the parameter $H$ controls the blend from the non-convex estimator to mode-wise nuclear norm minimization. Furthermore, we empirically demonstrate that the subspace norm achieves the nearly ideal denoising performance even with $H=O(1)$.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا