Compressing Deep ODE-Nets using Basis Function Expansions

Added by N. Benjamin Erichson

Publication date 2021

fields Informatics Engineering Mathematical Statistics

Language English

Authors Alejandro Queiruga - N. Benjamin Erichson - Liam Hodgkinson

Machine Learning

The recently-introduced class of ordinary differential equation networks (ODE-Nets) establishes a fruitful connection between deep learning and dynamical systems. In this work, we reconsider formulations of the weights as continuous-depth functions using linear combinations of basis functions. This perspective allows us to compress the weights through a change of basis, without retraining, while maintaining near state-of-the-art performance. In turn, both inference time and the memory footprint are reduced, enabling quick and rigorous adaptation between computational environments. Furthermore, our framework enables meaningful continuous-in-time batch normalization layers using function projections. The performance of basis function compression is demonstrated by applying continuous-depth models to (a) image classification tasks using convolutional units and (b) sentence-tagging tasks using transformer encoder units.
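The idea of compressing depth-dependent weights through a change of basis can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a single scalar weight, a piecewise-constant basis on the depth interval [0, 1], and an L2 projection onto a coarser piecewise-constant basis, which for evenly divisible sizes reduces to averaging adjacent coefficients. The function names and the sample coefficients are hypothetical.

```python
def eval_piecewise_constant(coeffs, t):
    """Evaluate theta(t) = sum_k c_k * phi_k(t) for t in [0, 1].

    phi_k is the indicator of the k-th of len(coeffs) equal-width
    depth intervals (an assumed basis, chosen for simplicity).
    """
    k = min(int(t * len(coeffs)), len(coeffs) - 1)
    return coeffs[k]


def project_to_coarser_basis(coeffs, new_size):
    """L2-project onto a piecewise-constant basis with fewer pieces.

    Assumes len(coeffs) is a multiple of new_size, so the projection
    is the mean of each group of adjacent coefficients.
    """
    if len(coeffs) % new_size != 0:
        raise ValueError("new_size must divide len(coeffs)")
    group = len(coeffs) // new_size
    return [
        sum(coeffs[i * group:(i + 1) * group]) / group
        for i in range(new_size)
    ]


# Hypothetical coefficients: one scalar weight over 8 depth intervals.
fine = [1.0, 3.0, 2.0, 4.0, 0.0, 2.0, 5.0, 7.0]

# Halve the number of stored coefficients without any retraining.
coarse = project_to_coarser_basis(fine, 4)
```

Both `fine` and `coarse` can be evaluated at any depth `t` with `eval_piecewise_constant`, so the compressed weight function is a drop-in replacement that stores half as many parameters. A real model would apply the same projection to every entry of each weight tensor and would likely use a richer basis.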

EXP: N-body integration using basis function expansions

Michael S. Petersen, Martin D. Weinberg, Neal Katz 2021

We present the N-body simulation techniques in EXP. EXP uses empirically-chosen basis functions to expand the potential field of an ensemble of particles. Unlike other basis function expansions, the derived basis functions are adapted to an input mass distribution, enabling accurate expansion of highly non-spherical objects, such as galactic discs. We measure the force accuracy in three models, one based on a spherical or aspherical halo, one based on an exponential disc, and one based on a bar-based disc model. We find that EXP is as accurate as a direct-summation or tree-based calculation, and in some ways is better, while being considerably less computationally intensive. We discuss optimising the computation of the basis function representation. We also detail numerical improvements for performing orbit integrations, including timesteps.

Astrophysics of Galaxies Instrumentation and Methods for Astrophysics

Nonlinear Matrix Approximation with Radial Basis Function Components

Elizaveta Rebrova, Yu-Hang Tang 2021

We introduce and investigate matrix approximation by decomposition into a sum of radial basis function (RBF) components. An RBF component is a generalization of the outer product between a pair of vectors, where an RBF function replaces the scalar multiplication between individual vector elements. Even though the RBF functions are positive definite, the summation across components is not restricted to convex combinations and allows us to compute the decomposition for any real matrix that is not necessarily symmetric or positive definite. We formulate the problem of seeking such a decomposition as an optimization problem with a nonlinear and non-convex loss function. Several mode

Machine Learning
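The RBF component described in the abstract above can be sketched as follows. This is an illustration only: it assumes a Gaussian kernel with a hypothetical width parameter `gamma`, and the vectors and weights are made up for the example. The abstract does not specify the kernel or how the components are fitted.

```python
import math


def rbf_component(u, v, gamma=1.0):
    """Return the matrix with entries exp(-gamma * (u_i - v_j)**2).

    This mirrors an outer product, with the scalar product u_i * v_j
    replaced by a Gaussian RBF (an assumed kernel choice).
    """
    return [[math.exp(-gamma * (ui - vj) ** 2) for vj in v] for ui in u]


def weighted_sum(components, weights):
    """Sum equally shaped component matrices with arbitrary real weights.

    Weights may be negative, so the result need not be a convex
    combination of the components.
    """
    rows, cols = len(components[0]), len(components[0][0])
    return [
        [sum(w * c[i][j] for w, c in zip(weights, components))
         for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical vectors defining two 2x3 components.
first = rbf_component([0.0, 1.0], [0.0, 1.0, 2.0])
second = rbf_component([1.0, 0.0], [2.0, 1.0, 0.0])

# A non-convex combination: one positive and one negative weight.
approx = weighted_sum([first, second], [2.0, -1.0])
```

Each component is entrywise positive, yet the weighted sum can take either sign, which is why a decomposition of this form can target matrices that are not symmetric or positive definite.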

Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification

Francisco Utrera, Evan Kravitz, N. Benjamin Erichson 2020

Transfer learning has emerged as a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains. This process consists of taking a neural network pre-trained on a large feature-rich source dataset, freezing the early layers that encode essential generic image properties, and then fine-tuning the last few layers in order to capture specific information related to the target situation. This approach is particularly useful when only limited or weakly labeled data are available for the new task. In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models, especially if only limited data are available for the new domain task. Further, we observe that adversarial training biases the learnt representations toward retaining shapes, as opposed to textures, which impacts the transferability of the source models. Finally, through the lens of influence functions, we discover that transferred adversarially-trained models contain more human-identifiable semantic information, which explains -- at least partly -- why adversarially-trained models transfer better.

Machine Learning

A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case

Greg Ongie, Rebecca Willett, Daniel Soudry 2019

A key element of understanding the efficacy of overparameterized neural networks is characterizing how they represent functions as the number of weights in the network approaches infinity. In this paper, we characterize the norm required to realize a function $f:\mathbb{R}^d\rightarrow\mathbb{R}$ as a single hidden-layer ReLU network with an unbounded number of units (infinite width), but where the Euclidean norm of the weights is bounded, including precisely characterizing which functions can be realized with finite norm. This was settled for univariate functions in Savarese et al. (2019), where it was shown that the required norm is determined by the L1-norm of the second derivative of the function. We extend the characterization to multivariate functions (i.e., networks with d input units), relating the required norm to the L1-norm of the Radon transform of a (d+1)/2-power Laplacian of the function. This characterization allows us to show that all functions in Sobolev spaces $W^{s,1}(\mathbb{R})$, $s\geq d+1$, can be represented with bounded norm, to calculate the required norm for several specific functions, and to obtain a depth separation result. These results have important implications for understanding generalization performance and the distinction between neural networks and more traditional kernel learning.

Machine Learning

A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition

Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani 2016

We study large-scale kernel methods for acoustic modeling and compare them to DNNs on performance metrics related to both acoustic modeling and recognition. Measuring perplexity and frame-level classification accuracy, kernel-based acoustic models are as effective as their DNN counterparts. However, on token error rates, DNN models can be significantly better. We have discovered that this might be attributed to DNNs' unique strength in reducing both the perplexity and the entropy of the predicted posterior probabilities. Motivated by our findings, we propose a new technique, entropy regularized perplexity, for model selection. This technique can noticeably improve the recognition performance of both types of models, and reduces the gap between them. While effective on Broadcast News, this technique could also be applicable to other tasks.

Machine Learning
