
Nonlinear system identification employing automatic differentiation

Published by: Jan Schumann-Bischoff
Publication date: 2015
Research field: Physics
Language: English





An optimization-based state and parameter estimation method is presented in which the required Jacobian matrix of the cost function is computed via automatic differentiation. Automatic differentiation evaluates the program code of the cost function and provides exact values of the derivatives. In contrast to numerical differentiation, it does not suffer from approximation errors, and compared to symbolic differentiation it is more convenient to use, because no closed analytic expressions are required. Furthermore, we demonstrate how to generalize the parameter estimation scheme to delay differential equations, where estimating the delay time requires attention.
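The contrast between automatic and numerical differentiation drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the decaying-exponential model, the names `Dual`, `dual_exp`, `residuals`, `jacobian_ad`, and `jacobian_fd`, and the parameter values are all assumptions chosen for illustration. The sketch uses forward-mode automatic differentiation via dual numbers to obtain the Jacobian of a residual vector with respect to two parameters, and compares it with a forward finite-difference estimate.

```python
import math


class Dual:
    """A value together with its derivative w.r.t. one seeded input (forward mode)."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(x):
        # Promote plain numbers to constants with zero derivative.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule.
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)

    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.dot)


def dual_exp(x):
    x = Dual.lift(x)
    e = math.exp(x.val)
    return Dual(e, e * x.dot)  # chain rule for exp


def residuals(params, times, data):
    # Hypothetical model x(t) = a * exp(-b * t); residual = model - measurement.
    a, b = params
    return [a * dual_exp(-b * t) - y for t, y in zip(times, data)]


def jacobian_ad(params, times, data):
    # One forward pass per parameter: seed that parameter's derivative with 1.
    cols = []
    for j in range(len(params)):
        seeded = [Dual(p, 1.0 if i == j else 0.0) for i, p in enumerate(params)]
        cols.append([r.dot for r in residuals(seeded, times, data)])
    return [list(row) for row in zip(*cols)]  # rows: residuals, columns: parameters


def jacobian_fd(params, times, data, h=1e-6):
    # Forward finite differences: carries an O(h) truncation error.
    base = [r.val for r in residuals(params, times, data)]
    cols = []
    for j in range(len(params)):
        shifted = list(params)
        shifted[j] += h
        pert = [r.val for r in residuals(shifted, times, data)]
        cols.append([(p - b) / h for p, b in zip(pert, base)])
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0]
    data = [0.0, 0.0, 0.0]
    print(jacobian_ad([2.0, 0.5], times, data))
    print(jacobian_fd([2.0, 0.5], times, data))
```

The automatic-differentiation result agrees with the analytic partial derivatives to rounding error, while the finite-difference result depends on the step size `h`. A Jacobian of this kind could then be handed to a least-squares optimizer; that step is omitted here.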


Read also

The successes of deep learning, variational inference, and many other fields have been aided by specialized implementations of reverse-mode automatic differentiation (AD) to compute gradients of mega-dimensional objectives. The AD techniques underlying these tools were designed to compute exact gradients to numerical precision, but modern machine learning models are almost always trained with stochastic gradient descent. Why spend computation and memory on exact (minibatch) gradients only to use them for stochastic optimization? We develop a general framework and approach for randomized automatic differentiation (RAD), which can allow unbiased gradient estimates to be computed with reduced memory in return for variance. We examine limitations of the general approach, and argue that we must leverage problem-specific structure to realize benefits. We develop RAD techniques for a variety of simple neural network architectures, and show that for a fixed memory budget, RAD converges in fewer iterations than using a small batch size for feedforward networks, and in a similar number for recurrent networks. We also show that RAD can be applied to scientific computing, and use it to develop a low-memory stochastic gradient method for optimizing the control parameters of a linear reaction-diffusion PDE representing a fission reactor.
In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program. AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations (addition, subtraction, multiplication, division, etc.), elementary functions (exp, log, sin, cos, etc.) and control flow statements. AD takes the source code of a function as input and produces the source code of the derived function. By applying the chain rule repeatedly to these operations, derivatives of arbitrary order can be computed automatically, accurately to working precision, and using at most a small constant factor more arithmetic operations than the original program. This paper presents AD techniques available in ROOT, supported by Cling, to produce derivatives of arbitrary C/C++ functions through implementing source code transformation and employing the chain rule of differential calculus in both forward mode and reverse mode. We explain its current integration for gradient computation in TFormula. We demonstrate the correctness and performance improvements in ROOT's fitting algorithms.
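The repeated application of the chain rule described above can be sketched in a few lines of Python. This is an illustration only: it uses operator overloading with a recorded computation graph rather than the source code transformation that the abstract attributes to ROOT and Cling, and the names `Var`, `var_sin`, `backward`, and the test function `f` are hypothetical. The sketch shows reverse mode, where a single backward sweep yields the partial derivatives with respect to every input.

```python
import math


class Var:
    """Node of a computation graph that stores local partial derivatives."""

    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents  # tuple of (parent_node, d(self)/d(parent))
        self.grad = 0.0

    @staticmethod
    def lift(x):
        return x if isinstance(x, Var) else Var(float(x))

    def __add__(self, other):
        o = Var.lift(other)
        return Var(self.val + o.val, ((self, 1.0), (o, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        o = Var.lift(other)
        return Var(self.val * o.val, ((self, o.val), (o, self.val)))

    __rmul__ = __mul__


def var_sin(x):
    x = Var.lift(x)
    return Var(math.sin(x.val), ((x, math.cos(x.val)),))


def backward(output):
    # Order the nodes so that each node is handled after everything that uses it.
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reverse sweep: accumulate d(output)/d(parent) via the chain rule.
    for node in reversed(order):
        for parent, partial in node.parents:
            parent.grad += partial * node.grad


def f(x, y):
    # Hypothetical test function with df/dx = y + cos(x) and df/dy = x.
    return x * y + var_sin(x)


x, y = Var(1.5), Var(-2.0)
out = f(x, y)
backward(out)
print(x.grad, y.grad)
```

Forward mode would instead need one pass per input, which is why reverse mode is usually preferred for functions with many inputs and a single scalar output.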
In this paper we introduce DiffSharp, an automatic differentiation (AD) library designed with machine learning in mind. AD is a family of techniques that evaluate derivatives at machine precision with only a small constant factor of overhead, by systematically applying the chain rule of calculus at the elementary operator level. DiffSharp aims to make an extensive array of AD techniques available, in convenient form, to the machine learning community. These include arbitrary nesting of forward/reverse AD operations, AD with linear algebra primitives, and a functional API that emphasizes the use of higher-order functions and composition. The library exposes this functionality through an API that provides gradients, Hessians, Jacobians, directional derivatives, and matrix-free Hessian- and Jacobian-vector products. Bearing the performance requirements of the latest machine learning techniques in mind, the underlying computations are run through a high-performance BLAS/LAPACK backend, using OpenBLAS by default. GPU support is currently being implemented.
In this article, we discuss two specific classes of models, Gaussian Mixture Copula models and Mixture of Factor Analyzers, and the advantages of doing inference with gradient descent using automatic differentiation. Gaussian mixture models are a popular class of clustering methods that offer a principled statistical approach to clustering. However, the underlying assumption that every mixing component is normally distributed can often be too rigid for real-life datasets. In order to relax the assumption about the normality of mixing components, a new class of parametric mixture models based on copula functions, Gaussian Mixture Copula Models, was introduced. Estimating the parameters of the proposed Gaussian Mixture Copula Model (GMCM) through maximum likelihood has been intractable due to the positive semi-definite constraints on the variance-covariance matrices. Previous attempts were limited to maximizing a proxy likelihood that can be maximized using the EM algorithm. These existing methods, even though easier to implement, do not guarantee convergence or a monotonic increase of the GMCM likelihood. In this paper, we use automatic differentiation tools to maximize the exact likelihood of the GMCM, while avoiding any constraint equations or Lagrange multipliers. We show how our method leads to a monotonic increase in likelihood and converges to a (local) optimum value of the likelihood. We also show how automatic differentiation can be used for inference with Mixture of Factor Analyzers and the advantages of doing so, and discuss how this method likewise has properties such as a monotonic increase in likelihood and convergence to a local optimum. Note that our work is also applicable to special cases of these two models, e.g. simple copula models, the Factor Analyzer model, etc.
We introduce a procedure to systematically search for a local unitary transformation that maps a wavefunction with a non-trivial sign structure into a positive-real form. The transformation is parametrized as a quantum circuit compiled into a set of one and two qubit gates. We design a cost function that maximizes the average sign of the output state and removes its complex phases. The optimization of the gates is performed through automatic differentiation algorithms, widely used in the machine learning community. We provide numerical evidence for significant improvements in the average sign for a two-leg triangular Heisenberg ladder with next-to-nearest neighbour and ring-exchange interactions. This model exhibits phases where the sign structure can be removed by simple local one-qubit unitaries, but also an exotic Bose-metal phase whose sign structure induces Bose surfaces with a fermionic character and a higher entanglement that requires deeper circuits.