Particle Smoothing for Hidden Diffusion Processes: Adaptive Path Integral Smoother

Published by Hans-Christian Ruiz Dipl-Phys

Publication date: 2016

Research field: Informatics Engineering, Mathematical Statistics

Paper language: English

Authors: H.-Ch. Ruiz, H. J. Kappen

Machine Learning, Computation

Particle smoothing methods are used for inference of stochastic processes based on noisy observations. Typically, estimating the marginal posterior distribution given all observations is cumbersome and computationally intensive. In this paper, we propose a simple algorithm based on path integral control theory to estimate the smoothing distribution of continuous-time diffusion processes with partial observations. In particular, we use an adaptive importance sampling method to improve both the effective sample size of the posterior over processes given the observations and the reliability of the estimated marginals. This is achieved by estimating a feedback controller to sample efficiently from the joint smoothing distribution. We compare the results with estimates obtained from the standard Forward Filter/Backward Simulator (FFBSi) for two diffusion processes of different complexity. We show that the proposed method gives more reliable estimates than the standard FFBSi when the smoothing distribution is poorly represented by the filter distribution.
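The effective sample size mentioned in the abstract is commonly computed from importance weights as ESS = (sum w)^2 / sum w^2. The following is a minimal sketch of that standard estimator, not the authors' implementation. The function name `effective_sample_size` and the example log-weights are hypothetical and chosen only for illustration.

```python
import math


def effective_sample_size(log_weights):
    """Return ESS = (sum w)^2 / sum w^2 for unnormalized log-weights.

    Hypothetical helper: the weights here are arbitrary illustrative
    values, not weights produced by the paper's path integral smoother.
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # the ESS is invariant to rescaling all weights by a constant.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Equal weights: every particle contributes, so the ESS equals the particle count.
uniform_ess = effective_sample_size([0.0, 0.0, 0.0, 0.0])

# One dominant weight: the ESS collapses towards 1 (weight degeneracy).
degenerate_ess = effective_sample_size([0.0, -1000.0, -1000.0])
```

An ESS close to the number of particles indicates that the proposal distribution matches the target well. An ESS close to 1 indicates that a single particle dominates the estimate, which is the situation an adaptive importance sampler tries to avoid.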

Dominik Thalmeier, Hilbert J. Kappen, Simone Totaro, 2020

In Path Integral control problems, a representation of an optimally controlled dynamical system can be formally computed and serve as a guidepost to learn a parametrized policy. The Path Integral Cross-Entropy (PICE) method tries to exploit this, but is hampered by poor sample efficiency. We propose a model-free algorithm called ASPIC (Adaptive Smoothing of Path Integral Control) that applies an inf-convolution to the cost function to speed up convergence of policy optimization. We identify PICE as the infinite smoothing limit of this technique and show that the sample efficiency problems that PICE suffers from disappear for finite levels of smoothing. For zero smoothing, this method becomes a greedy optimization of the cost, which is the standard approach in current reinforcement learning. We show analytically and empirically that intermediate levels of smoothing are optimal, which renders the new method superior to both PICE and direct cost optimization.

Systems and Control, Machine Learning

Adaptive and non-adaptive estimation for degenerate diffusion processes

Arnaud Gloter, Nakahiro Yoshida, 2020

We discuss parametric estimation of a degenerate diffusion system from time-discrete observations. The first component of the degenerate diffusion system has a parameter $\theta_1$ in a non-degenerate diffusion coefficient and a parameter $\theta_2$ in the drift term. The second component has a drift term parameterized by $\theta_3$ and no diffusion term. Asymptotic normality is proved in three different situations for an adaptive estimator for $\theta_3$ with some initial estimators for ($\theta_1$, $\theta_2$), an adaptive one-step estimator for ($\theta_1$, $\theta_2$, $\theta_3$) with some initial estimators for them, and a joint quasi-maximum likelihood estimator for ($\theta_1$, $\theta_2$, $\theta_3$) without any initial estimator. Our estimators incorporate information of the increments of both components. Thanks to this construction, the asymptotic variance of the estimators for $\theta_1$ is smaller than the standard one based only on the first component. The convergence of the estimators for $\theta_3$ is much faster than the other parameters. The resulting asymptotic variance is smaller than that of an estimator only using the increments of the second component.

Statistics Theory

Regularization via Adaptive Pairwise Label Smoothing

Hongyu Guo, 2020

Label Smoothing (LS) is an effective regularizer to improve the generalization of state-of-the-art deep models. For each training sample, the LS strategy smooths the one-hot encoded training signal by distributing its distribution mass over the non-ground-truth classes, aiming to discourage the networks from generating overconfident output distributions. This paper introduces a novel label smoothing technique called Pairwise Label Smoothing (PLS). The PLS takes a pair of samples as input. Smoothing with a pair of ground-truth labels enables the PLS to preserve the relative distance between the two truth labels while further softening that between the truth labels and the other targets, resulting in models producing much less confident predictions than the LS strategy. Also, unlike current LS methods, which typically require finding a global smoothing distribution mass through cross-validation search, PLS automatically learns the distribution mass for each input pair during training. We empirically show that PLS significantly outperforms LS and the baseline models, achieving up to a 30% relative reduction in classification error. We also visually show that when achieving such accuracy gains the PLS tends to produce very low winning softmax scores.
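The standard LS step described above, moving part of the one-hot mass onto the non-ground-truth classes, can be sketched as follows. This illustrates plain label smoothing only, not the pairwise PLS method. The function name `smooth_one_hot`, the fixed mass `eps`, and the choice to spread it evenly are assumptions made for illustration.

```python
def smooth_one_hot(true_class, num_classes, eps):
    """Return a smoothed target vector for one training sample.

    Assumed scheme (illustrative): the true class keeps 1 - eps and the
    mass eps is spread evenly over the other num_classes - 1 classes.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    if not 0 <= true_class < num_classes:
        raise ValueError("true_class out of range")
    off_value = eps / (num_classes - 1)
    target = [off_value] * num_classes
    target[true_class] = 1.0 - eps
    return target


# Example: 4 classes, ground truth is class 1, smoothing mass 0.3.
target = smooth_one_hot(true_class=1, num_classes=4, eps=0.3)
```

With `eps = 0` the function returns the original one-hot vector. Larger values push the target towards a flatter distribution, which is what penalizes overconfident predictions.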

Machine Learning

Path integral quantization of a spinning particle

Jerzy Kowalski-Glikman, Giacomo Rosati, 2019

Following the idea of Alekseev and Shatashvili we derive the path integral quantization of a modified relativistic particle action that results in the Feynman propagator of a free field with arbitrary spin. This propagator can be associated with the Duffin, Kemmer, and Petiau (DKP) form of a free field theory. We show explicitly that the obtained DKP propagator is equivalent to the standard one, for spins 0 and 1. We argue that this equivalence holds also for higher spins.

High Energy Physics - Theory, General Relativity and Quantum Cosmology

Adaptive Smoothing for Trajectory Reconstruction

Zhanglong Cao, David Bryant, Tim Molteno, 2018

Trajectory reconstruction is the process of inferring the path of a moving object between successive observations. In this paper, we propose a smoothing spline -- which we name the V-spline -- that incorporates position and velocity information and a penalty term that controls acceleration. We introduce a particular adaptive V-spline designed to control the impact of irregularly sampled observations and noisy velocity measurements. A cross-validation scheme for estimating the V-spline parameters is given and we detail the performance of the V-spline on four particularly challenging test datasets. Finally, an application of the V-spline to vehicle trajectory reconstruction in two dimensions is given, in which the penalty term is allowed to further depend on known operational characteristics of the vehicle.

Methodology