ترغب بنشر مسار تعليمي؟ اضغط هنا

148 - Wenkai Xu , Takeru Matsuda 2021
In many applications, we encounter data on Riemannian manifolds such as torus and rotation groups. Standard statistical procedures for multivariate data are not applicable to such data. In this study, we develop goodness-of-fit testing and interpreta ble model criticism methods for general distributions on Riemannian manifolds, including those with an intractable normalization constant. The proposed methods are based on extensions of kernel Stein discrepancy, which are derived from Stein operators on Riemannian manifolds. We discuss the connections between the proposed tests with existing ones and provide a theoretical analysis of their asymptotic Bahadur efficiency. Simulation results and real data applications show the validity of the proposed methods.
Wasserstein geometry and information geometry are two important structures to be introduced in a manifold of probability distributions. Wasserstein geometry is defined by using the transportation cost between two distributions, so it reflects the met ric of the base manifold on which the distributions are defined. Information geometry is defined to be invariant under reversible transformations of the base space. Both have their own merits for applications. In particular, statistical inference is based upon information geometry, where the Fisher metric plays a fundamental role, whereas Wasserstein geometry is useful in computer vision and AI applications. In this study, we analyze statistical inference based on the Wasserstein geometry in the case that the base space is one-dimensional. By using the location-scale model, we further derive the W-estimator that explicitly minimizes the transportation cost from the empirical distribution to a statistical model and study its asymptotic behaviors. We show that the W-estimator is consistent and explicitly give its asymptotic distribution by using the functional delta method. The W-estimator is Fisher efficient in the Gaussian case.
Matrix scaling is a classical problem with a wide range of applications. It is known that the Sinkhorn algorithm for matrix scaling is interpreted as alternating e-projections from the viewpoint of classical information geometry. Recently, a generali zation of matrix scaling to completely positive maps called operator scaling has been found to appear in various fields of mathematics and computer science, and the Sinkhorn algorithm has been extended to operator scaling. In this study, the operator Sinkhorn algorithm is studied from the viewpoint of quantum information geometry through the Choi representation of completely positive maps. The operator Sinkhorn algorithm is shown to coincide with alternating e-projections with respect to the symmetric logarithmic derivative metric, which is a Riemannian metric on the space of quantum states relevant to quantum estimation theory. Other types of alternating e-projections algorithms are also provided by using different information geometric structures on the positive definite cone.
This study computes the gradient of a function of numerical solutions of ordinary differential equations (ODEs) with respect to the initial condition. The adjoint method computes the gradient approximately by solving the corresponding adjoint system numerically. In this context, Sanz-Serna [SIAM Rev., 58 (2016), pp. 3--33] showed that when the initial value problem is solved by a Runge--Kutta (RK) method, the gradient can be exactly computed by applying an appropriate RK method to the adjoint system. Focusing on the case where the initial value problem is solved by a partitioned RK (PRK) method, this paper presents a numerical method, which can be seen as a generalization of PRK methods, for the adjoint system that gives the exact gradient.
121 - Wenkai Xu , Takeru Matsuda 2020
In many fields, data appears in the form of direction (unit vector) and usual statistical procedures are not applicable to such directional data. In this study, we propose non-parametric goodness-of-fit testing procedures for general directional dist ributions based on kernel Stein discrepancy. Our method is based on Steins operator on spheres, which is derived by using Stokes theorem. Notably, the proposed method is applicable to distributions with an intractable normalization constant, which commonly appear in directional statistics. Experimental results demonstrate that the proposed methods control type-I error well and have larger power than existing tests, including the test based on the maximum mean discrepancy.
We consider a scalar function depending on a numerical solution of an initial value problem, and its second-derivative (Hessian) matrix for the initial value. The need to extract the information of the Hessian or to solve a linear system having the H essian as a coefficient matrix arises in many research fields such as optimization, Bayesian estimation, and uncertainty quantification. From the perspective of memory efficiency, these tasks often employ a Krylov subspace method that does not need to hold the Hessian matrix explicitly and only requires computing the multiplication of the Hessian and a given vector. One of the ways to obtain an approximation of such Hessian-vector multiplication is to integrate the so-called second-order adjoint system numerically. However, the error in the approximation could be significant even if the numerical integration to the second-order adjoint system is sufficiently accurate. This paper presents a novel algorithm that computes the intended Hessian-vector multiplication exactly and efficiently. For this aim, we give a new concise derivation of the second-order adjoint system and show that the intended multiplication can be computed exactly by applying a particular numerical method to the second-order adjoint system. In the discussion, symplectic partitioned Runge--Kutta methods play an essential role.
We consider parameter estimation of ordinary differential equation (ODE) models from noisy observations. For this problem, one conventional approach is to fit numerical solutions (e.g., Euler, Runge--Kutta) of ODEs to data. However, such a method doe s not account for the discretization error in numerical solutions and has limited estimation accuracy. In this study, we develop an estimation method that quantifies the discretization error based on data. The key idea is to model the discretization error as random variables and estimate their variance simultaneously with the ODE parameter. The proposed method has the form of iteratively reweighted least squares, where the discretization error variance is updated with the isotonic regression algorithm and the ODE parameter is updated by solving a weighted least squares problem using the adjoint system. Experimental results demonstrate that the proposed method attains robust estimation with at least comparable accuracy to the conventional method by successfully quantifying the reliability of the numerical solutions.
Many statistical models are given in the form of non-normalized densities with an intractable normalization constant. Since maximum likelihood estimation is computationally intensive for these models, several estimation methods have been developed wh ich do not require explicit computation of the normalization constant, such as noise contrastive estimation (NCE) and score matching. However, model selection methods for general non-normalized models have not been proposed so far. In this study, we develop information criteria for non-normalized models estimated by NCE or score matching. They are approximately unbiased estimators of discrepancy measures for non-normalized models. Simulation results and applications to real data demonstrate that the proposed criteria enable selection of the appropriate non-normalized model in a data-driven manner.
We investigate predictive density estimation under the $L^2$ Wasserstein loss for location families and location-scale families. We show that plug-in densities form a complete class and that the Bayesian predictive density is given by the plug-in den sity with the posterior mean of the location and scale parameters. We provide Bayesian predictive densities that dominate the best equivariant one in normal models.
Several statistical models are given in the form of unnormalized densities, and calculation of the normalization constant is intractable. We propose estimation methods for such unnormalized models with missing data. The key concept is to combine impu tation techniques with estimators for unnormalized models including noise contrastive estimation and score matching. In addition, we derive asymptotic distributions of the proposed estimators and construct confidence intervals. Simulation results with truncated Gaussian graphical models and the application to real data of wind direction reveal that the proposed methods effectively enable statistical inference with unnormalized models from missing data.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا