
A Wasserstein index of dependence for random measures

Published by Hugo Lavenant
Publication date: 2021
Research language: English





Nonparametric latent structure models provide flexible inference on distinct, yet related, groups of observations. Each component of a vector of $d \ge 2$ random measures models the distribution of a group of exchangeable observations, while their dependence structure regulates the borrowing of information across different groups. Recent work has quantified the dependence between random measures in terms of Wasserstein distance from the maximally dependent scenario when $d = 2$. By solving an intriguing max-min problem we are now able to define a Wasserstein index of dependence $I_{\mathcal{W}}$ with the following properties: (i) it simultaneously quantifies the dependence of $d \ge 2$ random measures; (ii) it takes values in $[0,1]$; (iii) it attains the extreme values $\{0,1\}$ under independence and complete dependence, respectively; (iv) since it is defined in terms of the underlying Lévy measures, it is possible to evaluate it numerically in many Bayesian nonparametric models for partially exchangeable data.



Read also

Dependence measures based on reproducing kernel Hilbert spaces, also known as Hilbert-Schmidt Independence Criterion and denoted HSIC, are widely used to statistically decide whether or not two random vectors are dependent. Recently, non-parametric HSIC-based statistical tests of independence have been performed. However, these tests lead to the question of the choice of the kernels associated to the HSIC. In particular, there is as yet no method to objectively select specific kernels with theoretical guarantees in terms of first and second kind errors. One of the main contributions of this work is to develop a new HSIC-based aggregated procedure which avoids such a kernel choice, and to provide theoretical guarantees for this procedure. To achieve this, we first introduce non-asymptotic single tests based on Gaussian kernels with a given bandwidth, which are of prescribed level $\alpha \in (0,1)$. From a theoretical point of view, we upper-bound their uniform separation rate of testing over Sobolev and Nikolskii balls. Then, we aggregate several single tests, and obtain similar upper-bounds for the uniform separation rate of the aggregated procedure over the same regularity spaces. Another main contribution is that we provide a lower-bound for the non-asymptotic minimax separation rate of testing over Sobolev balls, and deduce that the aggregated procedure is adaptive in the minimax sense over such regularity spaces. Finally, from a practical point of view, we perform numerical studies in order to assess the efficiency of our aggregated procedure and compare it to existing independence tests in the literature.
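As a rough illustration of the statistic this abstract builds on, the sketch below computes the standard biased (V-statistic) empirical HSIC with Gaussian kernels of a fixed bandwidth. It is not the aggregated test procedure of the paper: the function names `gaussian_gram` and `hsic_biased`, the default bandwidths, and the $1/n^2$ normalisation are assumptions made here for illustration only.

```python
import numpy as np

def gaussian_gram(x, bandwidth):
    """Gram matrix of the Gaussian kernel exp(-||a - b||^2 / (2 * bandwidth^2))."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D array as n scalar observations
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def hsic_biased(x, y, bw_x=1.0, bw_y=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    K = gaussian_gram(x, bw_x)
    L = gaussian_gram(y, bw_y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(K @ H @ L @ H)) / n ** 2
```

A strongly dependent pair should give a visibly larger value than an independent pair of the same size. Turning the statistic into a test of level $\alpha$ still requires a threshold (for example from permutations), which this sketch does not include.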
Consider a standard white Wishart matrix with parameters $n$ and $p$. Motivated by applications in high-dimensional statistics and signal processing, we perform asymptotic analysis on the maxima and minima of the eigenvalues of all the $m \times m$ principal minors, under the asymptotic regime that $n,p,m$ go to infinity. Asymptotic results concerning extreme eigenvalues of principal minors of real Wigner matrices are also obtained. In addition, we discuss an application of the theoretical results to the construction of compressed sensing matrices, which provides insights to compressed sensing in signal processing and high dimensional linear regression in statistics.
We extend classic characterisations of posterior distributions under Dirichlet process and gamma random measures priors to a dynamic framework. We consider the problem of learning, from indirect observations, two families of time-dependent processes of interest in Bayesian nonparametrics: the first is a dependent Dirichlet process driven by a Fleming-Viot model, and the data are random samples from the process state at discrete times; the second is a collection of dependent gamma random measures driven by a Dawson-Watanabe model, and the data are collected according to a Poisson point process with intensity given by the process state at discrete times. Both driving processes are diffusions taking values in the space of discrete measures whose support varies with time, and are stationary and reversible with respect to Dirichlet and gamma priors respectively. A common methodology is developed to obtain in closed form the time-marginal posteriors given past and present data. These are shown to belong to classes of finite mixtures of Dirichlet processes and gamma random measures for the two models respectively, yielding conjugacy of these classes to the type of data we consider. We provide explicit results on the parameters of the mixture components and on the mixing weights, which are time-varying and drive the mixtures towards the respective priors in absence of further data. Explicit algorithms are provided to recursively compute the parameters of the mixtures. Our results are based on the projective properties of the signals and on certain duality properties of their projections.
Rahul Agarwal, Pierre Sacre, 2015
In data science, it is often required to estimate dependencies between different data sources. These dependencies are typically calculated using Pearson's correlation, distance correlation, and/or mutual information. However, none of these measures satisfies all of Granger's axioms for an ideal measure. One such ideal measure, proposed by Granger himself, calculates the Bhattacharyya distance between the joint probability density function (pdf) and the product of marginal pdfs. We call this measure the mutual dependence. However, to date this measure has not been directly computable from data. In this paper, we use our recently introduced maximum likelihood non-parametric estimator for band-limited pdfs to compute the mutual dependence directly from the data. We construct the estimator of mutual dependence and compare its performance to standard measures (Pearson's and distance correlation) for different known pdfs by computing convergence rates, computational complexity, and the ability to capture nonlinear dependencies. Our mutual dependence estimator requires fewer samples to converge to theoretical values, is faster to compute, and captures more complex dependencies than standard measures.
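To make the quantity described in this abstract concrete, here is a minimal sketch that evaluates the Bhattacharyya distance between a joint distribution and the product of its marginals when the joint is given as a known discrete table. This is only a discrete analogue: it does not implement the band-limited density estimator from the paper, and the function name `bhattacharyya_dependence` and the use of $-\ln$ of the Bhattacharyya coefficient as the distance are assumptions made here.

```python
import numpy as np

def bhattacharyya_dependence(joint):
    """Bhattacharyya distance between a discrete joint pmf and the product of its marginals.

    `joint` is a 2-D array of non-negative weights; it is normalised to sum to 1.
    """
    p = np.asarray(joint, dtype=float)
    p = p / p.sum()
    px = p.sum(axis=1, keepdims=True)  # marginal of the row variable
    py = p.sum(axis=0, keepdims=True)  # marginal of the column variable
    # Bhattacharyya coefficient: sum over cells of sqrt(joint * product of marginals).
    bc = np.sum(np.sqrt(p * (px * py)))
    return float(-np.log(bc))
```

For an independent table the joint equals the product of marginals, so the coefficient is 1 and the distance is 0. For the fully dependent table `[[0.5, 0], [0, 0.5]]` the coefficient is $2\sqrt{0.5 \cdot 0.25} = 1/\sqrt{2}$, giving a distance of $\tfrac{1}{2}\ln 2$.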
Yandi Shen, Fang Han, 2020
We establish exponential inequalities for a class of V-statistics under strong mixing conditions. Our theory is developed via a novel kernel expansion based on random Fourier features and the use of a probabilistic method. This type of expansion is new and useful for handling many notorious classes of kernels.
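For readers unfamiliar with random Fourier features, the sketch below shows the classical construction for the Gaussian kernel, in which the kernel is approximated by an inner product of random cosine features. It illustrates the general idea only and is not the specific expansion developed in the paper; the function name `rff_features` and its parameters are hypothetical.

```python
import numpy as np

def rff_features(x, n_features, bandwidth, rng):
    """Random Fourier features z(x) such that z(a) @ z(b) approximates
    the Gaussian kernel exp(-||a - b||^2 / (2 * bandwidth^2)).

    `x` must be a 2-D array of shape (n_points, dimension).
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[1]
    # Frequencies drawn from the spectral density of the Gaussian kernel.
    W = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(x @ W + b)
```

With `Z = rff_features(x, n_features, bandwidth, rng)`, the matrix `Z @ Z.T` is a Monte Carlo estimate of the exact Gram matrix, and its error shrinks roughly like one over the square root of `n_features`.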