
A Test for Independence Via Bayesian Nonparametric Estimation of Mutual Information

Published by Dr. Luai Al-Labadi
Publication date: 2020
Research field: Mathematical statistics
Language of the paper: English





Mutual information is a well-known tool for measuring the mutual dependence between variables. In this paper, a Bayesian nonparametric estimator of mutual information is established by means of the Dirichlet process and the $k$-nearest neighbor distance. As a direct outcome of the estimation, an easy-to-implement test of independence is introduced through the relative belief ratio. Several theoretical properties of the approach are presented. The procedure is investigated through various examples, in which its results are compared with those of its frequentist counterpart and show good performance.
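
For orientation, the standard definition of mutual information for continuous variables can be written out as below. The notation is an assumption made here for illustration ($f$ for the joint density, $f_X$ and $f_Y$ for the marginals, $H$ for differential entropy) and is not taken from the paper itself.

```latex
% Standard definition; the symbols f, f_X, f_Y, H are illustrative notation.
\[
  I(X;Y) \;=\; \iint f(x,y)\,\log\frac{f(x,y)}{f_X(x)\,f_Y(y)}\,dx\,dy
         \;=\; H(X) + H(Y) - H(X,Y),
\]
\[
  I(X;Y) \ge 0, \qquad I(X;Y) = 0 \iff X \text{ and } Y \text{ are independent.}
\]
```

The second line is what makes an estimate of mutual information usable as a test of independence: evidence that $I(X;Y)$ is zero is evidence of independence. The entropy decomposition in the first line is one common route by which nearest-neighbor distance estimators of entropy lead to estimators of mutual information.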


Read also

Mutual information is a widely used information-theoretic measure to quantify the amount of association between variables. It is used extensively in many applications such as image registration, diagnosis of failures in electrical machines, pattern recognition, data mining and tests of independence. The main goal of this paper is to provide an efficient estimator of the mutual information based on the approach of Al-Labadi et al. (2021). The estimator is explored through various examples and is compared to its frequentist counterpart due to Berrett et al. (2019). The results show the good performance of the procedure by having a smaller mean squared error.
We derive independence tests by thresholding dependence measures in a semiparametric context. Precisely, estimates of phi-mutual informations, associated to phi-divergences between a joint distribution and the product distribution of its margins, are derived through the dual representation of phi-divergences. The asymptotic properties of the proposed estimates are established, including consistency, asymptotic distributions and a large deviations principle. The obtained tests of independence are compared via their relative asymptotic Bahadur efficiency and numerical simulations. It follows that the proposed semiparametric Kullback-Leibler mutual information test is the optimal one. In addition, the proposed approach provides a new method for estimating the Kullback-Leibler mutual information in a semiparametric setting, as well as a model selection procedure in a large class of dependency models including semiparametric copulas.
Yunbo Ouyang, Feng Liang (2017)
A nonparametric Bayes approach is proposed for the problem of estimating a sparse sequence based on Gaussian random variables. We adopt the popular two-group prior with one component being a point mass at zero, and the other component being a mixture of Gaussian distributions. Although the Gaussian family has been shown to be suboptimal for this problem, we find that Gaussian mixtures, with a proper choice of the means and mixing weights, have the desired asymptotic behavior, e.g., the corresponding posterior concentrates on balls with the desired minimax rate. To achieve computational efficiency, we propose to obtain the posterior distribution using a deterministic variational algorithm. Empirical studies on several benchmark data sets demonstrate the superior performance of the proposed algorithm compared to other alternatives.
We consider settings in which the data of interest correspond to pairs of ordered times, e.g., the birth times of the first and second child, the times at which a new user creates an account and makes the first purchase on a website, and the entry and survival times of patients in a clinical trial. In these settings, the two times are not independent (the second occurs after the first), yet it is still of interest to determine whether there exists significant dependence beyond their ordering in time. We refer to this notion as quasi-(in)dependence. For instance, in a clinical trial, to avoid biased selection, we might wish to verify that recruitment times are quasi-independent of survival times, where dependencies might arise due to seasonal effects. In this paper, we propose a nonparametric statistical test of quasi-independence. Our test considers a potentially infinite space of alternatives, making it suitable for complex data where the nature of the possible quasi-dependence is not known in advance. Standard parametric approaches are recovered as special cases, such as the classical conditional Kendall's tau and log-rank tests. The tests apply in the right-censored setting: an essential feature in clinical trials, where patients can withdraw from the study. We provide an asymptotic analysis of our test statistic, and demonstrate in experiments that our test obtains better power than existing approaches, while being more computationally efficient.
In spatial statistics, it is often assumed that the spatial field of interest is stationary and its covariance has a simple parametric form, but these assumptions are not appropriate in many applications. Given replicate observations of a Gaussian spatial field, we propose nonstationary and nonparametric Bayesian inference on the spatial dependence. Instead of estimating the quadratic (in the number of spatial locations) entries of the covariance matrix, the idea is to infer a near-linear number of nonzero entries in a sparse Cholesky factor of the precision matrix. Our prior assumptions are motivated by recent results on the exponential decay of the entries of this Cholesky factor for Matérn-type covariances under a specific ordering scheme. Our methods are highly scalable and parallelizable. We conduct numerical comparisons and apply our methodology to climate-model output, enabling statistical emulation of an expensive physical model.