Do you want to publish a course? Click here

Measuring and testing dependence by correlation of distances

282   0   0.0 ( 0 )
 Publication date 2008
and research's language is English




Ask ChatGPT about the research

Distance correlation is a new measure of dependence between random vectors. Distance covariance and distance correlation are analogous to product-moment covariance and correlation, but unlike the classical definition of correlation, distance correlation is zero only if the random vectors are independent. The empirical distance dependence measures are based on certain Euclidean distances between sample elements rather than sample moments, yet have a compact representation analogous to the classical covariance and correlation. Asymptotic properties and applications in testing independence are discussed. Implementation of the test and Monte Carlo results are also presented.



rate research

Read More

The Gaussian model equips strong properties that facilitate studying and interpreting graphical models. Specifically it reduces conditional independence and the study of positive association to determining partial correlations and their signs. When Gaussianity does not hold partial correlation graphs are a useful relaxation of graphical models, but it is not clear what information they contain (besides the obvious lack of linear association). We study elliptical and transelliptical distributions as middle-ground between the Gaussian and other families that are more flexible but either do not embed strong properties or do not lead to simple interpretation. We characterize the meaning of zero partial correlations in the elliptical family and transelliptical copula models and show that it retains much of the dependence structure from the Gaussian case. Regarding positive dependence, we prove impossibility results to learn (trans)elliptical graphical models, including that an elliptical distribution that is multivariate totally positive of order two for all dimensions must be essentially Gaussian. We then show how to interpret positive partial correlations as a relaxation, and obtain important properties related to faithfulness and Simpsons paradox. We illustrate the transelliptical model potential to study tail dependence in S&P500 data, and of positivity to improve regularized inference.
We discuss a general approach to handling multiple hypotheses testing in the case when a particular hypothesis states that the vector of parameters identifying the distribution of observations belongs to a convex compact set associated with the hypothesis. With our approach, this problem reduces to testing the hypotheses pairwise. Our central result is a test for a pair of hypotheses of the outlined type which, under appropriate assumptions, is provably nearly optimal. The test is yielded by a solution to a convex programming problem, so that our construction admits computationally efficient implementation. We further demonstrate that our assumptions are satisfied in several important and interesting applications. Finally, we show how our approach can be applied to a rather general detection problem encompassing several classical statistical settings such as detection of abrupt signal changes, cusp detection and multi-sensor detection.
Rejecting the null hypothesis in two-sample testing is a fundamental tool for scientific discovery. Yet, aside from concluding that two samples do not come from the same probability distribution, it is often of interest to characterize how the two distributions differ. Given samples from two densities $f_1$ and $f_0$, we consider the task of localizing occurrences of the inequality $f_1 > f_0$. To avoid the challenges associated with high-dimensional space, we propose a general hypothesis testing framework where hypotheses are formulated adaptively to the data by conditioning on the combined sample from the two densities. We then investigate a special case of this framework where the notion of locality is captured by a random walk on a weighted graph constructed over this combined sample. We derive a tractable testing procedure for this case employing a type of scan statistic, and provide non-asymptotic lower bounds on the power and accuracy of our test to detect whether $f_1>f_0$ in a local sense. Furthermore, we characterize the tests consistency according to a certain problem-hardness parameter, and show that our test achieves the minimax detection rate for this parameter. We conduct numerical experiments to validate our method, and demonstrate our approach on two real-world applications: detecting and localizing arsenic well contamination across the United States, and analyzing two-sample single-cell RNA sequencing data from melanoma patients.
This work builds a unified framework for the study of quadratic form distance measures as they are used in assessing the goodness of fit of models. Many important procedures have this structure, but the theory for these methods is dispersed and incomplete. Central to the statistical analysis of these distances is the spectral decomposition of the kernel that generates the distance. We show how this determines the limiting distribution of natural goodness-of-fit tests. Additionally, we develop a new notion, the spectral degrees of freedom of the test, based on this decomposition. The degrees of freedom are easy to compute and estimate, and can be used as a guide in the construction of useful procedures in this class.
76 - Yuri Golubev 2019
In this paper, we develop Bayes and maximum a posteriori probability (MAP) approaches to monotonicity testing. In order to simplify this problem, we consider a simple white Gaussian noise model and with the help of the Haar transform we reduce it to the equivalent problem of testing positivity of the Haar coefficients. This approach permits, in particular, to understand links between monotonicity testing and sparse vectors detection, to construct new tests, and to prove their optimality without supplementary assumptions. The main idea in our construction of multi-level tests is based on some invariance properties of specific probability distributions. Along with Bayes and MAP tests, we construct also adaptive multi-level tests that are free from the prior information about the sizes of non-monotonicity segments of the function.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا