High-Dimensional Independence Testing and Maximum Marginal Correlation


الملخص بالإنكليزية

A number of universally consistent dependence measures have been recently proposed for testing independence, such as distance correlation, kernel correlation, multiscale graph correlation, etc. They provide a satisfactory solution for dependence testing in low-dimensions, but often exhibit decreasing power for high-dimensional data, a phenomenon that has been recognized but remains mostly unchartered. In this paper, we aim to better understand the high-dimensional testing scenarios and explore a procedure that is robust against increasing dimension. To that end, we propose the maximum marginal correlation method and characterize high-dimensional dependence structures via the notion of dependent dimensions. We prove that the maximum method can be valid and universally consistent for testing high-dimensional dependence under regularity conditions, and demonstrate when and how the maximum method may outperform other methods. The methodology can be implemented by most existing dependence measures, has a superior testing power in a variety of common high-dimensional settings, and is computationally efficient for big data analysis when using the distance correlation chi-square test.

تحميل البحث