ﻻ يوجد ملخص باللغة العربية
A number of universally consistent dependence measures have been recently proposed for testing independence, such as distance correlation, kernel correlation, multiscale graph correlation, etc. They provide a satisfactory solution for dependence testing in low-dimensions, but often exhibit decreasing power for high-dimensional data, a phenomenon that has been recognized but remains mostly unchartered. In this paper, we aim to better understand the high-dimensional testing scenarios and explore a procedure that is robust against increasing dimension. To that end, we propose the maximum marginal correlation method and characterize high-dimensional dependence structures via the notion of dependent dimensions. We prove that the maximum method can be valid and universally consistent for testing high-dimensional dependence under regularity conditions, and demonstrate when and how the maximum method may outperform other methods. The methodology can be implemented by most existing dependence measures, has a superior testing power in a variety of common high-dimensional settings, and is computationally efficient for big data analysis when using the distance correlation chi-square test.
Complex data structures such as time series are increasingly present in modern data science problems. A fundamental question is whether two such time-series are statistically dependent. Many current approaches make parametric assumptions on the rando
The $k$-sample testing problem tests whether or not $k$ groups of data points are sampled from the same distribution. Multivariate analysis of variance (MANOVA) is currently the gold standard for $k$-sample testing but makes strong, often inappropria
We consider the hypothesis testing problem of detecting conditional dependence, with a focus on high-dimensional feature spaces. Our contribution is a new test statistic based on samples from a generative adversarial network designed to approximate d
In this article, we consider the problem of high-dimensional conditional independence testing, which is a key building block in statistics and machine learning. We propose a double generative adversarial networks (GANs)-based inference procedure. We
In this work, we study the transfer learning problem under high-dimensional generalized linear models (GLMs), which aim to improve the fit on target data by borrowing information from useful source data. Given which sources to transfer, we propose an