An important problem in large-scale inference is the identification of variables that have large correlations or partial correlations. Recent work has yielded breakthroughs in the ultra-high dimensional setting when the sample size $n$ is fixed and the dimension $p \rightarrow \infty$ ([Hero, Rajaratnam 2011, 2012]). Despite these advances, the correlation screening framework suffers from some serious practical, methodological and theoretical deficiencies. For instance, theoretical safeguards for partial correlation screening require that the population covariance matrix be block diagonal. This block sparsity assumption is, however, highly restrictive in numerous practical applications. As a second example, results for the correlation and partial correlation screening framework require the estimation of dependence measures or functionals, which can be computationally prohibitive. In this paper, we propose a unifying approach to correlation and partial correlation mining which specifically goes beyond the block diagonal correlation structure, thus yielding a methodology that is suitable for modern applications. By making connections to random geometric graphs, the number of highly correlated or partially correlated variables is shown to have novel compound Poisson finite-sample characterizations, which hold both for the finite $p$ case and when $p \rightarrow \infty$. The unifying framework also demonstrates an important duality between correlation and partial correlation screening with important theoretical and practical consequences.
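To make the quantity studied in the abstract concrete, the following is a minimal sketch of plain marginal correlation screening: it counts the variables whose largest absolute sample correlation with any other variable reaches a threshold. It is not the paper's method and does not cover the partial correlation case or the compound Poisson characterization. The function name `count_screened_variables`, the threshold `rho`, and the simulated data are assumptions introduced for illustration only.

```python
import numpy as np


def count_screened_variables(X, rho):
    """Count variables whose maximum absolute sample correlation
    with any other variable is at least rho.

    X is an (n, p) data matrix: n samples in rows, p variables in columns.
    The name and signature are hypothetical, not taken from the paper.
    """
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlation matrix
    np.fill_diagonal(R, 0.0)           # ignore each variable's self-correlation
    max_abs = np.abs(R).max(axis=1)    # strongest correlation per variable
    return int((max_abs >= rho).sum())


# Simulated example (assumed setup): small n, larger p, one planted pair.
rng = np.random.default_rng(0)
n, p = 20, 200
X = rng.standard_normal((n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.standard_normal(n)  # variables 0 and 1 nearly collinear

count = count_screened_variables(X, rho=0.95)
print(count)
```

Because the full $p \times p$ correlation matrix is formed explicitly, this sketch is only practical for moderate $p$; it is meant to illustrate the count being characterized, not an efficient screening procedure.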
Two Bayesian models with different sampling densities are said to be marginally equivalent if the joint distribution of observables and the parameter of interest is the same for both models. We discuss marginal equivalence in the general framework of
In this paper, we consider regression models with a Hilbert-space-valued predictor and a scalar response, where the response depends on the predictor only through a finite number of projections. The linear subspace spanned by these projections is cal
Hotelling's T-squared test is a classical tool to test if the mean of a multivariate normal distribution is a specified one or the means of two multivariate normal distributions are equal. When the population dimension is higher than the sample size, t
A new goodness-of-fit test for normality in high dimension (and in a Reproducing Kernel Hilbert Space) is proposed. It shares common ideas with the Maximum Mean Discrepancy (MMD), which it outperforms both in terms of computation time and applicability to a wide
We consider an $\ell_1$-penalization procedure in the non-parametric Gaussian regression model. In many concrete examples, the dimension $d$ of the input variable $X$ is very large (sometimes depending on the number of observations). Estimation of a $bet