ترغب بنشر مسار تعليمي؟ اضغط هنا

Eigenvalue distribution of a high-dimensional distance covariance matrix with application

344   0   0.0 ( 0 )
 نشر من قبل Jianfeng Yao
 تاريخ النشر 2021
  مجال البحث الاحصاء الرياضي
والبحث باللغة English




اسأل ChatGPT حول البحث

We introduce a new random matrix model called distance covariance matrix in this paper, whose normalized trace is equivalent to the distance covariance. We first derive a deterministic limit for the eigenvalue distribution of the distance covariance matrix when the dimensions of the vectors and the sample size tend to infinity simultaneously. This limit is valid when the vectors are independent or weakly dependent through a finite-rank perturbation. It is also universal and independent of the details of the distributions of the vectors. Furthermore, the top eigenvalues of this distance covariance matrix are shown to obey an exact phase transition when the dependence of the vectors is of finite rank. This finding enables the construction of a new detector for such weak dependence where classical methods based on large sample covariance matrices or sample canonical correlations may fail in the considered high-dimensional framework.



قيم البحث

اقرأ أيضاً

Let $mathbf{X}_n=(x_{ij})$ be a $k times n$ data matrix with complex-valued, independent and standardized entries satisfying a Lindeberg-type moment condition. We consider simultaneously $R$ sample covariance matrices $mathbf{B}_{nr}=frac1n mathbf{Q} _r mathbf{X}_n mathbf{X}_n^*mathbf{Q}_r^top,~1le rle R$, where the $mathbf{Q}_{r}$s are nonrandom real matrices with common dimensions $ptimes k~(kgeq p)$. Assuming that both the dimension $p$ and the sample size $n$ grow to infinity, the limiting distributions of the eigenvalues of the matrices ${mathbf{B}_{nr}}$ are identified, and as the main result of the paper, we establish a joint central limit theorem for linear spectral statistics of the $R$ matrices ${mathbf{B}_{nr}}$. Next, this new CLT is applied to the problem of testing a high dimensional white noise in time series modelling. In experiments the derived test has a controlled size and is significantly faster than the classical permutation test, though it does have lower power. This application highlights the necessity of such joint CLT in the presence of several dependent sample covariance matrices. In contrast, all the existing works on CLT for linear spectral statistics of large sample covariance matrices deal with a single sample covariance matrix ($R=1$).
The consistency and asymptotic normality of the spatial sign covariance matrix with unknown location are shown. Simulations illustrate the different asymptotic behavior when using the mean and the spatial median as location estimator.
Covariance matrix testing for high dimensional data is a fundamental problem. A large class of covariance test statistics based on certain averaged spectral statistics of the sample covariance matrix are known to obey central limit theorems under the null. However, precise understanding for the power behavior of the corresponding tests under general alternatives remains largely unknown. This paper develops a general method for analyzing the power behavior of covariance test statistics via accurate non-asymptotic power expansions. We specialize our general method to two prototypical settings of testing identity and sphericity, and derive sharp power expansion for a number of widely used tests, including the likelihood ratio tests, Ledoit-Nagao-Wolfs test, Cai-Mas test and Johns test. The power expansion for each of those tests holds uniformly over all possible alternatives under mild growth conditions on the dimension-to-sample ratio. Interestingly, although some of those tests are previously known to share the same limiting power behavior under spiked covariance alternatives with a fixed number of spikes, our new power characterizations indicate that such equivalence fails when many spikes exist. The proofs of our results combine techniques from Poincare-type inequalities, random matrices and zonal polynomials.
The asymptotic variance of the maximum likelihood estimate is proved to decrease when the maximization is restricted to a subspace that contains the true parameter value. Maximum likelihood estimation allows a systematic fitting of covariance models to the sample, which is important in data assimilation. The hierarchical maximum likelihood approach is applied to the spectral diagonal covariance model with different parameterizations of eigenvalue decay, and to the sparse inverse covariance model with specified parameter values on different sets of nonzero entries. It is shown computationally that using smaller sets of parameters can decrease the sampling noise in high dimension substantially.
In this paper we study covariance estimation with missing data. We consider missing data mechanisms that can be independent of the data, or have a time varying dependency. Additionally, observed variables may have arbitrary (non uniform) and dependen t observation probabilities. For each mechanism, we construct an unbiased estimator and obtain bounds for the expected value of their estimation error in operator norm. Our bounds are equivalent, up to constant and logarithmic factors, to state of the art bounds for complete and uniform missing observations. Furthermore, for the more general non uniform and dependent cases, the proposed bounds are new or improve upon previous results. Our error estimates depend on quantities we call scaled effective rank, which generalize the effective rank to account for missing observations. All the estimators studied in this work have the same asymptotic convergence rate (up to logarithmic factors).
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا