The asymptotic normality for a large family of eigenvalue statistics of a general sample covariance matrix is derived under the ultra-high dimensional setting, that is, when the dimension-to-sample-size ratio $p/n \to \infty$. Based on this CLT result, we first adapt the covariance matrix test problem to the new ultra-high dimensional context. Then as a second application, we develop a new test for the separable covariance structure of a matrix-valued white noise. Simulation experiments are conducted for the investigation of finite-sample properties of the general asymptotic normality of eigenvalue statistics, as well as the second test for separable covariance structure of matrix-valued white noise.
We establish a quantitative version of the Tracy--Widom law for the largest eigenvalue of high-dimensional sample covariance matrices. To be precise, we show that the fluctuations of the largest eigenvalue of a sample covariance matrix $X^*X$ converge to its Tracy--Widom limit at a rate nearly $N^{-1/3}$, where $X$ is an $M \times N$ random matrix whose entries are independent real or complex random variables, assuming that both $M$ and $N$ tend to infinity at a constant rate. This result improves the previous estimate $N^{-2/9}$ obtained by Wang [73]. Our proof relies on a Green function comparison method [27] using iterative cumulant expansions, the local laws for the Green function and asymptotic properties of the correlation kernel of the white Wishart ensemble.
Portfolio managers faced with limited sample sizes must use factor models to estimate the covariance matrix of a high-dimensional returns vector. For the simplest one-factor market model, success rests on the quality of the estimated leading eigenvector beta. When only the returns themselves are observed, the practitioner has available the PCA estimate, equal to the leading eigenvector of the sample covariance matrix. This estimator performs poorly in various ways. To address this problem in the high-dimension, limited-sample-size asymptotic regime and in the context of estimating the minimum variance portfolio, Goldberg, Papanicolaou, and Shkolnik developed a shrinkage method (the GPS estimator) that improves the PCA estimator of beta by shrinking it toward a constant target unit vector. In this paper we continue their work to develop a more general framework of shrinkage targets that allows the practitioner to make use of further information to improve the estimator. Examples include sector separation of stock betas and recent information from prior estimates. We prove some precise statements and illustrate the resulting improvements over the GPS estimator with some numerical experiments.
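As a rough illustration of what shrinking a unit vector toward a constant target unit vector means geometrically, the following minimal sketch moves a vector `h` toward the target `q` and renormalizes. The function name `shrink_toward_target`, the example vector `h`, and the weight `rho` are all hypothetical; in particular, `rho` is simply chosen by hand here and is not the data-driven shrinkage amount of the GPS estimator.

```python
import math


def shrink_toward_target(h, target, rho):
    """Move the unit vector h toward `target` by weight rho, then renormalize.

    Illustrative only: rho is a hand-picked weight (assumption), not the
    shrinkage parameter derived for the GPS estimator.
    """
    combined = [hi + rho * ti for hi, ti in zip(h, target)]
    norm = math.sqrt(sum(c * c for c in combined))
    return [c / norm for c in combined]


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


p = 4
q = [1.0 / math.sqrt(p)] * p   # constant target unit vector
h = [0.8, 0.6, 0.0, 0.0]       # stand-in for a PCA leading eigenvector (unit norm)

shrunk = shrink_toward_target(h, q, rho=0.5)
```

In this toy example the shrunk vector still has unit length, and its inner product with `q` is larger than that of `h`, which is the sense in which it has been pulled toward the target.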
Recently, He and Owen (2016) proposed the use of Hilbert's space-filling curve (HSFC) in numerical integration as a way of reducing the dimension from $d>1$ to $d=1$. This paper studies the asymptotic normality of the HSFC-based estimate when using a scrambled van der Corput sequence as input. We show that the estimate has an asymptotic normal distribution for functions in $C^1([0,1]^d)$, excluding the trivial case of constant functions. The asymptotic normality also holds for discontinuous functions under mild conditions. It was previously known only that scrambled $(0,m,d)$-net quadratures enjoy the asymptotic normality for smooth enough functions, whose mixed partial gradients satisfy a Hölder condition. As a by-product, we find lower bounds for the variance of the HSFC-based estimate. In particular, for nontrivial functions in $C^1([0,1]^d)$, the lower bound is of order $n^{-1-2/d}$, which matches the rate of the upper bound established in He and Owen (2016).
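For readers unfamiliar with the input sequence, the following minimal sketch generates the first few points of the plain (unscrambled) base-2 van der Corput sequence via the radical inverse. The function name `van_der_corput` is chosen here for illustration; the scrambling step and the Hilbert curve mapping from $[0,1]$ to $[0,1]^d$ are both omitted.

```python
def van_der_corput(i, base=2):
    """Radical inverse of the non-negative integer i in the given base.

    The base-b digits of i are mirrored about the radix point, so
    i = 6 = 110 (base 2) maps to 0.011 (base 2) = 0.375.
    """
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x


# First eight points of the unscrambled base-2 sequence.
points = [van_der_corput(i) for i in range(8)]
print(points)
# prints [0.0, 0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
```

Each block of $2^m$ consecutive points starting at index 0 places exactly one point in each interval of length $2^{-m}$, which is the stratification property that such one-dimensional inputs bring to the quadrature.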
We propose a novel approach to the analysis of covariance operators making use of concentration inequalities. First, non-asymptotic confidence sets are constructed for such operators. Then, subsequent applications including a $k$-sample test for equality of covariance, a functional data classifier, and an expectation-maximization style clustering algorithm are derived and tested on both simulated and phoneme data.
Consider an $N \times n$ random matrix $Z_n=(Z^n_{j_1 j_2})$ where the individual entries are a realization of a properly rescaled stationary Gaussian random field. The purpose of this article is to study the limiting empirical distribution of the eigenvalues of Gram random matrices such as $Z_n Z_n^*$ and $(Z_n + A_n)(Z_n + A_n)^*$, where $A_n$ is a deterministic matrix with appropriate assumptions, in the case where $n \to \infty$ and $\frac{N}{n} \to c \in (0,\infty)$. The proof relies on related results for matrices with independent but not identically distributed entries and substantially differs from related works in the literature (Boutet de Monvel et al., Girko, etc.).