
Normal approximation and concentration of spectral projectors of sample covariance

Published by: Karim Lounici
Publication date: 2015
Research field: Mathematical statistics
Language: English





Let $X,X_1,\dots, X_n$ be i.i.d. Gaussian random variables in a separable Hilbert space ${\mathbb H}$ with zero mean and covariance operator $\Sigma={\mathbb E}(X\otimes X),$ and let $\hat \Sigma:=n^{-1}\sum_{j=1}^n (X_j\otimes X_j)$ be the sample (empirical) covariance operator based on $(X_1,\dots, X_n).$ Denote by $P_r$ the spectral projector of $\Sigma$ corresponding to its $r$-th eigenvalue $\mu_r$ and by $\hat P_r$ the empirical counterpart of $P_r.$ The main goal of the paper is to obtain tight bounds on $$ \sup_{x\in {\mathbb R}} \left|{\mathbb P}\left\{\frac{\|\hat P_r-P_r\|_2^2-{\mathbb E}\|\hat P_r-P_r\|_2^2}{{\rm Var}^{1/2}(\|\hat P_r-P_r\|_2^2)}\leq x\right\}-\Phi(x)\right|, $$ where $\|\cdot\|_2$ denotes the Hilbert--Schmidt norm and $\Phi$ is the standard normal distribution function. Such accuracy of normal approximation of the distribution of squared Hilbert--Schmidt error is characterized in terms of so called effective rank of $\Sigma$ defined as ${\bf r}(\Sigma)=\frac{{\rm tr}(\Sigma)}{\|\Sigma\|_{\infty}},$ where ${\rm tr}(\Sigma)$ is the trace of $\Sigma$ and $\|\Sigma\|_{\infty}$ is its operator norm, as well as another parameter characterizing the size of ${\rm Var}(\|\hat P_r-P_r\|_2^2).$ Other results include non-asymptotic bounds and asymptotic representations for the mean squared Hilbert--Schmidt norm error ${\mathbb E}\|\hat P_r-P_r\|_2^2$ and the variance ${\rm Var}(\|\hat P_r-P_r\|_2^2),$ and concentration inequalities for $\|\hat P_r-P_r\|_2^2$ around its expectation.
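The quantities named in the abstract can be computed numerically in a finite-dimensional setting. The following is a minimal sketch, not taken from the paper: it assumes ${\mathbb H}={\mathbb R}^5$, a hypothetical diagonal covariance with a simple top eigenvalue, and illustrative function names (`effective_rank`, `top_projector`, `hs_error_sq`). It only shows how the effective rank and the standardized squared Hilbert--Schmidt error are formed; it does not reproduce the paper's bounds.

```python
import numpy as np

def effective_rank(sigma):
    """Effective rank tr(Sigma) / ||Sigma||_inf of a symmetric PSD matrix."""
    eigvals = np.linalg.eigvalsh(sigma)
    return eigvals.sum() / eigvals.max()

def top_projector(sigma):
    """Spectral projector onto the eigenspace of the largest eigenvalue.

    Assumption: the largest eigenvalue is simple, so the projector is rank one.
    """
    _, vecs = np.linalg.eigh(sigma)  # eigenvalues are returned in ascending order
    v = vecs[:, -1]
    return np.outer(v, v)

def hs_error_sq(sigma, n, rng):
    """Squared Hilbert-Schmidt error of the empirical top projector for one sample."""
    d = sigma.shape[0]
    x = rng.multivariate_normal(np.zeros(d), sigma, size=n)  # n i.i.d. N(0, Sigma) rows
    sigma_hat = x.T @ x / n  # sample covariance (the mean is known to be zero)
    diff = top_projector(sigma_hat) - top_projector(sigma)
    return np.sum(diff ** 2)  # squared Frobenius (Hilbert-Schmidt) norm

rng = np.random.default_rng(0)
sigma = np.diag([4.0, 1.0, 1.0, 1.0, 1.0])  # hypothetical covariance, chosen for illustration
r_eff = effective_rank(sigma)

# Monte Carlo replicates of the squared error, standardized by the empirical
# mean and standard deviation (stand-ins for the exact expectation and variance).
errors = np.array([hs_error_sq(sigma, 200, rng) for _ in range(500)])
standardized = (errors - errors.mean()) / errors.std()
```

The empirical distribution of `standardized` can then be compared with $\Phi$, for example via its empirical distribution function; how close the two are in a given regime is exactly what the paper's bounds quantify.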




Read also

We consider general high-dimensional spiked sample covariance models and show that their leading sample spiked eigenvalues and their linear spectral statistics are asymptotically independent when the sample size and dimension are proportional to each other. As a byproduct, we also establish the central limit theorem of the leading sample spiked eigenvalues by removing the block diagonal assumption on the population covariance matrix, which is commonly needed in the literature. Moreover, we propose consistent estimators of the $L_4$ norm of the spiked population eigenvectors. Based on these results, we develop a new statistic to test the equality of two spiked population covariance matrices. Numerical studies show that the new test procedure is more powerful than some existing methods.
A sum of observations derived by a simple random sampling design from a population of independent random variables is studied. A procedure finding a general term of Edgeworth asymptotic expansion is presented. The Lindeberg condition of asymptotic normality, Berry-Esseen bound, Edgeworth asymptotic expansions under weakened conditions and Cramer type large deviation results are derived.
We study the asymptotic distributions of the spiked eigenvalues and the largest nonspiked eigenvalue of the sample covariance matrix under a general covariance matrix model with divergent spiked eigenvalues, while the other eigenvalues are bounded but otherwise arbitrary. The limiting normal distribution for the spiked sample eigenvalues is established. It has distinct features that the asymptotic mean relies on not only the population spikes but also the nonspikes and that the asymptotic variance in general depends on the population eigenvectors. In addition, the limiting Tracy-Widom law for the largest nonspiked sample eigenvalue is obtained. Estimation of the number of spikes and the convergence of the leading eigenvectors are also considered. The results hold even when the number of the spikes diverges. As a key technical tool, we develop a Central Limit Theorem for a type of random quadratic forms where the random vectors and random matrices involved are dependent. This result can be of independent interest.
Let $\mathbf{X}_n=(x_{ij})$ be a $k \times n$ data matrix with complex-valued, independent and standardized entries satisfying a Lindeberg-type moment condition. We consider simultaneously $R$ sample covariance matrices $\mathbf{B}_{nr}=\frac1n \mathbf{Q}_r \mathbf{X}_n \mathbf{X}_n^*\mathbf{Q}_r^\top,~1\le r\le R$, where the $\mathbf{Q}_{r}$s are nonrandom real matrices with common dimensions $p\times k~(k\geq p)$. Assuming that both the dimension $p$ and the sample size $n$ grow to infinity, the limiting distributions of the eigenvalues of the matrices $\{\mathbf{B}_{nr}\}$ are identified, and as the main result of the paper, we establish a joint central limit theorem for linear spectral statistics of the $R$ matrices $\{\mathbf{B}_{nr}\}$. Next, this new CLT is applied to the problem of testing a high dimensional white noise in time series modelling. In experiments the derived test has a controlled size and is significantly faster than the classical permutation test, though it does have lower power. This application highlights the necessity of such joint CLT in the presence of several dependent sample covariance matrices. In contrast, all the existing works on CLT for linear spectral statistics of large sample covariance matrices deal with a single sample covariance matrix ($R=1$).
We consider the problem of estimating a low rank covariance function $K(t,u)$ of a Gaussian process $S(t), t\in [0,1]$ based on $n$ i.i.d. copies of $S$ observed in a white noise. We suggest a new estimation procedure adapting simultaneously to the low rank structure and the smoothness of the covariance function. The new procedure is based on nuclear norm penalization and exhibits superior performances as compared to the sample covariance function by a polynomial factor in the sample size $n$. Other results include a minimax lower bound for estimation of low-rank covariance functions showing that our procedure is optimal as well as a scheme to estimate the unknown noise variance of the Gaussian process.