The asymptotic normality for a large family of eigenvalue statistics of a general sample covariance matrix is derived under the ultra-high dimensional setting, that is, when the dimension to sample size ratio $p/n \to \infty$. Based on this CLT result, we first adapt the covariance matrix test problem to the new ultra-high dimensional context. Then as a second application, we develop a new test for the separable covariance structure of a matrix-valued white noise. Simulation experiments are conducted for the investigation of finite-sample properties of the general asymptotic normality of eigenvalue statistics, as well as the second test for separable covariance structure of matrix-valued white noise.
In this paper, we analyse singular values of a large $p \times n$ data matrix $\mathbf{X}_n = (\mathbf{x}_{n1},\ldots,\mathbf{x}_{nn})$ where the columns $\mathbf{x}_{nj}$ are independent $p$-dimensional vectors, possibly with different distributions. Such data matrices are common in high-dimensional statistics. Under a key assumption that the covariance matrices $\mathbf{\Sigma}_{nj} = \text{Cov}(\mathbf{x}_{nj})$ can be asymptotically simultaneously diagonalized, and appropriate convergence of their spectra, we establish a limiting distribution for the singular values of $\mathbf{X}_n$ when both the dimension $p$ and the sample size $n$ grow to infinity at comparable rates. The matrix model includes and goes beyond many existing works on different types of sample covariance matrices, including the weighted sample covariance matrix, the Gram matrix model and the sample covariance matrix of linear time series models. Furthermore, we develop two applications of our general approach. First, we obtain the existence and uniqueness of a new limiting spectral distribution of realized covariance matrices for a multi-dimensional diffusion process with anisotropic time-varying co-volatility processes. Secondly, we derive the limiting spectral distribution for singular values of the data matrix for a recent matrix-valued auto-regressive model. Finally, for a generalized finite mixture model, the limiting spectral distribution for singular values of the data matrix is obtained.
We reexamine the classical linear regression model when the model is subject to two types of uncertainty: (i) some of the covariates are either missing or completely inaccessible, and (ii) the variance of the measurement error is undetermined and changes according to a mechanism unknown to the statistician. Following the recent theory of sublinear expectation, we propose to characterize such mean and variance uncertainty in the response variable by two specific nonlinear random variables, which encompass an infinite family of probability distributions for the response variable in the sense of (linear) classical probability theory. The approach enables a family of estimators under various loss functions for the regression parameter and the parameters related to model uncertainty. The consistency of the estimators is established under mild conditions on the data generation process. Three applications are introduced to assess the quality of the approach, including a forecasting model for the S&P Index.
We introduce a new random matrix model called distance covariance matrix in this paper, whose normalized trace is equivalent to the distance covariance. We first derive a deterministic limit for the eigenvalue distribution of the distance covariance matrix when the dimensions of the vectors and the sample size tend to infinity simultaneously. This limit is valid when the vectors are independent or weakly dependent through a finite-rank perturbation. It is also universal and independent of the details of the distributions of the vectors. Furthermore, the top eigenvalues of this distance covariance matrix are shown to obey an exact phase transition when the dependence of the vectors is of finite rank. This finding enables the construction of a new detector for such weak dependence where classical methods based on large sample covariance matrices or sample canonical correlations may fail in the considered high-dimensional framework.
Inference of population structure from genetic data plays an important role in population and medical genetics studies. The traditional EIGENSTRAT method has been widely used for computing and selecting top principal components that capture population structure information (Price et al., 2006). With the advancement and decreasing cost of sequencing technology, whole-genome sequencing data provide much richer information about the underlying population structures. However, the EIGENSTRAT method was originally developed for analyzing array-based genotype data and thus may not perform well on sequencing data for two reasons. First, the number of genetic variants $p$ is much larger than the sample size $n$ in sequencing data such that the sample-to-marker ratio $n/p$ is nearly zero, violating the assumption of the Tracy-Widom test used in the EIGENSTRAT method. Second, the EIGENSTRAT method might not be able to handle the linkage disequilibrium (LD) well in sequencing data. To resolve those two critical issues, we propose a new statistical method called ERStruct to estimate the number of latent sub-populations based on sequencing data. We propose to use the ratio of successive eigenvalues as a more robust testing statistic, and then we approximate the null distribution of our proposed test statistic using modern random matrix theory. Simulation studies found that our proposed ERStruct method has outperformed the traditional Tracy-Widom test on sequencing data. We further use two public data sets from the HapMap 3 and the 1000 Genomes Projects to demonstrate the performance of our ERStruct method. We also implement our ERStruct in a MATLAB toolbox which is now publicly available on GitHub through https://github.com/bglvly/ERStruct.
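The idea of working with ratios of successive eigenvalues can be illustrated with a minimal sketch. It is not the ERStruct procedure itself: the abstract states that the null distribution of the ratio statistic is approximated with random matrix theory, whereas the fixed `threshold` below is a hypothetical stand-in for such a critical value, and the function names are assumptions made for this illustration.

```python
def successive_ratios(eigenvalues):
    """Return r_k = lambda_{k+1} / lambda_k for eigenvalues sorted in decreasing order."""
    lam = sorted(eigenvalues, reverse=True)
    if any(v <= 0 for v in lam):
        raise ValueError("eigenvalues must be strictly positive")
    return [lam[k + 1] / lam[k] for k in range(len(lam) - 1)]


def last_gap_index(eigenvalues, threshold):
    """Illustrative heuristic only (hypothetical rule, not the ERStruct test).

    Returns the largest k such that lambda_{k+1} / lambda_k < threshold,
    counting eigenvalues from 1, or 0 when no ratio falls below the threshold.
    """
    ratios = successive_ratios(eigenvalues)
    below = [k + 1 for k, r in enumerate(ratios) if r < threshold]
    return below[-1] if below else 0


# Two dominant eigenvalues followed by a slowly decaying bulk.
spectrum = [50.0, 20.0, 2.0, 1.9, 1.8, 1.7]
print(successive_ratios(spectrum))
print(last_gap_index(spectrum, threshold=0.5))
```

A ratio is scale-free, so multiplying all eigenvalues by a constant leaves the statistic unchanged; that is one plausible reason a ratio can be more robust than a raw eigenvalue.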
This paper reexamines the seminal Lagrange multiplier test for cross-section independence in a large panel model where both the number of cross-sectional units n and the number of time series observations T can be large. The first contribution of the paper is an enlargement of the test with two extensions: first, the new asymptotic normality is derived in a simultaneous limiting scheme where the two dimensions (n, T) tend to infinity with comparable magnitudes; second, the result is valid for general error distributions (not necessarily normal). The second contribution of the paper is a new test statistic based on the sum of the fourth powers of cross-section correlations from OLS residuals, instead of their squares used in the Lagrange multiplier statistic. This new test is generally more powerful, and the improvement is particularly visible against alternatives with weak or sparse cross-section dependence. Both a simulation study and a real data analysis are presented to demonstrate the advantages of the enlarged Lagrange multiplier test and the power enhanced test in comparison with the existing procedures.
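The two raw quantities contrasted in this abstract can be sketched as follows. This is a minimal illustration, assuming the OLS residuals are already available as n series of length T. The classical Lagrange multiplier statistic is T times the sum of squared pairwise residual correlations; the fourth-power sum is shown without any centering or scaling, since those constants are not given here. All function names are hypothetical.

```python
import math


def pairwise_correlations(residuals):
    """Sample correlations rho_ij (i < j) between n residual series of equal length T.

    Series are mean-centered first; for OLS residuals from a model with an
    intercept the means are already zero, so centering changes nothing.
    """
    T = len(residuals[0])
    if any(len(series) != T for series in residuals):
        raise ValueError("all residual series must have the same length")
    centered = []
    for series in residuals:
        mean = sum(series) / T
        centered.append([v - mean for v in series])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    rhos = []
    for i in range(len(centered)):
        for j in range(i + 1, len(centered)):
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            rhos.append(dot / (norms[i] * norms[j]))
    return rhos


def correlation_power_sum(residuals, power):
    """Sum over pairs i < j of rho_ij ** power (power=2 or power=4)."""
    return sum(r ** power for r in pairwise_correlations(residuals))


def lm_statistic(residuals):
    """Classical Lagrange multiplier statistic: T * sum_{i<j} rho_ij^2."""
    T = len(residuals[0])
    return T * correlation_power_sum(residuals, 2)


# Toy residuals for n = 3 units observed over T = 4 periods.
toy = [[0.5, -0.2, 0.1, -0.4], [0.3, 0.1, -0.6, 0.2], [-0.1, 0.4, 0.2, -0.5]]
print(lm_statistic(toy), correlation_power_sum(toy, 4))
```

Since each correlation lies in [-1, 1], raising it to the fourth power shrinks small correlations much more than large ones, so the fourth-power sum puts relatively more weight on the few largest correlations.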
157 - Jianfeng Yao , Wangjun Yuan 2020
In this article, we establish a limiting distribution for eigenvalues of a class of auto-covariance matrices. The same distribution has been found in the literature for a regularized version of these auto-covariance matrices. The original non-regularized auto-covariance matrices are non-invertible, which introduces supplementary difficulties for the study of their eigenvalues through Girko's Hermitization scheme. The key result in this paper is a new polynomial lower bound for the least singular value of the resolvent matrices associated to a rank-defective quadratic function of a random matrix with independent and identically distributed entries. Another improvement in the paper is that the lag of the auto-covariance matrices can grow to infinity with the matrix dimension.
Consider a $p$-dimensional population $\mathbf{x} \in \mathbb{R}^p$ with iid coordinates in the domain of attraction of a stable distribution with index $\alpha \in (0,2)$. Since the variance of $\mathbf{x}$ is infinite, the sample covariance matrix $\mathbf{S}_n = n^{-1}\sum_{i=1}^n \mathbf{x}_i \mathbf{x}_i^\top$ based on a sample $\mathbf{x}_1,\ldots,\mathbf{x}_n$ from the population is not well behaved and it is of interest to use instead the sample correlation matrix $\mathbf{R}_n = \{\operatorname{diag}(\mathbf{S}_n)\}^{-1/2}\, \mathbf{S}_n \{\operatorname{diag}(\mathbf{S}_n)\}^{-1/2}$. This paper finds the limiting distributions of the eigenvalues of $\mathbf{R}_n$ when both the dimension $p$ and the sample size $n$ grow to infinity such that $p/n \to \gamma \in (0,\infty)$. The family of limiting distributions $\{H_{\alpha,\gamma}\}$ is new and depends on the two parameters $\alpha$ and $\gamma$. The moments of $H_{\alpha,\gamma}$ are fully identified as the sum of two contributions: the first from the classical Marčenko-Pastur law and a second due to heavy tails. Moreover, the family $\{H_{\alpha,\gamma}\}$ has continuous extensions at the boundaries $\alpha=2$ and $\alpha=0$ leading to the Marčenko-Pastur law and a modified Poisson distribution, respectively. Our proofs use the method of moments, the path-shortening algorithm developed in [18] and some novel graph counting combinatorics. As a consequence, the moments of $H_{\alpha,\gamma}$ are expressed in terms of combinatorial objects such as Stirling numbers of the second kind. A simulation study on these limiting distributions $H_{\alpha,\gamma}$ is also provided for comparison with the Marčenko-Pastur law.
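The construction of the sample correlation matrix from the (uncentered) sample covariance matrix can be sketched in a few lines. This is a minimal illustration of the two definitions in the abstract, assuming the sample is given as a list of n vectors of length p; the function names are hypothetical and no eigenvalue computation is attempted.

```python
import math


def sample_covariance(sample):
    """S_n = n^{-1} * sum_i x_i x_i^T (no mean-centering, as in the definition above)."""
    n = len(sample)
    p = len(sample[0])
    if any(len(x) != p for x in sample):
        raise ValueError("all observations must have the same dimension")
    return [[sum(x[a] * x[b] for x in sample) / n for b in range(p)]
            for a in range(p)]


def sample_correlation(sample):
    """R_n = diag(S_n)^{-1/2} S_n diag(S_n)^{-1/2}, computed entrywise."""
    S = sample_covariance(sample)
    p = len(S)
    scale = [math.sqrt(S[a][a]) for a in range(p)]
    if any(s == 0.0 for s in scale):
        raise ValueError("a coordinate is identically zero in the sample")
    return [[S[a][b] / (scale[a] * scale[b]) for b in range(p)] for a in range(p)]


# n = 3 observations in dimension p = 2.
observations = [[1.0, 2.0], [3.0, -1.0], [0.0, 1.0]]
for row in sample_correlation(observations):
    print(row)
```

Each entry of the correlation matrix is self-normalized by the coordinate-wise sums of squares, so its diagonal is identically one and its entries stay bounded even when the coordinates have infinite variance.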
In this article, we study high-dimensional behavior of empirical spectral distributions $\{L_N(t), t\in[0,T]\}$ for a class of $N\times N$ symmetric/Hermitian random matrices, whose entries are generated from the solution of a stochastic differential equation driven by fractional Brownian motion with Hurst parameter $H \in (1/2,1)$. For Wigner-type matrices, we obtain almost sure relative compactness of $\{L_N(t), t\in[0,T]\}_{N\in\mathbb N}$ in $C([0,T], \mathbf P(\mathbb R))$ following the approach in \cite{Anderson2010}; for Wishart-type matrices, we obtain tightness of $\{L_N(t), t\in[0,T]\}_{N\in\mathbb N}$ on $C([0,T], \mathbf P(\mathbb R))$ by tightness criteria provided in Appendix \ref{subset:tightness argument}. The limit of $\{L_N(t), t\in[0,T]\}$ as $N\to \infty$ is also characterised.
We consider eigenvalues of generalized Wishart processes as well as particle systems, of which the empirical measures converge to deterministic measures as the dimension goes to infinity. In this paper, we obtain central limit theorems to characterize the fluctuations of the empirical measures around the limit measures by using stochastic calculus. As applications, central limit theorems for Dyson's Brownian motion and the eigenvalues of the Wishart process are recovered under slightly more general initial conditions, and a central limit theorem for the eigenvalues of a symmetric Ornstein-Uhlenbeck matrix process is obtained.