We investigate eigenvectors of rank-one deformations of random matrices $\boldsymbol B = \boldsymbol A + \theta \boldsymbol{uu}^*$ in which $\boldsymbol A \in \mathbb R^{N \times N}$ is a real symmetric Wigner random matrix, $\theta \in \mathbb R^+$, and $\boldsymbol u$ is uniformly distributed on the unit sphere. It is well known that for $\theta > 1$ the eigenvector associated with the largest eigenvalue of $\boldsymbol B$ closely estimates $\boldsymbol u$ asymptotically, while for $\theta < 1$ the eigenvectors of $\boldsymbol B$ are uninformative about $\boldsymbol u$. We examine the $\mathcal O(\frac{1}{N})$ correlation of the eigenvectors with $\boldsymbol u$ before the phase transition and show that eigenvectors with larger eigenvalues exhibit stronger alignment with the deforming vector through an explicit inverse law. This distribution function is shown to be the ordinary generating function of the Chebyshev polynomials of the second kind, which form an orthogonal set with respect to the semicircle weight function. The law is an increasing function on the support $(-2, +2)$ of the semicircle law for the eigenvalues. Therefore, most of the energy of the unknown deforming vector is concentrated in a known $cN$-dimensional ($c<1$) subspace of $\boldsymbol B$. We use a combinatorial approach to prove the result.
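The phase transition described in this abstract can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes a Gaussian (GOE-type) Wigner matrix normalized so that the spectrum fills $(-2, +2)$, and the names `wigner` and `top_overlap` are hypothetical helpers introduced only for illustration. For $\theta > 1$ the classical limits are a top eigenvalue near $\theta + 1/\theta$ and a squared overlap near $1 - 1/\theta^2$; for $\theta < 1$ the squared overlap of the top eigenvector with $\boldsymbol u$ is small.

```python
import numpy as np

def wigner(n, rng):
    """Gaussian real symmetric matrix with off-diagonal variance 1/n (assumed model)."""
    g = rng.standard_normal((n, n))
    return (g + g.T) / np.sqrt(2 * n)

def top_overlap(n, theta, rng):
    """Return the largest eigenvalue of B = A + theta*u*u^T and the squared
    overlap of its eigenvector with u."""
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)          # uniform direction on the unit sphere
    b = wigner(n, rng) + theta * np.outer(u, u)
    evals, evecs = np.linalg.eigh(b)  # eigenvalues in ascending order
    return float(evals[-1]), float(np.dot(evecs[:, -1], u) ** 2)

rng = np.random.default_rng(0)
eig_super, overlap_super = top_overlap(500, 2.0, rng)  # above the transition
eig_sub, overlap_sub = top_overlap(500, 0.5, rng)      # below the transition

print("theta=2.0:", eig_super, overlap_super)
print("theta=0.5:", eig_sub, overlap_sub)
```

With $\theta = 2$ the printed values should lie near $2.5$ and $0.75$, up to finite-$N$ fluctuations; with $\theta = 0.5$ the top eigenvalue stays near the edge $2$ and the overlap is close to zero.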
We extend the random characteristics approach to Wigner matrices whose entries are not required to have a normal distribution. As an application, we give a simple and fully dynamical proof of the weak local semicircle law in the bulk.
Positive definite (p.d.) matrices arise naturally in many areas within mathematics and also feature extensively in scientific applications. In modern high-dimensional applications, a common approach to finding sparse positive definite matrices is to threshold their small off-diagonal elements. This thresholding, sometimes referred to as hard-thresholding, sets small elements to zero. Thresholding has the attractive property that the resulting matrices are sparse, and are thus easier to interpret and work with. In many applications, it is often required, and thus implicitly assumed, that thresholded matrices retain positive definiteness. In this paper we formally investigate the algebraic properties of p.d. matrices which are thresholded. We demonstrate that for positive definiteness to be preserved, the pattern of elements to be set to zero has to necessarily correspond to a graph which is a union of disconnected complete components. This result rigorously demonstrates that, except in special cases, positive definiteness can be easily lost. We then proceed to demonstrate that the class of diagonally dominant matrices is not maximal in terms of retaining positive definiteness when thresholded. Consequently, we derive characterizations of matrices which retain positive definiteness when thresholded with respect to important classes of graphs. In particular, we demonstrate that retaining positive definiteness upon thresholding is governed by complex algebraic conditions.
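A small numerical illustration of how hard-thresholding can destroy positive definiteness is sketched below. The matrix, the threshold level, and the helper names `hard_threshold` and `is_positive_definite` are assumptions chosen for illustration and do not come from the paper. After thresholding, the zero pattern corresponds to a path graph on three vertices, which is connected but not complete, so it is not a union of disconnected complete components.

```python
import numpy as np

def hard_threshold(matrix, level):
    """Set off-diagonal entries with absolute value below `level` to zero."""
    out = matrix.copy()
    off_diag = ~np.eye(matrix.shape[0], dtype=bool)
    out[off_diag & (np.abs(matrix) < level)] = 0.0
    return out

def is_positive_definite(matrix):
    """Check positive definiteness of a symmetric matrix via its smallest eigenvalue."""
    return bool(np.linalg.eigvalsh(matrix).min() > 0)

# Illustrative positive definite matrix (leading minors 1, 0.4375, 0.1225).
A = np.array([[1.00, 0.75, 0.30],
              [0.75, 1.00, 0.75],
              [0.30, 0.75, 1.00]])

A_thr = hard_threshold(A, 0.5)   # only the 0.30 entries are zeroed

print(is_positive_definite(A))      # original matrix
print(is_positive_definite(A_thr))  # thresholded matrix
print(np.linalg.eigvalsh(A_thr))    # eigenvalues 1 - 0.75*sqrt(2), 1, 1 + 0.75*sqrt(2)
```

The thresholded matrix has smallest eigenvalue $1 - 0.75\sqrt{2} < 0$, so it is no longer positive definite even though only one pair of small entries was removed.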
Blind source separation (BSS) is a signal processing tool that is widely used in various fields. Examples include biomedical signal separation, brain imaging and economic time series applications. In BSS, one assumes that the observed $p$ time series are linear combinations of $p$ latent uncorrelated weakly stationary time series. The aim is then to find an estimate for an unmixing matrix, which transforms the observed time series back to uncorrelated latent time series. In SOBI (Second Order Blind Identification), joint diagonalization of the covariance matrix and autocovariance matrices with several lags is used to estimate the unmixing matrix. The rows of an unmixing matrix can be derived either one by one (deflation-based approach) or simultaneously (symmetric approach). The latter approach is well known, especially in the signal processing literature; however, a rigorous analysis of its statistical properties has been missing so far. In this paper, we fill this gap, investigate the statistical properties of the symmetric SOBI estimate in detail, and find its limiting distribution under general conditions. The asymptotic efficiencies of the symmetric SOBI estimate are compared to those of the recently introduced deflation-based SOBI estimate under general multivariate MA$(\infty)$ processes. The theory is illustrated by some finite-sample simulation studies as well as a real EEG data example.
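The idea of recovering an unmixing matrix from second-order statistics can be sketched with a deliberately simplified variant. The code below is not the symmetric SOBI estimator studied in the paper: it uses a single lag (an AMUSE-style procedure), so exact diagonalization replaces the joint diagonalization over several lags. The AR(1) sources, the mixing matrix `A_mix`, and the helper names `ar1` and `amuse_unmixing` are all assumptions made for illustration.

```python
import numpy as np

def ar1(coef, n, rng):
    """Simulate a zero-mean AR(1) series x_t = coef * x_{t-1} + e_t."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = coef * x[t - 1] + noise[t]
    return x

def amuse_unmixing(X, lag=1):
    """Estimate an unmixing matrix from a (p, n) data array using one lag."""
    Xc = X - X.mean(axis=1, keepdims=True)
    n = Xc.shape[1]
    # Step 1: whiten with the inverse symmetric square root of the covariance.
    cov = Xc @ Xc.T / n
    evals, evecs = np.linalg.eigh(cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = whitener @ Xc
    # Step 2: diagonalize the symmetrized lagged autocovariance of the whitened data.
    acov = Z[:, lag:] @ Z[:, :-lag].T / (n - lag)
    acov = (acov + acov.T) / 2
    _, V = np.linalg.eigh(acov)
    return V.T @ whitener

rng = np.random.default_rng(0)
n = 5000
S = np.vstack([ar1(0.9, n, rng), ar1(-0.5, n, rng)])  # latent sources
A_mix = np.array([[1.0, 0.5],
                  [0.3, 1.0]])                          # hypothetical mixing matrix
X = A_mix @ S                                           # observed series

W = amuse_unmixing(X)
gain = W @ A_mix  # should be close to a scaled, signed permutation matrix
print(np.round(gain, 3))
```

Sources are identifiable only up to order, sign and scale, so success is judged by whether each row of `gain` has a single dominant entry. Using several lags, as SOBI does, makes the estimate less sensitive to the choice of any one lag.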
In this paper, we study the asymptotic behavior of the extreme eigenvalues and eigenvectors of high-dimensional spiked sample covariance matrices in the supercritical case, when reliable detection of spikes is possible. In particular, we derive the joint distribution of the extreme eigenvalues and the generalized components of the associated eigenvectors, i.e., the projections of the eigenvectors onto an arbitrary given direction, assuming that the dimension and sample size are comparably large. In general, the joint distribution is given in terms of linear combinations of finitely many Gaussian and Chi-square variables, with parameters depending on the projection direction and the spikes. Our assumption on the spikes is fully general. First, the strengths of the spikes are only required to be slightly above the critical threshold, and no upper bound on the strengths is needed. Second, multiple spikes, i.e., spikes with the same strength, are allowed. Third, no structural assumption is imposed on the spikes. Thanks to this general setting, we can apply the results to various high-dimensional statistical hypothesis testing problems involving both the eigenvalues and eigenvectors. Specifically, we propose accurate and powerful statistics to conduct hypothesis testing on the principal components. These statistics are data-dependent and adaptive to the underlying true spikes. Numerical simulations confirm the accuracy and power of our proposed statistics and show significantly better performance than existing methods in the literature. In particular, our methods are accurate and powerful even when either the spikes are small or the dimension is large.
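The supercritical regime can be illustrated with a first-order simulation. This is only a sketch under stated assumptions: Gaussian data, a single spike of strength `spike` along the first coordinate, and aspect ratio $c = p/n$. It checks the classical first-order limits for one spike (largest sample eigenvalue near $\ell(1 + c/(\ell - 1))$ and squared projection onto the spike direction near $(1 - c/(\ell-1)^2)/(1 + c/(\ell-1))$ when $\ell > 1 + \sqrt{c}$), not the joint fluctuation results or test statistics of the paper. The function name `spiked_sample` is hypothetical.

```python
import numpy as np

def spiked_sample(p, n, spike, rng):
    """Top eigenvalue of the sample covariance and squared projection of its
    eigenvector onto the spike direction e_1, for covariance I + (spike-1) e_1 e_1^T."""
    X = rng.standard_normal((p, n))
    X[0, :] *= np.sqrt(spike)        # first coordinate has variance `spike`
    S = X @ X.T / n                  # sample covariance (mean known to be zero)
    evals, evecs = np.linalg.eigh(S)
    return float(evals[-1]), float(evecs[0, -1] ** 2)

rng = np.random.default_rng(1)
p, n, spike = 500, 2000, 4.0
c = p / n                            # spike = 4 exceeds the threshold 1 + sqrt(c) = 1.5

top_eig, proj_sq = spiked_sample(p, n, spike, rng)
predicted_eig = spike * (1 + c / (spike - 1))
predicted_proj_sq = (1 - c / (spike - 1) ** 2) / (1 + c / (spike - 1))

print(top_eig, predicted_eig)
print(proj_sq, predicted_proj_sq)
```

Here the projection direction coincides with the spike itself; projecting onto another fixed direction would only require replacing `evecs[0, -1]` with the inner product of the top eigenvector and that direction.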
The concordance signature of a multivariate continuous distribution is the vector of concordance probabilities for margins of all orders; it underlies the bivariate and multivariate Kendall's tau measure of concordance. It is shown that every attainable concordance signature is equal to the concordance signature of a unique mixture of the extremal copulas, that is, the copulas with extremal correlation matrices consisting exclusively of 1s and -1s. This result establishes that the set of attainable Kendall rank correlation matrices of multivariate continuous distributions in arbitrary dimension is the set of convex combinations of extremal correlation matrices, a set known as the cut polytope. A methodology for testing the attainability of concordance signatures using linear optimization and convex analysis is provided. The elliptical copulas are shown to yield a strict subset of the attainable concordance signatures as well as a strict subset of the attainable Kendall rank correlation matrices; the Student t copula is seen to converge to a mixture of extremal copulas sharing its concordance signature with all elliptical distributions that have the same correlation matrix. A method of estimating an attainable concordance signature from data is derived and shown to correspond to using standard estimates of Kendall's tau in the absence of ties. The methodology has application to Monte Carlo simulations of dependent random variables as well as expert elicitation of consistent systems of Kendall's tau dependence measures.
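The link between a bivariate concordance probability and Kendall's tau can be made concrete with a short sketch. It uses the standard relation $\tau = 2\kappa - 1$, where $\kappa$ is the probability that two independent draws are concordant, and estimates $\kappa$ by the fraction of concordant pairs in a sample without ties. The data and the function names `concordance_probability` and `kendall_tau` are illustrative assumptions, not the estimator construction from the paper.

```python
import numpy as np
from itertools import combinations

def concordance_probability(x, y):
    """Fraction of sample pairs (i, j) ordered the same way in x and in y.
    Assumes no ties in either variable."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = sum((x[i] - x[j]) * (y[i] - y[j]) > 0 for i, j in pairs)
    return concordant / len(pairs)

def kendall_tau(x, y):
    """Sample Kendall's tau via tau = 2 * kappa - 1 (valid without ties)."""
    return 2.0 * concordance_probability(x, y) - 1.0

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])   # one discordant pair out of six

kappa = concordance_probability(x, y)
tau = kendall_tau(x, y)
print(kappa, tau)
```

For this sample five of the six pairs are concordant, so the estimated concordance probability is $5/6$ and the estimated tau is $2/3$. The pairwise loop is quadratic in the sample size, which is adequate for illustration but not for large data sets.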