
New asymptotic results in principal component analysis

 Added by Karim Lounici
 Publication date 2016
Research language: English





Let $X$ be a mean zero Gaussian random vector in a separable Hilbert space ${\mathbb H}$ with covariance operator $\Sigma:={\mathbb E}(X\otimes X).$ Let $\Sigma=\sum_{r\geq 1}\mu_r P_r$ be the spectral decomposition of $\Sigma$ with distinct eigenvalues $\mu_1>\mu_2> \dots$ and the corresponding spectral projectors $P_1, P_2, \dots.$ Given a sample $X_1,\dots, X_n$ of size $n$ of i.i.d. copies of $X,$ the sample covariance operator is defined as $\hat \Sigma_n := n^{-1}\sum_{j=1}^n X_j\otimes X_j.$ The main goal of principal component analysis is to estimate spectral projectors $P_1, P_2, \dots$ by their empirical counterparts $\hat P_1, \hat P_2, \dots$ properly defined in terms of spectral decomposition of the sample covariance operator $\hat \Sigma_n.$ The aim of this paper is to study asymptotic distributions of important statistics related to this problem, in particular, of statistic $\|\hat P_r-P_r\|_2^2,$ where $\|\cdot\|_2^2$ is the squared Hilbert--Schmidt norm. This is done in a high-complexity asymptotic framework in which the so called effective rank ${\bf r}(\Sigma):=\frac{{\rm tr}(\Sigma)}{\|\Sigma\|_{\infty}}$ (${\rm tr}(\cdot)$ being the trace and $\|\cdot\|_{\infty}$ being the operator norm) of the true covariance $\Sigma$ is becoming large simultaneously with the sample size $n,$ but ${\bf r}(\Sigma)=o(n)$ as $n\to\infty.$ In this setting, we prove that, in the case of one-dimensional spectral projector $P_r,$ the properly centered and normalized statistic $\|\hat P_r-P_r\|_2^2$ with {\it data-dependent} centering and normalization converges in distribution to a Cauchy type limit. The proofs of this and other related results rely on perturbation analysis and Gaussian concentration.
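The quantities defined in the abstract can be illustrated numerically in finite dimension. The following is a minimal sketch, not the paper's procedure: it assumes a diagonal covariance in ${\mathbb R}^{50}$ with hypothetical eigenvalues, and the function names (`effective_rank`, `spectral_projector`, `hs_error`) are chosen here for illustration. It computes the effective rank, the sample covariance, and the squared Hilbert--Schmidt error of a one-dimensional empirical spectral projector. It does not implement the data-dependent centering and normalization studied in the paper.

```python
import numpy as np

def effective_rank(sigma):
    """Effective rank r(Sigma) = tr(Sigma) / ||Sigma||_inf (operator norm)."""
    return np.trace(sigma) / np.linalg.norm(sigma, 2)

def spectral_projector(sym, r):
    """Projector onto the eigenvector of the r-th largest eigenvalue (r is 1-based).

    Assumes that eigenvalue is simple, so the projector is one-dimensional.
    """
    _, vecs = np.linalg.eigh(sym)  # eigenvalues are returned in ascending order
    v = vecs[:, -r]
    return np.outer(v, v)

def hs_error(sigma, n, r, rng):
    """Squared Hilbert-Schmidt norm of hat P_r - P_r for one Gaussian sample of size n."""
    d = sigma.shape[0]
    x = rng.multivariate_normal(np.zeros(d), sigma, size=n)
    sigma_hat = x.T @ x / n  # sample covariance of mean-zero data
    diff = spectral_projector(sigma_hat, r) - spectral_projector(sigma, r)
    return np.sum(diff ** 2)

# Hypothetical spectrum: distinct eigenvalues, chosen only for illustration.
mu = np.concatenate([[4.0, 2.0], np.linspace(1.0, 0.5, 48)])
sigma = np.diag(mu)
rng = np.random.default_rng(0)

print("effective rank:", effective_rank(sigma))
errors = [hs_error(sigma, n=2000, r=1, rng=rng) for _ in range(5)]
print("squared HS errors for r=1:", errors)
```

For rank-one projectors the squared Hilbert--Schmidt error always lies in $[0, 2]$, which gives a quick sanity check on the output.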




Fan et al. [$\mathit{Annals}$ $\mathit{of}$ $\mathit{Statistics}$ $\textbf{47}$(6) (2019) 3009-3031] proposed a distributed principal component analysis (PCA) algorithm to significantly reduce the communication cost between multiple servers. In this paper, we robustify their distributed algorithm by replacing the sample covariance matrix with the robust covariance matrix estimators proposed by Minsker [$\mathit{Annals}$ $\mathit{of}$ $\mathit{Statistics}$ $\textbf{46}$(6A) (2018) 2871-2903] and Ke et al. [$\mathit{Statistical}$ $\mathit{Science}$ $\textbf{34}$(3) (2019) 454-471]. We extend the deviation bound of robust covariance estimators with bounded fourth moments to the case of heavy-tailed distributions under only a bounded $2+\epsilon$ moment assumption. The theoretical results show that, after the shrinkage or truncation treatment of the sample covariance matrix, the statistical error rate of the final estimator produced by the robust algorithm is the same as under sub-Gaussian tails when $\epsilon \geq 2$ and the sampling distribution has symmetric innovations. When $2 > \epsilon > 0$, the rate with respect to the sample size of each server is slower than under the bounded fourth moment assumption. Extensive numerical results support the theoretical analysis, and indicate that the algorithm performs better than the original distributed algorithm and is robust to heavy-tailed data and outliers.
We consider sparse principal component analysis for high-dimensional stationary processes. Standard principal component analysis performs poorly when the dimension of the process is large. We establish oracle inequalities for penalized principal component estimators for processes that include heavy-tailed time series. The rate of convergence of the estimators is established. We also elucidate the theoretical rate for choosing the tuning parameter in penalized estimators. The performance of sparse principal component analysis is demonstrated by numerical simulations. Its utility for time series data is exemplified by an application to average temperature data.
Functional data analysis on nonlinear manifolds has drawn recent interest. Sphere-valued functional data, which are encountered for example as movement trajectories on the surface of the earth, are an important special case. We consider an intrinsic principal component analysis for smooth Riemannian manifold-valued functional data and study its asymptotic properties. Riemannian functional principal component analysis (RFPCA) is carried out by first mapping the manifold-valued data through Riemannian logarithm maps to tangent spaces around the time-varying Fréchet mean function, and then performing a classical multivariate functional principal component analysis on the linear tangent spaces. Representations of the Riemannian manifold-valued functions and the eigenfunctions on the original manifold are then obtained with exponential maps. The tangent-space approximation through functional principal component analysis is shown to be well-behaved in terms of controlling the residual variation if the Riemannian manifold has nonnegative curvature. Specifically, we derive a central limit theorem for the mean function, as well as root-$n$ uniform convergence rates for other model components, including the covariance function, eigenfunctions, and functional principal component scores. Our applications include a novel framework for the analysis of longitudinal compositional data, achieved by mapping longitudinal compositional data to trajectories on the sphere, illustrated with longitudinal fruit fly behavior patterns. RFPCA is shown to be superior in terms of trajectory recovery in comparison to an unrestricted functional principal component analysis in applications and simulations and is also found to produce principal component scores that are better predictors for classification compared to traditional functional principal component scores.
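The log-map, tangent-space PCA, and exp-map steps described above can be sketched for the unit sphere. This is a simplified illustration, not the authors' RFPCA implementation: it treats a single time point (plain points on $S^2$ rather than trajectories with a time-varying mean), and the function names are hypothetical. It also assumes the data are concentrated in a region that contains no antipodal pairs.

```python
import numpy as np

def sphere_log(p, q):
    """Riemannian logarithm on the unit sphere: tangent vector at p pointing to q."""
    inner = np.clip(np.dot(p, q), -1.0, 1.0)
    v = q - inner * p  # component of q orthogonal to p
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:  # q coincides with p (the antipodal case is not handled)
        return np.zeros_like(p)
    return np.arccos(inner) * v / norm_v

def sphere_exp(p, v):
    """Riemannian exponential on the unit sphere: follow the geodesic from p along v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def frechet_mean(points, iters=50):
    """Intrinsic mean via fixed-point iteration, started at the normalized Euclidean mean."""
    m = points.mean(axis=0)
    m = m / np.linalg.norm(m)
    for _ in range(iters):
        step = np.mean([sphere_log(m, q) for q in points], axis=0)
        m = sphere_exp(m, step)
        m = m / np.linalg.norm(m)  # guard against numerical drift off the sphere
    return m

def tangent_pca(points, k):
    """PCA in the tangent space at the Frechet mean, mapped back with the exponential map."""
    m = frechet_mean(points)
    logs = np.array([sphere_log(m, q) for q in points])
    _, _, vt = np.linalg.svd(logs, full_matrices=False)
    comps = vt[:k]             # leading principal directions in the tangent space
    scores = logs @ comps.T    # principal component scores
    recon = np.array([sphere_exp(m, s @ comps) for s in scores])
    return m, comps, scores, recon

# Synthetic points scattered around the north pole (illustrative data only).
rng = np.random.default_rng(0)
raw = np.array([0.0, 0.0, 1.0]) + 0.2 * rng.standard_normal((100, 3))
pts = raw / np.linalg.norm(raw, axis=1, keepdims=True)
mean, comps, scores, recon = tangent_pca(pts, k=1)
print("Frechet mean:", mean)
```

In the functional setting of the abstract, the same maps would be applied at each time point around the time-varying Fréchet mean function before running a multivariate functional PCA on the resulting tangent-space curves.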
Yehua Li, Tailen Hsing 2012
We consider nonparametric estimation of the mean and covariance functions for functional/longitudinal data. Strong uniform convergence rates are developed for estimators that are local-linear smoothers. Our results are obtained in a unified framework in which the number of observations within each curve/cluster can be of any rate relative to the sample size. We show that the convergence rates for the procedures depend on both the number of sample curves and the number of observations on each curve. For sparse functional data, these rates are equivalent to the optimal rates in nonparametric regression. For dense functional data, root-n rates of convergence can be achieved with proper choices of bandwidths. We further derive almost sure rates of convergence for principal component analysis using the estimated covariance function. The results are illustrated with simulation studies.
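A local-linear smoother of the kind mentioned above can be sketched as a kernel-weighted least-squares line fit around each evaluation point. This is a minimal sketch under stated assumptions: observations from all curves are pooled with equal weight, an Epanechnikov kernel is used, and the bandwidth `h` is picked by hand. The function name `local_linear` and the simulated data are illustrative and are not taken from the paper.

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local-linear estimate of E[y | x = x0] with an Epanechnikov kernel of bandwidth h."""
    u = (x - x0) / h
    w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    design = np.column_stack([np.ones_like(x), x - x0])
    weighted = design * w[:, None]
    # Solve the weighted normal equations; the intercept is the fitted value at x0.
    beta = np.linalg.solve(design.T @ weighted, weighted.T @ y)
    return beta[0]

# Sparse functional data: 200 curves, each observed at 3 random time points.
rng = np.random.default_rng(0)
n_curves, m_obs = 200, 3
t = rng.uniform(0.0, 1.0, size=(n_curves, m_obs))
subject_effect = rng.standard_normal((n_curves, 1))  # curve-level random shift
y = np.sin(2 * np.pi * t) + 0.5 * subject_effect + 0.2 * rng.standard_normal(t.shape)

grid = np.linspace(0.1, 0.9, 9)
mean_hat = np.array([local_linear(t.ravel(), y.ravel(), g, h=0.15) for g in grid])
print(np.column_stack([grid, mean_hat]))
```

Pooling all observations means the estimate depends on both the number of curves and the number of observations per curve, which mirrors the way the convergence rates are described in the abstract.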
Aude Sportisse 2019
Missing Not At Random (MNAR) values lead to significant biases in the data, since the probability of missingness depends on the unobserved values. They are not ignorable in the sense that they often require defining a model for the missing data mechanism, which makes inference or imputation tasks more complex. Furthermore, this implies a strong \textit{a priori} on the parametric form of the distribution. However, some works have obtained guarantees on the estimation of parameters in the presence of MNAR data, without specifying the distribution of missing data \citep{mohan2018estimation, tang2003analysis}. This is very useful in practice, but is limited to simple cases such as self-masked MNAR values in data generated according to linear regression models. We continue this line of research, but extend it to a more general MNAR mechanism, in a more general model of the probabilistic principal component analysis (PPCA), \textit{i.e.}, a low-rank model with random effects. We prove identifiability of the PPCA parameters. We then propose an estimation of the loading coefficients and a data imputation method. They are based on estimators of means, variances and covariances of missing variables, for which consistency is discussed. These estimators have the great advantage of being calculated using only the observed data, leveraging the underlying low-rank structure of the data. We illustrate the relevance of the method with numerical experiments on synthetic data and also on real data collected from a medical register.